clusterpunchclient.pm - API to the clusterpunch UDP server
Using the clusterpunchclient you can easily request capacity and resource measurements from any nodes running the clusterpunch server. The clusterpunchclient module is responsible for organized conversation with the servers and formatting and displaying any responses.
You can benchmark all the hosts responding to a broadcast address using special benchmark commands. The command string, described below, should contain only functions that are implemented in the clusterpunch server. The ClusterPunch call returns a hash of hashes, keyed by hostname, containing the hashed response of each host.
    use clusterpunchclient;

    my %RESPONSE = ClusterPunch(
        host    => "10.1.2.255",
        port    => 8095,
        command => "benchio(13000,2);benchcpu(4e5);benchmem(1e6,30);debug",
        timeout => 2,
        debug   => 0,
    );
    CollateData(response => \%RESPONSE);
In this example, the command "mhz" is sent to each clusterpunch server and the response hash is sorted based on the returned value.
    use clusterpunchclient;

    my %RESPONSE = ClusterPunch(
        host    => "10.1.2.255",
        port    => 8095,
        command => "mhz",
        timeout => 2,
        debug   => 0,
    );

    # Print hosts in ascending order of reported clock speed.
    foreach my $host (sort { $RESPONSE{$a}->{mhz} <=> $RESPONSE{$b}->{mhz} } keys %RESPONSE) {
        print "$host $RESPONSE{$host}->{mhz}\n";
    }
This is part of the clusterpunch system. Each node that you wish to monitor must be running the clusterpunchserver daemon, documented in clusterpunchserver. Once the server is running, you can poll it and send commands using the API documented in clusterpunchclient.pm, or through the various utilities.
The clusterpunchclient API is very straightforward. You call the ClusterPunch function with the appropriate host, port, timeout and command and wait for the response, returned to you as a hash keyed by hostnames.
A number of commands are supported. Each command directs the server to carry out either a system resource lookup (e.g. free memory), system state (e.g. number of users logged in) or benchmark (e.g. generate N random numbers). The response of each host is returned as a hash reference, keyed by the resource (e.g. cpu, io, nrunning, etc).
The command must be of the form
cmd1([args]);cmd2([args]);...;cmdN([args])
where cmdN is a command understood by the server and [args] is a comma-separated list of arguments appropriate for the command. The following commands are understood:
Immediately shuts down the server and terminates the process. This is very useful for shutting down the monitor remotely and is implemented in the clusterpunch.shutdown utility.
Sets the supplied flags. No flags are currently supported.
Benchmarks the IO subsystem by writing a prescribed number of kbytes from /dev/zero to a random file in the /tmp partition.
kbsize  size of tmp file in kb
iter    number of iterations of dd
Benchmarks the memory subsystem by allocating and deallocating an array.
arayelem  number of array elements
iter      number of iterations of setting/clearing the array
Benchmarks the CPU subsystem by generating random numbers.
iter number of random numbers to generate
Returns the n-minute load average. n=1,5,15 (default n=1)
Returns the number of users on the system (not necessarily unique)
Returns the number of open files on the system
Returns the number of *running* processes
mem()
Returns the amount of free memory in MB
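The command-string format described above can be assembled programmatically rather than typed by hand. The following is a minimal sketch; build_command is a hypothetical helper and is not part of clusterpunchclient. It assumes that commands without arguments are sent bare, as "debug" and "mhz" are in the examples above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of clusterpunchclient): joins
# [name, arg, arg, ...] specs into "cmd1(args);cmd2(args);...".
sub build_command {
    my @parts;
    for my $spec (@_) {
        my ($name, @args) = @$spec;
        # Commands without arguments are emitted bare, e.g. "debug".
        push @parts, @args ? $name . "(" . join(",", @args) . ")" : $name;
    }
    return join ";", @parts;
}

# Exponent-style arguments are passed as strings so they are sent verbatim.
my $command = build_command(
    [ benchio  => 13000, 2 ],
    [ benchcpu => "4e5" ],
    [ benchmem => "1e6", 30 ],
    [ "debug" ],
);
print "$command\n";
```

The resulting string can be passed as the command argument to ClusterPunch.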
The structure of the hash returned by ClusterPunch has the format
%RESPONSE = (node1name=>{response1hash}, node2name=>{response2hash}, ..., nodeNname=>{responseNhash});
where responsehash has the format
$response1hash = {'live'=>1, 'quantity'=>value, ..., 'quantity'=>value}
where 'quantity' can be one of nusers, load, cpu, io, mflops, memfree, all, lsof, nrunning, mem, mhz. If a server responds, it always sets the 'live' key to 1. This way, you can actually pass an empty command "" to rapidly poll live servers.
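As an illustration of walking this structure, the sketch below filters on the 'live' key and on the presence of a quantity before sorting. The %RESPONSE contents are invented mock data in the documented shape, and hosts_reporting is a hypothetical helper, not part of the API. A host that did not report the requested quantity is skipped rather than sorted as undef.

```perl
use strict;
use warnings;

# Mock data in the documented shape; host names and values are invented
# for illustration and are not real measurements.
my %RESPONSE = (
    node1 => { live => 1, load => 0.42, nusers => 3 },
    node2 => { live => 1, load => 1.75, nusers => 0 },
    node3 => { live => 1, nusers => 1 },    # "load" missing from this reply
);

# Hypothetical helper: hosts that answered and reported a given quantity.
sub hosts_reporting {
    my ($response, $quantity) = @_;
    return grep { $response->{$_}{live} && defined $response->{$_}{$quantity} }
        keys %$response;
}

my @hosts   = hosts_reporting(\%RESPONSE, "load");
my @by_load = sort { $RESPONSE{$a}{load} <=> $RESPONSE{$b}{load} } @hosts;
printf "%-8s %.2f\n", $_, $RESPONSE{$_}{load} for @by_load;
```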
Below are examples of various strings that can be passed to the command
key. Please use these cautiously. If you pass ridiculous arguments, you can
effectively shut down your cluster. There is no argument checking at the
clusterpunch server level, so you can actually cause your nodes to start
swapping and die if benchmem() is abused.
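Because the server does not check arguments, one option is to check the command string on the client side before sending it. The sketch below is only an illustration: check_command and the %MAX_ARGS ceilings are hypothetical, and the limits shown are arbitrary placeholders that you would need to tune for your own nodes.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical ceilings per positional argument. Pick values that suit
# your own nodes; these are not defaults of clusterpunch.
my %MAX_ARGS = (
    benchio  => [ 50_000, 10 ],    # kbsize, iter
    benchmem => [ 2e6,    50 ],    # array elements, iter
    benchcpu => [ 5e6 ],           # iter
);

# Returns a list of complaints; an empty list means the string passed.
sub check_command {
    my ($command) = @_;
    my @problems;
    for my $part (split /;/, $command) {
        unless ($part =~ /^\s*(\w+)(?:\(([^)]*)\))?\s*$/) {
            push @problems, "cannot parse '$part'";
            next;
        }
        my ($name, $arglist) = ($1, $2);
        my $limits = $MAX_ARGS{$name} or next;    # no ceiling recorded
        my @args = split /,/, (defined $arglist ? $arglist : "");
        for my $i (0 .. $#args) {
            next unless defined $limits->[$i];
            if (!looks_like_number($args[$i])) {
                push @problems, "$name argument " . ($i + 1) . " is not numeric";
            }
            elsif ($args[$i] > $limits->[$i]) {
                push @problems, "$name argument " . ($i + 1) . " exceeds $limits->[$i]";
            }
        }
    }
    return @problems;
}

my @problems = check_command("benchio(13000,2);benchmem(1e9,30)");
print "refusing to send: $_\n" for @problems;
```

Commands with no recorded ceiling (for example "mhz") pass through unchecked, as does the empty command used to poll live servers.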
Generate 1,000,000 random numbers. This takes about 2.5 seconds on a 1GHz PIII chip. You can extend this test to last longer if you want to sample the nodes more finely.
Write five 25MB files from /dev/zero to /tmp. You should be aware that most kernels implement a buffer system which speeds up the writing of smaller files. You may hit the limit of this buffering somewhere around 40MB. For example, for our nodes writing a 30MB file takes about 0.3 seconds while writing a 70MB file takes about 4 seconds.
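To make the buffering effect concrete, the timings quoted above can be converted to effective throughput: roughly 100 MB/s for the 30MB file against 17.5 MB/s for the 70MB file. The short sketch below does this arithmetic; throughput is a hypothetical helper and the figures are the approximate ones from the text, not new measurements.

```perl
use strict;
use warnings;

# Effective write rate in MB/s for a file of $mb megabytes written in
# $seconds seconds.
sub throughput {
    my ($mb, $seconds) = @_;
    return $mb / $seconds;
}

# File size in MB and elapsed seconds, as quoted in the text above.
my @timings = ( [ 30, 0.3 ], [ 70, 4 ] );

for my $t (@timings) {
    my ($mb, $seconds) = @$t;
    printf "%3d MB in %.1f s = %5.1f MB/s\n", $mb, $seconds, throughput($mb, $seconds);
}
```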
clusterpunchserver, clusterpunch.start, clusterpunch.shutdown
clusterpunchclient.pm
benchdriver, clusterbench, clusterlogin, clusternodecount, clustersnapshot, clusterwebimg
Martin Krzywinski (martink@bcgsc.ca)