NAME

clusterpunchclient.pm - API to the clusterpunch UDP server


SYNOPSIS

Using the clusterpunchclient you can easily request capacity and resource measurements from any nodes running the clusterpunch server. The clusterpunchclient module is responsible for organized conversation with the servers and formatting and displaying any responses.


Benchmarking

You can benchmark all the hosts responding to a broadcast address using special benchmark commands. The command string, described below, may contain only functions that are implemented by the clusterpunch server. The ClusterPunch call returns a hash of hashes, keyed by hostname, containing the response hash of each host.

 use clusterpunchclient;
 my %RESPONSE = ClusterPunch(
                            host=>"10.1.2.255",
                            port=>8095,
                            command=>"benchio(13000,2);benchcpu(4e5);benchmem(1e6,30);debug",
                            timeout=>2,
                            debug=>0
                            );
 CollateData(response=>\%RESPONSE);


Retrieving System Information

In this example, the command ``mhz'' is sent to each clusterpunch server and the response hash is sorted based on the returned value.

 use clusterpunchclient;
 my %RESPONSE = ClusterPunch(
                            host=>"10.1.2.255",
                            port=>8095,
                            command=>"mhz",
                            timeout=>2,
                            debug=>0
                            );
 foreach my $host (sort {$RESPONSE{$a}->{mhz} <=> $RESPONSE{$b}->{mhz}} keys %RESPONSE) {
   print "$host $RESPONSE{$host}->{mhz}\n";
 }


DESCRIPTION

This is part of the clusterpunch system. Each node that you wish to monitor must be running the clusterpunchserver daemon, documented in clusterpunchserver. Once the server is running, you can poll it and send commands using the API documented here, or through the various utilities listed under SEE ALSO.

The clusterpunchclient API is very straightforward. You call the ClusterPunch function with the appropriate host, port, timeout and command and wait for the response, returned to you as a hash keyed by hostnames.


Commands

A number of commands are supported. Each command directs the server to perform a system resource lookup (e.g. free memory), report system state (e.g. number of users logged in) or run a benchmark (e.g. generate N random numbers). The response of each host is returned as a hash reference, keyed by the resource (e.g. cpu, io, nrunning, etc).

The command must be of the form

 cmd1([args]);cmd2([args]);...;cmdN([args])

where cmdN is a command understood by the server and [args] is a comma-separated list of arguments appropriate for the command. The following commands are understood:

shutdown()

Immediately shuts down the server and terminates the process. This is very useful for shutting down the monitor remotely and is implemented in the clusterpunch.shutdown utility.

setflag(flag1,flag2,...)

Sets the supplied flags. No flags are currently supported.

benchio(kbsize,iter)

Benchmarks the IO subsystem by writing a prescribed number of kbytes from /dev/zero to a random file in the /tmp partition.

 kbsize   size of tmp file in kb
 iter     number of iterations of dd

benchmem(arrayelem,iter)

Benchmarks the memory subsystem by allocating and deallocating an array.

 arrayelem  number of array elements
 iter       number of iterations of setting/clearing the array

benchcpu(iter)

Benchmarks the CPU subsystem by generating random numbers.

 iter      number of random numbers to generate

load(n)

Returns the n-minute load average. n=1,5,15 (default n=1)

nusers()

Returns the number of users on the system (not necessarily unique)

lsof()

Returns the number of open files on the system

nrunning()

Returns the number of *running* processes

mem()

Returns the amount of free memory in MB
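
Command strings of the form described above can be assembled programmatically rather than typed by hand. The following is a minimal sketch only: build_command is a hypothetical helper, not part of clusterpunchclient, and it does not validate command names or arguments.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of clusterpunchclient): turns a list of
# [name, arg, arg, ...] array references into "cmd1(args);cmd2(args);...".
sub build_command {
    my @calls = @_;
    return join ";", map {
        my ($name, @args) = @$_;
        "$name(" . join(",", @args) . ")";
    } @calls;
}

my $command = build_command(
    [ benchio  => 13000, 2 ],
    [ benchcpu => "4e5" ],    # kept as a string so it is sent as written
    [ load     => 5 ],
    [ "nusers" ],             # no arguments: rendered as nusers()
);
print "$command\n";
```

The resulting string can then be passed as the command argument to ClusterPunch.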


Server Response

The structure of the hash returned by ClusterPunch has the format

    %RESPONSE = (node1name=>{response1hash},
                 node2name=>{response2hash},
                 ...,
                 nodeNname=>{responseNhash});

where each response hash has the format

    $response1hash = {'live'=>1,
                      'quantity'=>value,
                      ...,
                      'quantity'=>value}

where 'quantity' can be one of nusers, load, cpu, io, mflops, memfree, all, lsof, nrunning, mem, mhz. If a server responds, it always sets the 'live' key to 1, so you can pass an empty command ``'' to quickly find out which servers are live.
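
Because not every host necessarily reports every quantity, it can help to filter the response before sorting it. The sketch below works on mock data standing in for the return value of ClusterPunch; the host names, the values and the hosts_by helper are invented for illustration.

```perl
use strict;
use warnings;

# Mock data standing in for the hash returned by ClusterPunch();
# host names and values are invented for illustration.
my %RESPONSE = (
    node1 => { live => 1, load => 0.42, nusers => 3 },
    node2 => { live => 1, load => 1.87 },
    node3 => { live => 1, load => 0.05, nusers => 0 },
);

# Hosts that answered at all, in alphabetical order.
my @live = sort { $a cmp $b } grep { $RESPONSE{$_}{live} } keys %RESPONSE;

# Hypothetical helper: hosts that reported $quantity, ordered by its value.
# Hosts without the key are skipped rather than compared as undef.
sub hosts_by {
    my ($quantity, $response) = @_;
    return sort { $response->{$a}{$quantity} <=> $response->{$b}{$quantity} }
           grep { defined $response->{$_}{$quantity} } keys %$response;
}

my @by_load   = hosts_by(load   => \%RESPONSE);
my @by_nusers = hosts_by(nusers => \%RESPONSE);

print "live:      @live\n";
print "by load:   @by_load\n";
print "by nusers: @by_nusers\n";
```

Skipping hosts that lack the key avoids "uninitialized value" warnings in the numeric comparison.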


EXAMPLES

Below are examples of various strings that can be passed to the command key. Please use these cautiously. If you pass ridiculous arguments, you can effectively shut down your cluster. There is no argument checking at the clusterpunch server level, so abusing benchmem() can cause your nodes to start swapping and die.


benchcpu(1e6)

Generate 1,000,000 random numbers. This takes about 2.5 seconds on a 1GHz PIII chip. You can extend this test to last longer if you want to sample the nodes more finely.


benchio(25000,5)

Write five 25MB files from /dev/zero to /tmp. You should be aware that most kernels implement a buffer system which speeds up the writing of smaller files. You may hit the limit of this buffering somewhere around 40MB. For example, for our nodes writing a 30MB file takes about 0.3 seconds while writing a 70MB file takes about 4 seconds.
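
Since the server does not check arguments, one option is to check them on the client before sending. The following is a sketch under stated assumptions: oversized_calls and the ceilings in %MAX_ARGS are hypothetical, are not part of clusterpunchclient, and the numbers are placeholders that should be tuned for your own nodes.

```perl
use strict;
use warnings;

# Assumed per-argument ceilings, chosen only for illustration.
my %MAX_ARGS = (
    benchio  => [ 100_000, 20 ],   # kbsize, iter
    benchmem => [ 1e7,     100 ],  # arrayelem, iter
    benchcpu => [ 1e7 ],           # iter
);

# Hypothetical helper: split "cmd1(args);cmd2(args)" into calls and return
# those with any argument above its assumed ceiling. Commands without an
# entry in %MAX_ARGS are passed through unchecked.
sub oversized_calls {
    my ($command) = @_;
    my @flagged;
    for my $call (split /;/, $command) {
        my ($name, $args) = $call =~ /^\s*(\w+)\s*(?:\((.*)\))?\s*$/ or next;
        my $max = $MAX_ARGS{$name} or next;
        my @args = split /,/, $args // "";
        push @flagged, $call
            if grep { defined $args[$_] && $args[$_] > $max->[$_] } 0 .. $#$max;
    }
    return @flagged;
}

my @bad = oversized_calls("benchio(25000,5);benchcpu(1e6);benchmem(1e9,10)");
print "refusing to send: @bad\n" if @bad;
```

With these placeholder ceilings, the two example commands above pass and only the benchmem(1e9,10) call is flagged.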


SEE ALSO


Daemons

clusterpunchserver, clusterpunch.start, clusterpunch.shutdown


API

clusterpunchclient.pm


Utilities

benchdriver, clusterbench, clusterlogin, clusternodecount, clustersnapshot, clusterwebimg


AUTHOR

Martin Krzywinski (martink@bcgsc.ca)