- 5 . 1 CCS: Starting a Server
- 5 . 2 CCS: Client-Side
- 5 . 3 CCS: Server-Side
- 5 . 4 CCS: system handlers
- 5 . 5 CCS: network protocol
- 5 . 6 CCS: Authentication
The CCS module is split into two parts- client and server. The server side is the interface used by a Converse program; the client side is used by arbitrary (non-Converse ) programs. The following sections describe both these parts.
A CCS client accesses a running Converse program by talking to
, which receives the CCS requests and relays them
to the appropriate processor. The
for netlrts- versions, and is the first processor for all other versions.
charmrun also accepts the CCS options:
charmrun pgmname +pN charmrun-opts pgm-opts
: open a CCS server on any TCP port number
: open the given TCP port as a CCS server
: accept authenticated queries
As the parallel job starts, it will print a line giving the IP address and TCP port number of the new CCS server. The format is: ``ccs: Server IP = , Server port = $'', where is a dotted decimal version of the server IP address, and is the decimal port number.
A CCS client connects to a CCS server, asks a server PE to execute a pre-registered handler, and receives the response data. The CCS client may be written in any language (see CCS network protocol, below), but a C interface (files ``ccs-client.c'' and ``ccs-client.h'') and Java interface (file ``CcsServer.java'') are available in the charm include directory.
The C routines use the skt_abort error-reporting strategy; see ``sockRoutines.h'' for details. The C client API is:
void CcsConnect(CcsServer *svr, char *host, int port);
Connect to the given CCS server. svr points to a pre-allocated CcsServer structure.
void CcsConnectIp(CcsServer *svr, int ip, int port);
As above, but a numeric IP is specified.
int CcsNumNodes(CcsServer *svr);
int CcsNumPes(CcsServer *svr);
int CcsNodeFirst(CcsServer *svr, int node);
int CcsNodeSize(CcsServer *svr,int node);
These functions return information about the parallel machine; they are equivalent to the Converse calls CmiNumNodes, CmiNumPes, CmiNodeFirst, and CmiNodeSize.
void CcsSendRequest(CcsServer *svr, char *hdlrID, int pe,
unsigned int size, const char *msg);
Ask the server to execute the handler hdlrID on the given processor. The handler is passed the given data as a message. The data may be in any desired format (including binary).
int CcsSendBroadcastRequest(CcsServer *svr, const char *hdlrID,
int size, const void *msg);
As CcsSendRequest, only that the handler hdlrID is invoked on all processors.
int CcsSendMulticastRequest(CcsServer *svr, const char *hdlrID, int npes,
int *pes, int size, const void *msg);
As CcsSendRequest, only that the handler hdlrID is invoked on the processors specified in the array pes (of size npes).
int CcsRecvResponse(CcsServer *svr,
unsigned int maxsize, char *recvBuffer, int timeout);
Receive a response to the previous request in-place. Timeout gives the number of seconds to wait before returning 0; otherwise the number of bytes received is returned.
int CcsRecvResponseMsg(CcsServer *svr,
unsigned int *retSize,char **newBuf, int timeout);
As above, but receive a variable-length response. The returned buffer must be free()'d after use.
int CcsProbe(CcsServer *svr);
Return 1 if a response is available; otherwise 0.
void CcsFinalize(CcsServer *svr);
Closes connection and releases server.
The Java routines throw an IOException on network errors. Use javadoc on CcsServer.java for the interface, which mirrors the C version above.
Once a request arrives on a CCS server socket, the CCS server runtime looks up the appropriate handler and calls it. If no handler is found, the runtime prints a diagnostic and ignores the message.
CCS calls its handlers in the usual Converse fashion- the request data is passed as a newly-allocated message, and the actual user data begins CmiMsgHeaderSizeBytes into the message. The handler is responsible for CmiFree'ing the passed-in message.
The interface for the server side of CCS is included in ``converse.h''; if CCS is disabled (in conv-mach.h), all CCS routines become macros returning 0.
The handler registration interface is:
void CcsUseHandler(char *id, int hdlr);
int CcsRegisterHandler(char *id, CmiHandler fn);
Associate this handler ID string with this function. hdlr is a Converse handler index; fn is a function pointer. The ID string cannot be more than 32 characters, including the terminating NULL.
After a handler has been registered to CCS, the user can also setup a merging function. This function will be passed in to CmiReduce to combine replies to multicast and broadcast requests.
void CcsSetMergeFn(const char *name, CmiReduceMergeFn newMerge);
Associate the given merge function to the CCS identified by id. This will be used for CCS request received as broadcast or multicast.
These calls can be used from within a CCS handler:
Return 1 if CCS routines are available (from conv-mach.h). This routine does not determine if a CCS server port is actually open.
Return 1 if this handler was called via CCS; 0 if it was called as the result of a normal Converse message.
void CcsCallerId(skt_ip_t *pip, unsigned int *pport);
Return the IP address and TCP port number of the CCS client that invoked this method. Can only be called from a CCS handler invoked remotely.
void CcsSendReply(int size, const void *reply);
Send the given data back to the client as a reply. Can only be called from a CCS handler invoked remotely. In case of broadcast or multicast CCS requests, the handlers in all processors involved must call this function.
Allows a CCS reply to be delayed until after the handler has completed. Returns a token used below.
void CcsSendDelayedReply(CcsDelayedReply d,int size, const void *reply);
Send a CCS reply for the given request. Unlike CcsSendReply, can be invoked from any handler on any processor.
ccs_getinfo Takes an empty message, responds with information about the parallel job. The response is in the form of network byte order (big-endian) 4-byte integers: first the number of parallel nodes, then the number of processors on each node. This handler is invoked by the client routine CcsConnect.
Allows a client to be notified when a parallel run exits (for any reason).
Takes one network byte order (big-endian) 4-byte integer: a TCP port
number. The runtime writes ``die
n'' to this port before exiting. There is no response data.
perf_monitor Takes an empty message, responds (after a delay) with performance data. When CMK_WEB_MODE is enabled in conv-mach.h, the runtime system collects performance data. Every 2 seconds, this data is collected on processor 0 and sent to any clients that have invoked perf_monitor on processor 0. The data is returned in ASCII format with the leading string "perf", and for each processor the current load (in percent) and scheduler message queue length (in messages). Thus a heavily loaded, two-processor system might reply with the string ``perf 98 148230 100 385401''.
A CCS request arrives as a new TCP connection to the CCS server port. The client speaks first, sending a request header and then the request data. The server then sends the response header and response data, and closes the connection. Numbers are sent as network byte order (big-endian) 4-byte integers- network binary integers.
The request header has three fields: the number of bytes of request data, the (0-based) destination processor number, and the CCS handler identifier string. The byte count and processor are network binary integers (4 bytes each), the CCS handler ID is zero-terminated ASCII text (32 bytes); for a total request header length of 40 bytes. The remaining request data is passed directly to the CCS handler.
The response header consists of a single network binary integer- the length in bytes of the response data to follow. The header is thus 4 bytes long. If there is no response data, this field has value 0.
, passed to '++server-auth', is a configuration file that enables authentication and describes the authentication to perform.
The configuration file is line-oriented ASCII text, consisting of security level / key pairs. The security level is an integer from 0 (the default) to 255. Any security levels not listed in the file are disallowed.
The key is the 128-bit secret key used to authenticate CCS clients for that security level. It is either up to 32 hexadecimal digits of key data or the string "OTP". "OTP" stands for One Time Pad, which will generate a random key when the server is started. This key is printed out at job startup with the format "CCS_OTP_KEY Level key: " where is the security level in decimal and is 32 hexadecimal digits of key data.
For example, a valid CCS authentication file might consist of the single line "0 OTP", indicating that the default security level 0 requires a randomly generated key. All other security levels are disallowed.