2.  Machine Interface and Scheduler

This chapter describes two of Converse's modules: the CMI and the CSD. Together, they serve to transmit messages and to schedule their delivery. First, we describe the machine model assumed by Converse.


2.1  Machine Model

Converse treats the parallel machine as a collection of nodes, where each node comprises a number of processors that share memory. In some cases, the number of processors per node may be exactly one (e.g., distributed-memory multicomputers such as the IBM SP). In addition, each processor may run multiple threads, which share code and data but have separate stacks. Functions and macros are provided for handling shared memory across processors and for querying node information. These are discussed in Section 2.13.


2.2  Defining Handler Numbers

When a message arrives at a processor, it triggers the execution of a handler function, not unlike a UNIX signal handler. The handler function receives, as an argument, a pointer to the message. The message itself specifies which handler function is to be called when it arrives. Messages are contiguous sequences of bytes with two parts: the header and the data. The data may contain anything you like. The header contains a handler number, which specifies the handler function to be executed when the message arrives. Before you can send a message, you have to define the handler numbers. Converse maintains a table mapping handler numbers to function pointers, and each processor has its own copy of the mapping. One caution applies to this approach: it is the user's responsibility to ensure that all processors have identical mappings. This is easy to do, but the user must be aware that it is (usually) required.

The following functions are provided to define the handler numbers:

typedef void (*CmiHandler)(void *)
Functions that handle Converse messages must be of this type.

int CmiRegisterHandler(CmiHandler h)
This represents the standard technique for associating numbers with functions. To use it, the Converse user registers each function, one by one, using CmiRegisterHandler. One must register exactly the same functions in exactly the same order on all processors. The system assigns monotonically increasing numbers to the functions, the same numbers on all processors. This ensures global consistency. CmiRegisterHandler returns the number chosen for the function being registered.

int CmiRegisterHandlerGlobal(CmiHandler h)
This represents a second registration technique. The Converse user registers functions on processor zero using CmiRegisterHandlerGlobal, and is then responsible for broadcasting those handler numbers to the other processors and installing them using CmiNumberHandler (below). The user should take care not to invoke those handlers until they are fully installed.

int CmiRegisterHandlerLocal(CmiHandler h)
This function is used when one wishes to register functions in a manner that is not consistent across processors. This function chooses a locally-meaningful number for the function, and records it locally. No attempt is made to ensure consistency across processors.

void CmiNumberHandler(int n, CmiHandler h)
Forces the system to associate the specified handler number n with the specified handler function h. If the handler number n was previously mapped to some other function, that old mapping is forgotten. The mapping this function creates is local to the current processor. CmiNumberHandler can be useful in combination with CmiRegisterHandlerGlobal. It can also be used to implement user-defined numbering schemes; such schemes should keep in mind that the size of the table holding the mapping is proportional to the largest handler number -- do not use big numbers!

( Note: Of the three registration methods, the CmiRegisterHandler method is by far the simplest, and is strongly encouraged. The others are primarily to ease the porting of systems that already use similar registration techniques. One may use all three registration methods in a program. The system guarantees that no numbering conflicts will occur as a result of this combination.)


2.3  Writing Handler Functions

A message handler function is just a C function that accepts a void pointer (to a message buffer) as an argument, and returns nothing. The handler may use the message buffer for any purpose, but is responsible for eventually deleting the message using CmiFree.

2.4  Building Messages

To send a message, one first creates a buffer to hold it. The buffer must be large enough to hold the header and the data. The buffer can be in any kind of memory: it could be a local variable, a global, allocated with malloc, or allocated with CmiAlloc. The Converse user fills the buffer with the message data, then puts a handler number in the message, thereby specifying which handler function the message should trigger when it arrives. Finally, one uses a message-transmission function to send the message.

The following functions are provided to help build message buffers:

void *CmiAlloc(int size)
Allocates size bytes on the heap and returns a pointer to the usable space. Some message-sending functions accept only message buffers that were allocated with CmiAlloc, so this is the preferred way to allocate message buffers. The returned pointer points to the message header; the user data follows it. See CmiMsgHeaderSizeBytes below.

void CmiFree(void *ptr)
This function frees the memory pointed to by ptr. ptr should be a pointer that was previously returned by CmiAlloc.

#define CmiMsgHeaderSizeBytes
This constant contains the size of the message header. When one allocates a message buffer, one must set aside enough space for both the header and the data; this macro helps do so. For example, to allocate space for an array of 100 ints, one should call: ``char *myMsg = CmiAlloc(100*sizeof(int) + CmiMsgHeaderSizeBytes)''

void CmiSetHandler(int *MessageBuffer, int HandlerId)
This macro sets the handler number of a message to HandlerId.

int CmiGetHandler(int *MessageBuffer)
This call returns the handler of a message in the form of a handler number.

CmiHandler CmiGetHandlerFunction(int *MessageBuffer)
This call returns the handler of a message in the form of a function pointer.

2.5  Sending Messages

The following functions allow you to send messages. Our model is that the data starts out in the message buffer, and from there gets transferred ``into the network''. The data stays ``in the network'' for a while, and eventually appears on the target processor. Using that model, each of these send-functions is a device that transfers data into the network. None of these functions wait for the data to be delivered.

On some machines, the network accepts data rather slowly. We don't want the process to sit idle, waiting for the network to accept the data. So, we provide several variations on each send function:

void CmiSyncSend(unsigned int destPE, unsigned int size, void *msg)
Sends msg of size size bytes to processor destPE. When it returns, you may reuse the message buffer.

void CmiSyncNodeSend(unsigned int destNode, unsigned int size, void *msg)
Sends msg of size size bytes to node destNode. When it returns, you may reuse the message buffer.

void CmiSyncSendAndFree(unsigned int destPE, unsigned int size, void *msg)
Sends msg of size size bytes to processor destPE. When it returns, the message buffer has been freed using CmiFree.

void CmiSyncNodeSendAndFree(unsigned int destNode, unsigned int size, void *msg)
Sends msg of size size bytes to node destNode. When it returns, the message buffer has been freed using CmiFree.

CmiCommHandle CmiAsyncSend(unsigned int destPE, unsigned int size, void *msg)
Sends msg of size size bytes to processor destPE. It returns a communication handle which can be tested using CmiAsyncMsgSent: when this returns true, you may reuse the message buffer. If the returned communication handle is 0, the message buffer can be reused immediately, saving a call to CmiAsyncMsgSent.

CmiCommHandle CmiAsyncNodeSend(unsigned int destNode, unsigned int size, void *msg)
Sends msg of size size bytes to node destNode. It returns a communication handle which can be tested using CmiAsyncMsgSent: when this returns true, you may reuse the message buffer. If the returned communication handle is 0, the message buffer can be reused immediately, saving a call to CmiAsyncMsgSent.

void CmiSyncVectorSend(int destPE, int len, int sizes[], char *msgComps[])
Concatenates several pieces of data and sends them to processor destPE. The data consists of len pieces residing in different areas of memory, which are logically concatenated. The msgComps array contains pointers to the pieces; the size of msgComps[i] is taken from sizes[i]. When it returns, sizes, msgComps and the message components specified in msgComps can be immediately reused.

void CmiSyncVectorSendAndFree(int destPE, int len, int sizes[], char *msgComps[])
Concatenates several pieces of data and sends them to processor destPE. The data consists of len pieces residing in different areas of memory, which are logically concatenated. The msgComps array contains pointers to the pieces; the size of msgComps[i] is taken from sizes[i]. The message components specified in msgComps are CmiFreed by this function; therefore, they should be dynamically allocated using CmiAlloc. However, the sizes and msgComps arrays themselves are not freed.

CmiCommHandle CmiAsyncVectorSend(int destPE, int len, int sizes[], char *msgComps[])
Concatenates several pieces of data and sends them to processor destPE. The data consists of len pieces residing in different areas of memory, which are logically concatenated. The msgComps array contains pointers to the pieces; the size of msgComps[i] is taken from sizes[i]. The individual pieces of data, as well as the arrays sizes and msgComps, should not be overwritten or freed before the communication is complete. This function returns a communication handle which can be tested using CmiAsyncMsgSent: when this returns true, the input parameters can be reused. If the returned communication handle is 0, the input parameters can be reused immediately, saving a call to CmiAsyncMsgSent.

int CmiAsyncMsgSent(CmiCommHandle handle)
Returns true if the communication specified by the given CmiCommHandle has proceeded to the point where the message buffer can be reused.

void CmiReleaseCommHandle(CmiCommHandle handle)
Releases the communication handle handle and associated resources. It does not free the message buffer.

void CmiMultipleSend(unsigned int destPE, int len, int sizes[], char *msgComps[])
This function allows the user to send multiple messages destined for the SAME PE in one go. This is more efficient than sending each message to the destination separately. This function assumes that the handlers that are to receive these messages have already been set; if this is not done, the behavior of the function is undefined.

The destPE parameter identifies the destination processor. The len argument gives the number of messages to be sent in one go. The sizes[] array holds the sizes of each of these messages, and the msgComps[] array holds the messages themselves. The indexing in each array runs from 0 to len - 1. (Note: Before calling this function, the program must initialize the system to provide this service by calling CmiInitMultipleSendRoutine. Unless that function is called, the system will not be able to provide the service.)

2.6  Broadcasting Messages

void CmiSyncBroadcast(unsigned int size, void *msg)
Sends msg of length size bytes to all processors excluding the processor on which the caller resides.

void CmiSyncNodeBroadcast(unsigned int size, void *msg)
Sends msg of length size bytes to all nodes excluding the node on which the caller resides.

void CmiSyncBroadcastAndFree(unsigned int size, void *msg)
Sends msg of length size bytes to all processors excluding the processor on which the caller resides. Uses CmiFree to deallocate the message buffer for msg when the broadcast completes. Therefore msg must point to a buffer allocated with CmiAlloc.

void CmiSyncNodeBroadcastAndFree(unsigned int size, void *msg)
Sends msg of length size bytes to all nodes excluding the node on which the caller resides. Uses CmiFree to deallocate the message buffer for msg when the broadcast completes. Therefore msg must point to a buffer allocated with CmiAlloc.

void CmiSyncBroadcastAll(unsigned int size, void *msg)
Sends msg of length size bytes to all processors including the processor on which the caller resides. This function does not free the message buffer for msg.

void CmiSyncNodeBroadcastAll(unsigned int size, void *msg)
Sends msg of length size bytes to all nodes including the node on which the caller resides. This function does not free the message buffer for msg.

void CmiSyncBroadcastAllAndFree(unsigned int size, void *msg)
Sends msg of length size bytes to all processors including the processor on which the caller resides. This function frees the message buffer for msg before returning, so msg must point to a dynamically allocated buffer.

void CmiSyncNodeBroadcastAllAndFree(unsigned int size, void *msg)
Sends msg of length size bytes to all nodes including the node on which the caller resides. This function frees the message buffer for msg before returning, so msg must point to a dynamically allocated buffer.

CmiCommHandle CmiAsyncBroadcast(unsigned int size, void *msg)
Initiates an asynchronous broadcast of message msg of length size bytes to all processors excluding the processor on which the caller resides. It returns a communication handle which can be used to check the status of this send using CmiAsyncMsgSent. If the returned communication handle is 0, the message buffer can be reused immediately, saving a call to CmiAsyncMsgSent. msg should not be overwritten or freed before the communication is complete.

CmiCommHandle CmiAsyncNodeBroadcast(unsigned int size, void *msg)
Initiates an asynchronous broadcast of message msg of length size bytes to all nodes excluding the node on which the caller resides. It returns a communication handle which can be used to check the status of this send using CmiAsyncMsgSent. If the returned communication handle is 0, the message buffer can be reused immediately, saving a call to CmiAsyncMsgSent. msg should not be overwritten or freed before the communication is complete.

CmiCommHandle CmiAsyncBroadcastAll(unsigned int size, void *msg)
Initiates an asynchronous broadcast of message msg of length size bytes to all processors including the processor on which the caller resides. It returns a communication handle which can be used to check the status of this send using CmiAsyncMsgSent. If the returned communication handle is 0, the message buffer can be reused immediately, saving a call to CmiAsyncMsgSent. msg should not be overwritten or freed before the communication is complete.

CmiCommHandle CmiAsyncNodeBroadcastAll(unsigned int size, void *msg)
Initiates an asynchronous broadcast of message msg of length size bytes to all nodes including the node on which the caller resides. It returns a communication handle which can be used to check the status of this send using CmiAsyncMsgSent. If the returned communication handle is 0, the message buffer can be reused immediately, saving a call to CmiAsyncMsgSent. msg should not be overwritten or freed before the communication is complete.


2.7  Multicasting Messages

typedef ... CmiGroup;
A CmiGroup represents a set of processors. It is an opaque type. Group IDs are useful for the multicast functions below.

CmiGroup CmiEstablishGroup(int npes, int *pes);
Converts an array of processor numbers into a group ID. Group IDs are useful for the multicast functions below. Caution: this call uses up some resources. In particular, establishing a group uses some network bandwidth (one broadcast's worth) and a small amount of memory on all processors.

void CmiSyncMulticast(CmiGroup grp, unsigned int size, void *msg)
Sends msg of length size bytes to all members of the specified group. Group IDs are created using CmiEstablishGroup.

void CmiSyncMulticastAndFree(CmiGroup grp, unsigned int size, void *msg)
Sends msg of length size bytes to all members of the specified group. Uses CmiFree to deallocate the message buffer for msg when the multicast completes. Therefore msg must point to a buffer allocated with CmiAlloc. Group IDs are created using CmiEstablishGroup.

CmiCommHandle CmiAsyncMulticast(CmiGroup grp, unsigned int size, void *msg)
(Note: Not yet implemented.) Initiates an asynchronous multicast of message msg of length size bytes to all members of the specified group. It returns a communication handle which can be used to check the status of this send using CmiAsyncMsgSent. If the returned communication handle is 0, the message buffer can be reused immediately, saving a call to CmiAsyncMsgSent. msg should not be overwritten or freed before the communication is complete. Group IDs are created using CmiEstablishGroup.

void CmiSyncListSend(int npes, int *pes, unsigned int size, void *msg)
Sends msg of length size bytes to npes processors in the array pes.

void CmiSyncListSendAndFree(int npes, int *pes, unsigned int size, void *msg)
Sends msg of length size bytes to the npes processors in the array pes. Uses CmiFree to deallocate the message buffer for msg when the multicast completes. Therefore, msg must point to a buffer allocated with CmiAlloc.

CmiCommHandle CmiAsyncListSend(int npes, int *pes, unsigned int size, void *msg)
Initiates an asynchronous multicast of message msg of length size bytes to the npes processors in the array pes. It returns a communication handle which can be used to check the status of this send using CmiAsyncMsgSent. If the returned communication handle is 0, the message buffer can be reused immediately, saving a call to CmiAsyncMsgSent. msg should not be overwritten or freed before the communication is complete.


2.8  Reducing Messaging

Reductions are operations in which each participating processor contributes a message (or user data structure). All these contributions are merged according to a merge function provided by the user, and a Converse handler is then invoked with the resulting message. Reductions can be on the entire set of processors, or on a subset of the whole. Currently, reductions are implemented only on processor sets; no equivalent exists for SMP nodes.

There are eight functions used to deposit a message into the system, summarized in Table  2.1 . Half of them receive as contribution a Converse message (with a Converse header at its beginning). This message must have already been set for delivery to the desired handler. The other half (ending with ``Struct'') receives a pointer to a data structure allocated by the user. This second version may allow the user to write a simpler merging function. For instance, the data structure could be a tree that can be easily expanded by adding more nodes.


Table 2.1: Reduction functions in Converse

          global            global with ID      processor set        CmiGroup
message   CmiReduce         CmiReduceID         CmiListReduce        CmiGroupReduce
data      CmiReduceStruct   CmiReduceStructID   CmiListReduceStruct  CmiGroupReduceStruct


The signatures for the functions in Table  2.1 are:

void CmiReduce(void *msg, int size, CmiReduceMergeFn mergeFn);
void CmiReduceStruct(void *data, CmiReducePupFn pupFn, CmiReduceMergeFn mergeFn, CmiHandler dest, CmiReduceDeleteFn deleteFn);
void CmiReduceID(void *msg, int size, CmiReduceMergeFn mergeFn, CmiReductionID id);
void CmiReduceStructID(void *data, CmiReducePupFn pupFn, CmiReduceMergeFn mergeFn, CmiHandler dest, CmiReduceDeleteFn deleteFn, CmiReductionID id);
void CmiListReduce(int npes, int *pes, void *msg, int size, CmiReduceMergeFn mergeFn, CmiReductionID id);
void CmiListReduceStruct(int npes, int *pes, void *data, CmiReducePupFn pupFn, CmiReduceMergeFn mergeFn, CmiHandler dest, CmiReduceDeleteFn deleteFn, CmiReductionID id);
void CmiGroupReduce(CmiGroup grp, void *msg, int size, CmiReduceMergeFn mergeFn, CmiReductionID id);
void CmiGroupReduceStruct(CmiGroup grp, void *data, CmiReducePupFn pupFn, CmiReduceMergeFn mergeFn, CmiHandler dest, CmiReduceDeleteFn deleteFn, CmiReductionID id);

In all the above, msg is the Converse message deposited by the local processor, size is the size of the message msg, and data is a pointer to the user-allocated data structure deposited by the local processor. dest is the CmiHandler where the final message shall be delivered. It is explicitly passed in ``Struct'' functions only, since for the message versions it is taken from the header of msg. Moreover, several other function pointers are passed in by the user:

void * (*mergeFn)(int *size, void *local, void **remote, int count)
Prototype for a CmiReduceMergeFn function pointer argument. This function is used in all the CmiReduce forms to merge the local message/data structure deposited on a processor with all the messages incoming from the children processors of the reduction spanning tree. The input parameters are, in order: the size of the local data for message reductions (always zero for struct reductions); the local data itself (the exact same pointer passed as the first parameter of CmiReduce and similar); a pointer to an array of incoming messages; and the number of elements in that array. The function returns a pointer to a freshly allocated message (or data structure, for the Struct forms) corresponding to the merge of all the messages. When performing message reductions, this function is also responsible for updating the integer pointed to by size to the new size of the returned message. All the messages in the remote array are deleted by the system; the data pointed to by the first parameter should be deleted by this function. If the data can be merged ``in-place'' by modifying or augmenting local, the function can return the same pointer to local, which is then considered freshly allocated. Each element in remote is the complete incoming message (including the Converse header) for message reductions, and the data as packed by the pup function (without any additional header) for struct reductions.

void (*pupFn)(pup_er p, void *data)
Prototype for a CmiReducePupFn function pointer argument. This function uses the PUP framework to pup the given data into a message for sending across the network. The data can be either the same data passed as the first parameter of any ``Struct'' function, or the return value of the merge function. It will be called for sizing and packing. (Note: it will not be called for unpacking.)

void (*deleteFn)(void *ptr)
Prototype for a CmiReduceDeleteFn function pointer argument. This function is used to delete either the data structure passed in as first parameter of any ``Struct'' function, or the return of the merge function. It can be as simple as ``free'' or as complicated as needed to delete complex structures. If this function is NULL, the data structure will not be deleted, and the program can continue to use it. Note: even if this function is NULL, the input data structure may still be modified by the merge function.

CmiReduce and CmiReduceStruct are the simplest reduction functions; they reduce the deposited message/data across all the processors in the system. Each processor must call this function exactly once. Multiple reductions can be invoked without waiting for previous ones to finish, but the user is responsible for calling CmiReduce/CmiReduceStruct in the same order on every processor. (Note: CmiReduce and CmiReduceStruct are not interchangeable. Either every processor calls CmiReduce, or every processor calls CmiReduceStruct.)

In situations where it is not possible to guarantee the order of reductions, the user may use CmiReduceID or CmiReduceStructID. These functions have an additional parameter of type CmiReductionID which uniquely identifies the reduction and matches contributions correctly. (Note: No two reductions can be active at the same time with the same CmiReductionID; it is up to the user to guarantee this.)

A CmiReductionID can be obtained by the user in three ways, using one of the following functions:

CmiReductionID CmiGetGlobalReduction()
This function must be called on every processor, and in the same order if called multiple times. It would generally be called in initialization code, which can set aside some CmiReductionIDs for later use.

CmiReductionID CmiGetDynamicReduction()
This function may be called only on processor zero. It returns a unique ID, and it is up to the user to distribute this ID to any processor that needs it.

void CmiGetDynamicReductionRemote(int handlerIdx, int pe, int dataSize, void *data)
This function may be called on any processor. The produced CmiReductionID is returned on the specified pe by sending a message to the specified handlerIdx. If pe is -1, all processors receive the notification message. data can be any data structure that the user wants to receive on the specified handler (for example, to differentiate between requests), and dataSize is its size in bytes. If dataSize is zero, data is ignored. The message received by handlerIdx consists of the standard Converse header, followed by the requested CmiReductionID (represented as a 4-byte integer the user can cast to a CmiReductionID), a 4-byte integer containing dataSize, and the data itself.

The other four functions (CmiListReduce, CmiListReduceStruct, CmiGroupReduce, CmiGroupReduceStruct) are used for reductions over subsets of processors. They all require a CmiReductionID that the user must obtain in one of the ways described above. The user is also responsible for ensuring that no two reductions use the same CmiReductionID simultaneously. The first two functions receive the subset description as a processor list (pes) of size npes. The last two receive it as a previously established CmiGroup (see Section 2.7).


2.9  Scheduling Messages

The scheduler queue is a powerful priority queue. The following functions can be used to place messages into the scheduler queue. These messages are treated very much like newly-arrived messages: when they reach the front of the queue, they trigger handler functions, just like messages transmitted with CMI functions. Note that unlike the CMI send functions, these cannot move messages across processors.

Every message inserted into the queue has a priority associated with it. Converse priorities are arbitrary-precision numbers between 0 and 1. Priorities closer to 0 get processed first, priorities closer to 1 get processed last. Arbitrary-precision priorities are very useful in AI search-tree applications. Suppose we have a heuristic suggesting that tree node N1 should be searched before tree node N2. We therefore designate that node N1 and its descendants will use high priorities, and that node N2 and its descendants will use lower priorities. We have effectively split the range of possible priorities in two. If several such heuristics fire in sequence, we can easily split the priority range in two enough times that no significant bits remain, and the search begins to fail for lack of meaningful priorities to assign. The solution is to use arbitrary-precision priorities, aka bitvector priorities.

These arbitrary-precision numbers are represented as bit-strings: for example, the bit-string ``0011000101'' represents the binary number (.0011000101). The format of the bit-string is as follows: the bit-string is represented as an array of unsigned integers. The most significant bit of the first integer contains the first bit of the bitvector. The remaining bits of the first integer contain the next 31 bits of the bitvector. Subsequent integers contain 32 bits each. If the size of the bitvector is not a multiple of 32, then the last integer contains 0 bits for padding in the least-significant bits of the integer.

Some people only want regular integers as priorities. For simplicity's sake, we provide an easy way to convert integer priorities to Converse's built-in representation.

In addition to priorities, you may choose to enqueue a message ``LIFO'' or ``FIFO''. Enqueueing a message ``FIFO'' simply pushes it behind all the other messages of the same priority. Enqueueing a message ``LIFO'' pushes it in front of other messages of the same priority.

Messages sent using the CMI functions take precedence over everything in the scheduler queue, regardless of priority.

A recent addition to the Converse scheduling mechanisms is node-level scheduling, designed to support low-overhead programming for SMP clusters. These functions have ``Node'' in their names. All processors within a node have access to the node-level scheduler's queue, and thus a message enqueued in a node-level queue may be handled by any processor within that node. When deciding which message to process next, i.e. from the processor's own queue or from the node-level queue, a quick priority check is performed internally; thus a processor views the scheduler's queue as a single prioritized queue that includes messages directed at that processor and messages from the node-level queue, sorted according to priority.

void CsdEnqueueGeneral(void *Message, int strategy, int priobits, int *prioptr)
This call enqueues a message into the processor-level scheduler's queue, to be sorted according to its priority and the queueing strategy. The meaning of the priobits and prioptr fields depends on the value of strategy.

void CsdNodeEnqueueGeneral(void *Message, int strategy, int priobits, int *prioptr)
This call enqueues a message into the node-level scheduler's queue, to be sorted according to its priority and the queueing strategy. The meaning of the priobits and prioptr fields depends on the value of strategy, as for CsdEnqueueGeneral; the shorthand macros below, for instance, use the CQS_QUEUEING_FIFO and CQS_QUEUEING_LIFO strategies.

Caution: the priority itself is not copied by the scheduler. Therefore, if you pass a pointer to a priority into the scheduler, you must not overwrite or free that priority until after the message has emerged from the scheduler's queue. It is normal to store the priority in the message itself, though it is up to the user to arrange storage for it.

void CsdEnqueue(void *Message)
This macro is a shorthand for


 CsdEnqueueGeneral(Message, CQS_QUEUEING_FIFO,0, NULL) 

provided here for backward compatibility.

void CsdNodeEnqueue(void *Message)
This macro is a shorthand for


 CsdNodeEnqueueGeneral(Message, CQS_QUEUEING_FIFO,0, NULL) 

provided here for backward compatibility.

void CsdEnqueueFifo(void *Message)
This macro is a shorthand for


 CsdEnqueueGeneral(Message, CQS_QUEUEING_FIFO,0, NULL)

provided here for backward compatibility.

void CsdNodeEnqueueFifo(void *Message)
This macro is a shorthand for


 CsdNodeEnqueueGeneral(Message, CQS_QUEUEING_FIFO,0, NULL)

provided here for backward compatibility.

void CsdEnqueueLifo(void *Message)
This macro is a shorthand for


 CsdEnqueueGeneral(Message, CQS_QUEUEING_LIFO,0, NULL)

provided here for backward compatibility.

void CsdNodeEnqueueLifo(void *Message)
This macro is a shorthand for


 CsdNodeEnqueueGeneral(Message, CQS_QUEUEING_LIFO,0, NULL) 

provided here for backward compatibility.

int CsdEmpty()
This function returns a non-zero integer when the scheduler's processor-level queue is empty, and zero otherwise.

int CsdNodeEmpty()
This function returns a non-zero integer when the scheduler's node-level queue is empty, and zero otherwise.


2.10  Polling for Messages

As we stated earlier, Converse messages trigger handler functions when they arrive. For this to work, the processor must occasionally poll for messages. When the user starts Converse, he can put it into one of several modes. In the normal mode, message polling happens automatically. However, user-calls-scheduler mode is designed to let the user poll manually. To do this, the user must use one of two polling functions: CmiDeliverMsgs or CsdScheduler. CsdScheduler is more general: it will notice any Converse event. CmiDeliverMsgs is a lower-level function that ignores all events except recently-arrived messages (in particular, it ignores messages in the scheduler queue). You can save a tiny amount of overhead by using the lower-level function. We recommend CsdScheduler for all applications except those using only the lowest level of Converse, the CMI. A third polling function, CmiDeliverSpecificMsg, is used when you know the exact event you want to wait for: it does not allow any other event to occur.

In each iteration, the scheduler first looks for any message that has arrived from another processor and delivers it. If there is none, it selects a message from the locally enqueued messages and delivers it.
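The priority between the two sources described above can be sketched as plain C. The queue type and helper names below are hypothetical stand-ins, not Converse symbols; they exist only to make the one-iteration rule concrete:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in queues illustrating one scheduler iteration:
 * the network queue is checked first; only if it is empty does the
 * scheduler fall back to the locally enqueued messages. */
#define QCAP 8
typedef struct { void *item[QCAP]; int head, tail; } Queue;

static int   q_empty(Queue *q)          { return q->head == q->tail; }
static void  q_push(Queue *q, void *m)  { q->item[q->tail++ % QCAP] = m; }
static void *q_pop(Queue *q)            { return q->item[q->head++ % QCAP]; }

/* One iteration of the (simplified) scheduler loop.  Returns the
 * message chosen for delivery, or NULL if both queues are empty. */
static void *scheduler_step(Queue *network, Queue *local) {
    if (!q_empty(network)) return q_pop(network);  /* arrived messages win */
    if (!q_empty(local))   return q_pop(local);    /* then the local queue */
    return NULL;
}
```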

void CsdScheduleForever(void)
Extract and deliver messages until the scheduler is stopped. Raises the idle-handling Converse signals. This is the scheduler to use in most Converse programs.

int CsdScheduleCount(int n)
Extract and deliver messages until $n$ messages have been delivered, then return 0. If the scheduler is stopped early, return $n$ minus the number of messages delivered so far. Raises the idle-handling Converse signals.

void CsdSchedulePoll(void)
Extract and deliver messages until no more messages are available, then return. This is useful for running non-networking code when the networking code has nothing to do.

void CsdScheduler(int n)
If $n$ is zero, call CsdSchedulePoll. If $n$ is negative, call CsdScheduleForever. If $n$ is positive, call CsdScheduleCount( $n$ ).
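The dispatch rule of CsdScheduler can be exercised in isolation. In the sketch below the trailing-underscore names are stand-ins (counters in place of the real scheduler entry points), so only the documented branching on $n$ is being illustrated:

```c
#include <assert.h>

/* Counters standing in for the real entry points, so the dispatch
 * rule of CsdScheduler(n) can be tested by itself. */
static int polled, forever, counted;
static void CsdSchedulePoll_(void)    { polled++; }
static void CsdScheduleForever_(void) { forever++; }
static int  CsdScheduleCount_(int n)  { counted += n; return 0; }

/* Mirrors the documented behavior: 0 -> poll, negative -> forever,
 * positive -> count. */
static void CsdScheduler_(int n) {
    if (n == 0)     CsdSchedulePoll_();
    else if (n < 0) CsdScheduleForever_();
    else            (void)CsdScheduleCount_(n);
}
```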

int CmiDeliverMsgs(int MaxMsgs)
Retrieves messages from the network message queue and invokes the corresponding handler function for each arrived message. This function returns after either the network message queue becomes empty or after MaxMsgs messages have been retrieved and their handlers called. It returns the difference between the total number of messages delivered and MaxMsgs. The handler is given a pointer to the message as its parameter.

void CmiDeliverSpecificMsg(int HandlerId)
Retrieves messages from the network queue and delivers the first message whose handler field equals HandlerId. This function leaves all other messages alone. It returns after the invoked handler function returns.

void CsdExitScheduler(void)
This call causes CsdScheduler to stop processing messages when control returns to it. The scheduler then returns to its calling routine.
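The interaction between a handler and the scheduler loop can be modeled with a stop flag: the handler requests an exit, and the loop notices it only after the handler returns. All trailing-underscore names below are toy stand-ins, not the real Converse symbols:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of CsdExitScheduler: a handler sets a stop flag, and the
 * scheduler loop notices it once control returns from the handler. */
static int stop_requested;
static void CsdExitScheduler_(void) { stop_requested = 1; }

typedef void (*Handler)(void *);

/* Deliver up to max messages; stop early if a handler asked to exit.
 * Returns the number of messages actually delivered. */
static int run_scheduler_(Handler *msgs, int max) {
    int delivered = 0;
    while (delivered < max && !stop_requested) {
        msgs[delivered](NULL);   /* invoke the message's handler */
        delivered++;
    }
    return delivered;
}

static void ordinary_handler_(void *m) { (void)m; }
static void exit_handler_(void *m)     { (void)m; CsdExitScheduler_(); }
```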

2 . 11 The Timer

double CmiTimer(void)
Returns the current value of the timer in seconds. This is typically the time elapsed since the ConverseInit call. The precision of this timer is the best available on the particular machine, and it usually has at least microsecond accuracy.
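A typical use is to bracket a region of code with two timer calls and subtract. The stand-in below (an assumption, not the real implementation) builds such a timer on a POSIX monotonic clock, which is the kind of OS facility CmiTimer is commonly layered over:

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* A minimal stand-in for CmiTimer built on a POSIX monotonic clock;
 * the real implementation picks the best timer the machine offers. */
static double MyTimer_(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}
```

Usage follows the usual pattern: record `double t0 = MyTimer_();` before the region of interest, and compute `MyTimer_() - t0` afterward to get elapsed seconds.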

2 . 12 Processor Ids

int CmiNumPes(void)
Returns the total number of processors on which the parallel program is being run.

int CmiMyPe(void)
Returns the logical processor identifier of the processor on which the caller resides. Processor IDs range from 0 to CmiNumPes()-1 .

Also see the calls in Section  2.13.2 .


2 . 13 Global Variables and Utility functions

Different vendors are not consistent about how they treat global and static variables. Most vendors write C compilers in which global variables are shared among all the processors in the node. A few vendors write C compilers where each processor has its own copy of the global variables. In theory, it would also be possible to design the compiler so that each thread has its own copy of the global variables.

The lack of consistency across vendors makes it very hard to write a portable program. The fact that most vendors make globals shared is inconvenient as well: usually, you don't want your globals to be shared. For these reasons, we added ``pseudoglobals'' to Converse . These act much like C global and static variables, except that you have explicit control over the degree of sharing.

In this section we use the terms Node, PE, and User-level thread as they are used in Charm++, to refer to an OS process, a worker/communication thread, and a user-level thread, respectively. In the SMP mode of Charm++ all three of these are separate entities, whereas in non-SMP mode Node and PE have the same scope.

2 . 13 . 1 Converse PseudoGlobals

Three classes of pseudoglobal variables are supported: node-shared, processor-private, and thread-private variables.

Node-shared variables (Csv)
are specific to a node. They are shared among all the PEs within the node.
PE-private variables (Cpv)
are specific to a PE. They are shared by all the objects and Converse user-level threads on a PE.
Thread-private variables (Ctv)
are specific to a Converse user-level thread. They are truly private.

There are five macros for each class: declaration, static declaration, extern declaration, initialization, and access. The declaration, static, and extern specifications have the same meaning as in C. To support portability, however, the global variables must be installed properly by using the initialization macros. For example, if the underlying machine is a simulator for the machine model supported by Converse , then the thread-private variables must be turned into arrays of variables. The initialize and access macros hide these details from the user. It is possible to use global variables without these macros, as supported by the underlying machine, but at the expense of portability.

Macros for node-shared variables:

CsvDeclare(type,variable)
CsvStaticDeclare(type,variable)
CsvExtern(type,variable)
CsvInitialize(type,variable)
CsvAccess(variable)

Macros for PE-private variables:

CpvDeclare(type,variable)
CpvStaticDeclare(type,variable)
CpvExtern(type,variable)
CpvInitialize(type,variable)
CpvAccess(variable)

Macros for thread-private variables:

CtvDeclare(type,variable)
CtvStaticDeclare(type,variable)
CtvExtern(type,variable)
CtvInitialize(type,variable)
CtvAccess(variable)

Sample code illustrating the usage of these macros is provided in Figure  2.1 . There are a few rules the user must pay attention to. The type and variable fields of the macros must each be a single word; therefore, structure or pointer types can be used by defining new types with typedef . In the sample code, for example, a struct point type is redefined with a typedef as Point in order to use it in the macros. Similarly, the access macros contain only the name of the global variable. Any indexing or member access must be outside the macro, as shown in the sample code (function func1 ). Finally, all global variables must be installed before they are used. One way to do this systematically is to provide a module-init function for each file ( ModuleInit() in the sample code). The module-init functions of each file can then be called at the beginning of execution to complete the installation of all global variables.

Figure 2.1: An example code for global variable usage. (The listing, from file Module1.c, is not fully reproduced here; it defines a struct point typedef and accesses the pseudoglobal with statements such as CpvAccess(p).y = CpvAccess(p).x + 1; )
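In a non-SMP build the pseudoglobal macros reduce to thin wrappers around ordinary C globals. The definitions below are simplified stand-ins written for illustration only (the real Converse macros add per-PE and per-thread indirection); they show the declare/initialize/access pattern and the typedef rule for struct types:

```c
#include <assert.h>

/* Simplified stand-ins for the Cpv macros, roughly as they might
 * expand in a non-SMP build.  Not the real Converse definitions. */
#define CpvDeclare(type, var)     type CMI_##var
#define CpvExtern(type, var)      extern type CMI_##var
#define CpvInitialize(type, var)  ((void)0)  /* no-op in this toy build */
#define CpvAccess(var)            CMI_##var

/* The type field must be a single word, so struct types are wrapped
 * in a typedef first. */
typedef struct point { int x, y; } Point;

CpvDeclare(Point, p);
CpvDeclare(int, counter);

/* Module-init function: install the pseudoglobals before use. */
static void ModuleInit_(void) {
    CpvInitialize(Point, p);
    CpvInitialize(int, counter);
    CpvAccess(p).x = 0;
    CpvAccess(counter) = 0;
}

static void func1_(void) {
    /* member access stays outside the macro */
    CpvAccess(p).y = CpvAccess(p).x + 1;
    CpvAccess(counter)++;
}
```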


2 . 13 . 2 Utility Functions

To further simplify programming with global variables on shared-memory machines, Converse provides the following functions and/or macros. (Note: these functions are also defined on machines other than shared-memory machines, where they behave as if there were exactly one processor per node and one thread per processor.)

int CmiMyNode()
Returns the node number to which the calling processor belongs.

int CmiNumNodes()
Returns the number of nodes in the system. Note that this is not the same as CmiNumPes() .

int CmiMyRank()
Returns the rank of the calling processor within a shared memory node.

int CmiNodeFirst(int node)
Returns the processor number of the lowest-ranked processor on node node .

int CmiNodeSize(int node)
Returns the number of processors that belong to the node node .

int CmiNodeOf(int pe)
Returns the node number to which processor pe belongs. Indeed, CmiMyNode() is a utility macro that is aliased to CmiNodeOf(CmiMyPe()) .

int CmiRankOf(int pe)
Returns the rank of processor pe in the node to which it belongs.
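For the common case of equally sized nodes, the relationships among these calls reduce to simple arithmetic. The helpers below assume a hypothetical uniform layout (a fixed node size) and are not the real Converse functions; they only illustrate how node, rank, and PE numbers relate:

```c
#include <assert.h>

/* Hypothetical uniform layout: every node holds NODESIZE PEs.
 * These mirror the documented relationships among the node-query
 * calls; real machines may have non-uniform nodes. */
enum { NODESIZE = 4 };

static int NodeOf_(int pe)      { return pe / NODESIZE; }  /* CmiNodeOf    */
static int RankOf_(int pe)      { return pe % NODESIZE; }  /* CmiRankOf    */
static int NodeFirst_(int node) { return node * NODESIZE; } /* CmiNodeFirst */
static int NodeSize_(int node)  { (void)node; return NODESIZE; }
```

Note the invariant that ties the calls together: a PE number decomposes as `NodeFirst_(NodeOf_(pe)) + RankOf_(pe)`.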


2 . 13 . 3 Node-level Locks and other Synchronization Mechanisms

void CmiNodeBarrier()
Provides barrier synchronization at the node level, i.e., all processors belonging to the node participate in this barrier.

typedef McDependentType CmiNodeLock
This is the type for all the node-level locks in Converse .

CmiNodeLock CmiCreateLock(void)
Creates, initializes and returns a new lock. Initially the lock is unlocked.

void CmiLock(CmiNodeLock lock)
Locks lock . If the lock has been locked by another processor, waits for lock to be unlocked.

void CmiUnlock(CmiNodeLock lock)
Unlocks lock . Processors waiting for the lock can then compete to acquire lock .

int CmiTryLock(CmiNodeLock lock)
Tries to lock lock . If it succeeds in locking, it returns 0. If any other processor has already acquired the lock, it returns 1.

void CmiDestroyLock(CmiNodeLock lock)
Frees any memory associated with lock . It is an error to perform any operations with lock after a call to this function.
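On pthreads-based builds a node lock is commonly a mutex underneath; as an assumption for illustration, the sketch below mimics the calling pattern of the lock API over pthread_mutex_t (trailing-underscore names are stand-ins, not the Converse functions), including the 0-on-success convention of CmiTryLock:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* A sketch of the node-lock API over POSIX mutexes; the real
 * CmiNodeLock implementation is machine-dependent. */
typedef pthread_mutex_t *NodeLock_;

static NodeLock_ CreateLock_(void) {
    NodeLock_ l = malloc(sizeof *l);
    pthread_mutex_init(l, NULL);
    return l;                         /* starts out unlocked */
}
static void Lock_(NodeLock_ l)    { pthread_mutex_lock(l); }
static void Unlock_(NodeLock_ l)  { pthread_mutex_unlock(l); }
/* Returns 0 on success, 1 if the lock is already held. */
static int  TryLock_(NodeLock_ l) { return pthread_mutex_trylock(l) ? 1 : 0; }
static void DestroyLock_(NodeLock_ l) { pthread_mutex_destroy(l); free(l); }
```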

2 . 14 Input/Output

void CmiPrintf(char *format, arg1, arg2, ...)
This function does an atomic printf() on stdout . On machines with a host, this is implemented on top of the messaging layer using asynchronous sends.

int CmiScanf(char *format, void *arg1, void *arg2, ...)
This function performs an atomic scanf from stdin . The processor on which the caller resides blocks for input. On machines with a host, this is implemented on top of the messaging layer using asynchronous send and blocking receive.

void CmiError(char *format, arg1, arg2, ...)
This function does an atomic printf() on stderr . On machines with a host, this is implemented on top of the messaging layer using asynchronous sends.

2 . 15 Spanning Tree Calls

Sometimes it is convenient to view the processors/nodes of the machine as a tree. For this purpose, Converse defines a tree over processors/nodes and provides functions to obtain the parent and children of each processor/node. On machines where the communication topology is relevant, we arrange the tree to optimize communication performance. The root of the spanning tree (processor-based or node-based) is always 0; thus the CmiSpanTreeRoot call has been eliminated.

int CmiSpanTreeParent(int procNum)
This function returns the processor number of the parent of procNum in the spanning tree.

int CmiNumSpanTreeChildren(int procNum)
Returns the number of children of procNum in the spanning tree.

void CmiSpanTreeChildren(int procNum, int *children)
This function fills the array children with processor numbers of children of procNum in the spanning tree.

int CmiNodeSpanTreeParent(int nodeNum)
This function returns the node number of the parent of nodeNum in the spanning tree.

int CmiNumNodeSpanTreeChildren(int nodeNum)
Returns the number of children of nodeNum in the spanning tree.

void CmiNodeSpanTreeChildren(int nodeNum, int *children)
This function fills the array children with node numbers of children of nodeNum in the spanning tree.
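The exact tree shape is machine-dependent, but a common choice is a k-ary tree rooted at 0. The self-contained sketch below uses a fixed, hypothetical branching factor (not the tuned Converse tree) to show the parent/children relationships these calls expose:

```c
#include <assert.h>

/* A k-ary spanning tree over processors 0..numPes-1, rooted at 0.
 * Converse may shape the real tree to match the network topology;
 * this fixed-arity version is only an illustration. */
enum { ARITY = 4 };

/* Parent of p, or -1 for the root. */
static int SpanTreeParent_(int p) { return p == 0 ? -1 : (p - 1) / ARITY; }

static int NumSpanTreeChildren_(int p, int numPes) {
    int first = p * ARITY + 1, n = 0;
    for (int c = first; c < first + ARITY && c < numPes; c++) n++;
    return n;
}

/* Fills children[] with the child processor numbers of p. */
static void SpanTreeChildren_(int p, int numPes, int *children) {
    int first = p * ARITY + 1, n = 0;
    for (int c = first; c < first + ARITY && c < numPes; c++)
        children[n++] = c;
}
```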

2 . 16 Isomalloc

It is occasionally useful to allocate memory at a globally unique virtual address. This is trivial on a shared memory machine (where every address is globally unique); but more difficult on a distributed memory machine (where each node has its own separate data at address 0x80000000). Isomalloc provides a uniform interface for allocating globally unique virtual addresses.

Isomalloc can thus be thought of as a software distributed shared memory implementation, except that data movement between processors is explicit (by making a subroutine call) rather than on demand (by taking a page fault).

Isomalloc is useful when moving highly interlinked data structures from one processor to another, because internal pointers will still point to the correct locations, even on a new processor. This is especially useful when the format of the data structure is complex or unknown, as with thread stacks.

void *CmiIsomalloc(int size)
Allocate size bytes at a unique virtual address. Returns a pointer to the allocated region.

CmiIsomalloc makes allocations with page granularity (typically several kilobytes); so it is not recommended for small allocations.

void CmiIsomallocFree(void *doomedBlock)
Release the given block, which must have been previously returned by CmiIsomalloc. Also releases the used virtual address range, which the system may subsequently reuse.

After a CmiIsomallocFree, references to that block will likely result in a segmentation violation. It is illegal to call CmiIsomallocFree more than once on the same block.

void CmiIsomallocPup(pup_er p,void **block)
Pack/Unpack the given block. This routine can be used to move blocks across processors, save blocks to disk, or checkpoint blocks.

After unpacking, the pointer is guaranteed to have the same value that it did before packing.

Note: use of this function to pup individual blocks is no longer supported. All blocks allocated via CmiIsomalloc are pupped by the RTS as one single unit.

int CmiIsomallocLength(void *block);
Return the length, in bytes, of this isomalloc'd block.

int CmiIsomallocInRange(void *address)
Return 1 if the given address may have been previously allocated to this processor using Isomalloc; 0 otherwise. CmiIsomallocInRange(malloc(size)) is guaranteed to be zero; CmiIsomallocInRange(CmiIsomalloc(size)) is guaranteed to be one.
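The key idea behind Isomalloc can be modeled with address arithmetic alone: the global virtual range is split into per-processor slices, so an address handed out on one PE can never collide with one handed out on another. The toy model below (pure numbers, no real memory mapping; the constants are assumptions for illustration) captures that uniqueness property:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of Isomalloc's address partitioning: each PE owns a
 * fixed slice of a shared virtual range and bump-allocates within
 * it.  Real Isomalloc reserves and maps actual virtual memory with
 * page granularity; these are just numbers. */
enum { NUM_PES = 4 };
#define SLICE_BASE UINT64_C(0x100000000)   /* hypothetical range start */
#define SLICE_SIZE UINT64_C(0x010000000)   /* 256 MB slice per PE */

static uint64_t next_free[NUM_PES];        /* bump pointer per PE */

/* Hand out the next address in this PE's slice; unique across all
 * PEs by construction. */
static uint64_t IsomallocModel_(int pe, uint64_t size) {
    uint64_t base = SLICE_BASE + (uint64_t)pe * SLICE_SIZE;
    if (next_free[pe] == 0) next_free[pe] = base;
    uint64_t addr = next_free[pe];
    next_free[pe] += size;
    return addr;
}

/* Analogue of CmiIsomallocInRange for this PE's slice. */
static int InRangeModel_(int pe, uint64_t addr) {
    uint64_t base = SLICE_BASE + (uint64_t)pe * SLICE_SIZE;
    return addr >= base && addr < base + SLICE_SIZE;
}
```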