6. Serialization Using the PUP Framework

The PUP (Pack/Unpack) framework is a generic way to describe the data in an object and to use that description for serialization. The Charm++ system can use this description to pack the object into a message and unpack the message into a new object on another processor, or to pack and unpack migratable objects for load balancing or checkpoint/restart-based fault tolerance. The PUP framework also includes special support for STL containers to ease development in C++. Like many C++ concepts, the PUP framework is easier to use than to describe:

class foo : public mySuperclass {
 private:
    double a;
    int x;
    char y;
    unsigned long z;
    float arr[3];
 public:
    ...other methods...

    //pack/unpack method: describe my fields to charm++
    void pup(PUP::er &p) {
      mySuperclass::pup(p); //pup the superclass first
      p|a;
      p|x; p|y; p|z;
      PUParray(p,arr,3);
    }
};

This class's pup method describes the fields of the class to Charm++. This allows Charm++ to marshall parameters of type foo across processors, translate foo objects across processor architectures, read and write foo objects to files on disk, inspect and modify foo objects in the debugger, and checkpoint and restart programs involving foo objects.
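To see why a single field description is enough for both saving and restoring, consider the following minimal, self-contained sketch in standard C++. MiniPuper is a hypothetical stand-in for PUP::er written only for this illustration; it is not the Charm++ class, and it handles only built-in arithmetic types. Foo and roundTrip are likewise illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for PUP::er (NOT the Charm++ class).
class MiniPuper {
 public:
  enum Mode { Sizing, Packing, Unpacking };
  explicit MiniPuper(Mode m, std::vector<char>* buf = nullptr)
      : mode_(m), buf_(buf) {}
  bool isUnpacking() const { return mode_ == Unpacking; }
  std::size_t size() const { return pos_; }  // bytes seen so far

  // Count, append, or read back n raw bytes, depending on the mode.
  void bytes(void* data, std::size_t n) {
    if (mode_ == Packing) {
      assert(buf_ != nullptr);
      const char* c = static_cast<const char*>(data);
      buf_->insert(buf_->end(), c, c + n);
    } else if (mode_ == Unpacking) {
      assert(buf_ != nullptr && pos_ + n <= buf_->size());
      std::memcpy(data, buf_->data() + pos_, n);
    }
    pos_ += n;
  }

 private:
  Mode mode_;
  std::vector<char>* buf_;
  std::size_t pos_ = 0;
};

// p|v for built-in arithmetic types only.
template <class T>
typename std::enable_if<std::is_arithmetic<T>::value>::type operator|(
    MiniPuper& p, T& v) {
  p.bytes(&v, sizeof(T));
}

struct Foo {
  double a = 0;
  int x = 0;
  char y = 0;
  float arr[3] = {0, 0, 0};

  // One description of the fields serves all three modes.
  void pup(MiniPuper& p) {
    p | a;
    p | x;
    p | y;
    for (float& f : arr) p | f;
  }
};

// Size, pack, then unpack into a fresh object, roughly as a runtime might.
Foo roundTrip(Foo& src, std::size_t* sizedBytes, std::size_t* packedBytes) {
  MiniPuper sizer(MiniPuper::Sizing);
  src.pup(sizer);
  *sizedBytes = sizer.size();

  std::vector<char> buf;
  MiniPuper packer(MiniPuper::Packing, &buf);
  src.pup(packer);
  *packedBytes = buf.size();

  Foo dst;
  MiniPuper unpacker(MiniPuper::Unpacking, &buf);
  dst.pup(unpacker);
  return dst;
}
```

Because Foo::pup never inspects the mode, the sizing and packing passes necessarily agree on the byte count, and the unpacking pass reads the fields back in the order they were written.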

6.1 PUP contract

Your object's pup method must save and restore all your object's data. As shown, you save and restore a class's contents by writing a method called ``pup'' which passes all the parts of the class to an object of type PUP::er, which does the saving or restoring. This manual will often use ``pup'' as a verb, meaning ``to save/restore the value of'' or equivalently, ``to call the pup method of''.

Pup methods for complicated objects normally call the pup methods for their simpler parts. Since all objects depend on their immediate superclass, the first line of every pup method is a call to the superclass's pup method--the only time you shouldn't call your superclass's pup method is when you don't have a superclass. If your superclass has no pup method, you must pup the values in the superclass yourself.

6.1.1 PUP operator

The recommended way to pup any object a is to use p|a;. This syntax is an operator | applied to the PUP::er p and the user variable a.

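The p|a; syntax works, among other cases, wherever a is a built-in arithmetic type such as int or double, an object with a pup method, an STL container once pup_stl.h is included (see section 6.1.2), an object for which an operator| has been defined (see section 6.1.3), or a type declared as PUPbytes (see section 6.1.4).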

See examples/charm++/PUP

For container types, you must simply pup each element of the container. For arrays, you can use the utility method PUParray, which takes the PUP::er, the array base pointer, and the array length. This utility method is defined for user-defined types T as:

    template<class T>
    inline void PUParray(PUP::er &p,T *array,int length) {
       for (int i=0;i<length;i++) p|array[i];
    }

6.1.2 PUP STL Container Objects

If the variable is from the C++ Standard Template Library, you can include operator|'s for STL containers such as vector, map, set, list, pair, and string, templated on anything, by including the header ``pup_stl.h''.

See examples/charm++/PUP/STLPUP

6.1.3 PUP Dynamic Data

As usual in C++, pointers and allocatable objects require special handling. Typically this only requires a p.isUnpacking() conditional block, where you perform the appropriate allocation. See Section 18.1 for more information and examples.

If the object does not have a pup method, and you cannot add one or use PUPbytes, you can define an operator| to pup the object. For example, if myClass contains two fields a and b, the operator| might look like:

  inline void operator|(PUP::er &p,myClass &c) { p|c.a; p|c.b; }

See examples/charm++/PUP/HeapPUP

6.1.4 PUP as bytes

For classes and structs with many fields, it can be tedious and error-prone to list all the fields in the pup method. You can avoid this listing in two ways, as long as the object can be safely copied as raw bytes--this is normally the case for simple structs and classes without pointers.

Note that pupping as bytes is just like using `memcpy': it does nothing to the data other than copy it whole. For example, if the class contains any pointers, you must make sure to do any allocation needed, and pup the referenced data yourself.
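The memcpy analogy can be made concrete in plain C++, independent of Charm++. The sketch below uses hypothetical types (Particle, Owner) and a hypothetical helper (copyAsBytes) to show that a raw byte copy reproduces a pointer-free struct faithfully, while for a struct holding a pointer it duplicates only the address, so the copy aliases the original's storage.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical pointer-free struct: safe to copy as raw bytes.
struct Particle {
  double pos[3];
  int id;
};

// Hypothetical struct holding a pointer: a byte copy duplicates the
// address only, not the data it refers to.
struct Owner {
  int n;
  int* data;
};

// Copy an object the way a byte-wise pup would: sizeof(T) raw bytes.
template <class T>
T copyAsBytes(const T& src) {
  static_assert(std::is_trivially_copyable<T>::value,
                "raw byte copies need a trivially copyable type");
  T dst;
  std::memcpy(&dst, &src, sizeof(T));
  return dst;
}

// Usage: the copy of Owner shares storage with the original.
inline void aliasingDemo() {
  int storage[2] = {5, 6};
  Owner o{2, storage};
  Owner c = copyAsBytes(o);
  assert(c.n == 2);
  assert(c.data == o.data);  // same address: the ints were not copied
}
```

On another processor the copied address would be meaningless, which is why the referenced data has to be allocated and pupped separately.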

Pupping as bytes may prevent your pup method from ever being able to work across different machine architectures. This is currently an uncommon scenario, but heterogeneous architectures may become more common, so pupping as bytes is discouraged.

6.1.5 PUP overhead

The PUP::er overhead is very small--one virtual function call for each item or array to be packed/unpacked. The actual packing/unpacking is normally a simple memory-to-memory binary copy.

For arrays and vectors of builtin arithmetic types like ``int'' and ``double'', or of types declared as ``PUPbytes'', PUParray uses an even faster block transfer, with one virtual function call per array or vector.

Thus, if an object does not contain pointers, you should prefer declaring it as PUPbytes.

For types whose default constructors do more work than is necessary when an object is about to be unpacked from PUP, it is possible to tell the runtime system to call a more minimal alternative. This can apply to types used both as member variables of chares and as marshalled arguments to entry methods. A non-chare class can define a constructor that takes an argument of type PUP::reconstruct for this purpose. The runtime system code will call a PUP::reconstruct constructor in preference to a default constructor when one is available. Where necessary, constructors taking PUP::reconstruct should call the constructors of member variables with PUP::reconstruct if applicable to that member.
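The preference for a reconstruct constructor can be pictured with ordinary tag dispatch. The following sketch is standard C++ only and does not use Charm++: ReconstructTag is a hypothetical stand-in for PUP::reconstruct, makeForUnpack is a hypothetical helper that mimics the selection rule described above, and Table, Plain, and Holder are illustrative types. How the runtime actually performs the selection is not shown here.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tag type standing in for PUP::reconstruct.
struct ReconstructTag {};

// Hypothetical type whose default constructor does full setup.
struct Table {
  int initialized;
  Table() : initialized(1) {}                         // full setup
  explicit Table(ReconstructTag) : initialized(0) {}  // minimal; pup fills in state
};

// Hypothetical type with only a default constructor.
struct Plain {
  int initialized;
  Plain() : initialized(1) {}
};

// A tag constructor forwards the tag to members that accept it.
struct Holder {
  Table t;
  Holder() {}
  explicit Holder(ReconstructTag tag) : t(tag) {}
};

// Prefer T(ReconstructTag) when it exists...
template <class T>
typename std::enable_if<std::is_constructible<T, ReconstructTag>::value, T>::type
makeForUnpack() {
  return T(ReconstructTag{});
}

// ...and fall back to the default constructor otherwise.
template <class T>
typename std::enable_if<!std::is_constructible<T, ReconstructTag>::value, T>::type
makeForUnpack() {
  return T();
}

// Usage: which constructor ran for each type.
inline void reconstructDemo() {
  assert(makeForUnpack<Table>().initialized == 0);     // tag constructor
  assert(makeForUnpack<Plain>().initialized == 1);     // default constructor
  assert(makeForUnpack<Holder>().t.initialized == 0);  // tag forwarded to member
}
```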

6.1.6 PUP modes

Charm++ uses your pup method to both pack and unpack, by passing different types of PUP::ers to it. The method p.isUnpacking() returns true if your object is being unpacked--that is, your object's values are being restored. Your pup method must work properly in sizing, packing, and unpacking modes; and to save and restore properly, the same fields must be passed to the PUP::er, in the exact same order, in all modes. This means most pup methods can ignore the pup mode.

Three modes are used, with three separate types of PUP::er: sizing, which only computes the size of your data without modifying it; packing, which reads/saves values out of your data; and unpacking, which writes/restores values into your data. You can determine exactly which type of PUP::er was passed to you using the p.isSizing(), p.isPacking(), and p.isUnpacking() methods. However, sizing and packing should almost always be handled identically, so most programs should use p.isUnpacking() and !p.isUnpacking(). Any program that calls p.isPacking() and does not also call p.isSizing() is probably buggy, because sizing and packing must see exactly the same data.
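The rule that only unpacking needs special treatment can be sketched with a class that owns a heap array. As before, MiniPuper is a hypothetical stand-in for PUP::er written for this illustration (not the Charm++ class), and Samples is an illustrative type. The pup method branches on isUnpacking() alone, so sizing and packing traverse exactly the same data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for PUP::er (NOT the Charm++ class).
class MiniPuper {
 public:
  enum Mode { Sizing, Packing, Unpacking };
  explicit MiniPuper(Mode m, std::vector<char>* buf = nullptr)
      : mode_(m), buf_(buf) {}
  bool isSizing() const { return mode_ == Sizing; }
  bool isPacking() const { return mode_ == Packing; }
  bool isUnpacking() const { return mode_ == Unpacking; }
  std::size_t size() const { return pos_; }  // bytes seen so far

  // Count, append, or read back n raw bytes, depending on the mode.
  void bytes(void* data, std::size_t n) {
    if (mode_ == Packing) {
      assert(buf_ != nullptr);
      const char* c = static_cast<const char*>(data);
      buf_->insert(buf_->end(), c, c + n);
    } else if (mode_ == Unpacking) {
      assert(buf_ != nullptr && pos_ + n <= buf_->size());
      std::memcpy(data, buf_->data() + pos_, n);
    }
    pos_ += n;
  }

 private:
  Mode mode_;
  std::vector<char>* buf_;
  std::size_t pos_ = 0;
};

// Hypothetical class owning a heap array.
struct Samples {
  int count;
  double* values;

  Samples() : count(0), values(nullptr) {}
  explicit Samples(int n) : count(n), values(new double[n]()) {}
  ~Samples() { delete[] values; }
  Samples(const Samples&) = delete;
  Samples& operator=(const Samples&) = delete;

  void pup(MiniPuper& p) {
    p.bytes(&count, sizeof count);  // length first, in every mode
    if (p.isUnpacking()) {          // the only mode-dependent step
      delete[] values;
      values = new double[count];
    }
    p.bytes(values, count * sizeof(double));  // same fields, same order
  }
};
```

Had the method branched on isPacking() instead, the sizing pass would have taken a different path from the packing pass, and the computed size could disagree with the bytes actually written.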

The p.isDeleting() flag indicates the object will be deleted after calling the pup method. This is normally only needed for pup methods called via the C or f90 interface, as provided by AMPI or the other frameworks. Other Charm++ array elements, marshalled parameters, and other C++ interface objects have their destructor called when they are deleted, so the p.isDeleting() call is not normally required--instead, memory should be deallocated in the destructor as usual.

More specialized modes and PUP::ers are described in section 18.4.

6.2 PUP Usage Sequence

Figure 6.1: Method sequence of an object with a pup method.

The typical method invocation sequence of an object with a pup method is shown in Figure 6.1. As usual in C++, objects are constructed, do some processing, and are then destroyed.

Objects can be created in one of two ways: they can be created using a normal constructor as usual; or they can be created using their pup constructor. The pup constructor for Charm++ array elements and PUP::able objects is a ``migration constructor'' that takes a single ``CkMigrateMessage *''; for other objects, such as parameter marshalled objects, the pup constructor has no parameters. The pup constructor is always followed by a call to the object's pup method in isUnpacking mode.

Once objects are created, they respond to regular user methods and remote entry methods as usual. At any time, the object's pup method can be called in isSizing or isPacking mode. User methods and sizing or packing pup methods can be called repeatedly over the object's lifetime.

Finally, objects are destroyed by calling their destructor as usual.

6.3 Migratable Array Elements using PUP

Array objects can migrate from one PE to another. For example, the load balancer (see section 7.1) might migrate array elements to better balance the load between processors. For an array element to be migratable, it must implement a pup method. The standard PUP contract (see section 6.1) and its constraints on serializing data apply. The one exception for chare, group, and node group types is that, since the runtime system invokes their PUP routines, it automatically calls PUP on the generated CBase_ superclasses, so users do not need to call PUP on generated superclasses.

A simple example for an array follows:

//In the .h file:

class A2 : public CBase_A2 {

private: //My data members:
    int nt;
    unsigned char chr;
    float flt[7];
    int numDbl;
    double *dbl;

public:
    //...other declarations

    virtual void pup(PUP::er &p);
};

//In the .C file:

void A2::pup(PUP::er &p)
{
    // The runtime will automatically call CBase_A2::pup()
    p|nt;
    p|chr;
    PUParray(p,flt,7);
    p|numDbl;
    // numDbl must be restored before dbl can be allocated
    if (p.isUnpacking()) dbl=new double[numDbl];
    PUParray(p,dbl,numDbl);
}

The default assumption, as used in the example above, for the object state at PUP time is that a chare, and its member objects, could be migrated at any time while it is inactive, i.e. not executing an entry method. Actual migration time can be controlled (see section 7.1) to be less frequent. If migration timing is fully user controlled, e.g., at the end of a synchronized load balancing step, then the PUP implementation can be simplified to transport only the ``live'' ephemeral data matching the object state that coincides with migration. More intricate state-based pupping, for objects whose memory footprint varies substantially with computation phase, can be handled by explicitly maintaining the object's phase in a member variable and implementing phase-conditional logic in the PUP method (see section 18.1).

6.4 Marshalling User Defined Data Types via PUP

Parameter marshalling requires serialization and is therefore implemented using the PUP framework. User defined data types passed as parameters must abide by the standard PUP contract (see section 6.1).

A simple example of using PUP to marshall user defined data types follows:

class Buffer {
public:
//...other declarations
  void pup(PUP::er &p) {
    // remember to pup your superclass if there is one
    p|size; // size must be restored before data is allocated
    if (p.isUnpacking())
      data = new int[size];
    PUParray(p, data, size);
  }

private:
  int size;
  int *data;
};

// In some .ci file

entry void process(Buffer &buf);

For efficiency, arrays are always copied as blocks of bytes and passed via pointers. This means classes that need their pup routines to be called, such as those with dynamically allocated data or virtual methods, cannot be passed as arrays--use STL vectors to pass lists of complicated user-defined classes. For historical reasons, pointer-accessible structures cannot appear alone in the parameter list (because they are confused with messages).

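The order of marshalling operations on the send side is: call p|a on each marshalled parameter with a sizing PUP::er; compute the lengths of each array; call p|a on each marshalled parameter with a packing PUP::er; and memcpy each array's data.

The order of marshalling operations on the receive side is: create an instance of each marshalled parameter using its default constructor; call p|a on each marshalled parameter using an unpacking PUP::er; and compute pointers into the message for each array.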

Finally, very large structures are most efficiently passed via messages, because messages are an efficient, low-level construct that minimizes copying and overhead; but very complicated structures are often most easily passed via marshalling, because marshalling uses the high-level pup framework.

See examples/charm++/PUP/HeapPUP