Feature #387

Pack variable-envelope structures to reduce overhead bytes

Added by Phil Miller over 5 years ago. Updated about 1 year ago.

Status: New
Priority: Low
Assignee: -
Category: -
Target version: -
Start date: 12/30/2013
Due date:
% Done: 0%


Description

Currently, the various variable-envelope structures (s_array, s_objid, s_group, etc.) are declared such that the compiler pads them at the end so that they meet their individual alignment requirements when allocated as part of an array. These objects will never be part of an array, though, so they don't need that padding. Right now, the padding is included in every message we send on the wire, since it's counted in the value of sizeof(s_foo) used in message-size calculations. If we enable #pragma pack for these structures, we can trim some bytes from just about every message we send.
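To make the tail-padding point concrete, here is a minimal sketch. The two structures are hypothetical stand-ins whose fields only loosely mirror s_array; they are not the real declarations, and the saving shown depends on the target ABI and on the compiler honoring `#pragma pack`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a variable-envelope structure (NOT the real s_array).
struct PaddedArrayInfo {
  std::uint64_t id;          // 8 bytes
  std::int32_t  arr;         // 4 bytes
  std::uint8_t  hopCount;    // 1 byte
  std::uint8_t  ifNotThere;  // 1 byte
  // The compiler normally adds tail padding here so that the next element
  // of a PaddedArrayInfo[] array would start on a suitably aligned address.
};

// Same members, but declared under #pragma pack so no padding is added.
#pragma pack(push, 1)
struct PackedArrayInfo {
  std::uint64_t id;
  std::int32_t  arr;
  std::uint8_t  hopCount;
  std::uint8_t  ifNotThere;
};
#pragma pack(pop)

// Bytes actually occupied by the members themselves.
constexpr std::size_t kPayloadBytes =
    sizeof(std::uint64_t) + sizeof(std::int32_t) + 2 * sizeof(std::uint8_t);

// Bytes that would disappear from every message-size calculation based on sizeof.
constexpr std::size_t bytes_saved() {
  return sizeof(PaddedArrayInfo) - sizeof(PackedArrayInfo);
}

// Packing may leave the size unchanged (no compiler support) but must never grow it.
static_assert(sizeof(PackedArrayInfo) <= sizeof(PaddedArrayInfo),
              "packing must not increase the structure size");
```

Note that pack(1) also lowers the structure's alignment requirement to 1, so this only suits objects that are never accessed through code assuming natural alignment.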

History

#1 Updated by Phil Miller over 5 years ago

The primary barrier to this may well be compiler support. #pragma pack is documented for at least GCC, Clang, MSVC++, and XLC. We'll definitely need to test ICC, PGCC, and Cray. Even without compiler support, the code should still work fine with such a declaration, just without the optimization benefit. The thing to be cautious of is that some compilers have been known to have bugs in their implementation of #pragma pack, though these usually involve the sort of vectorized array access that's irrelevant here.

#2 Updated by Phil Miller over 5 years ago

ICC 13.1 won't pack the objid type defined in #170, complaining (in a warning on each compilation unit) that it's non-POD. ICC 14.0 handles it correctly, and tests run successfully.

#3 Updated by Lukasz Wesolowski over 5 years ago

We currently rely on a constant-sized envelope in several places. For example, how would pointer manipulations such as in UsrToEnv work with this approach?

#4 Updated by Phil Miller over 5 years ago

The packing pragma is a static compile-time change in the size of the objects. For every declared instance of affected types, it determines whether there will be suitable end-of-object padding to align a subsequent instance of the same type in an array. So, the envelope would be a constant size everywhere, but that size might end up being smaller.
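A minimal sketch of why header/payload pointer arithmetic is unaffected: the packed size is still a compile-time constant, so the offset between header and user data is fixed. `ToyEnvelope`, `toy_env_to_usr`, and `toy_usr_to_env` are hypothetical names for illustration only and do not reproduce the real envelope or UsrToEnv.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)
struct PackedType {  // hypothetical packed sub-structure
  std::uint64_t id;
  std::int32_t  arr;
  std::uint8_t  hopCount;
  std::uint8_t  ifNotThere;
};
#pragma pack(pop)

// Hypothetical, much-simplified header (NOT the real envelope).
struct ToyEnvelope {
  std::uint32_t totalsize;
  PackedType    type;
};

// Known at compile time, whether or not packing shrank the structure.
constexpr std::size_t kHeaderBytes = sizeof(ToyEnvelope);

// The user payload starts a fixed number of bytes after the header...
inline void* toy_env_to_usr(ToyEnvelope* env) {
  return reinterpret_cast<char*>(env) + kHeaderBytes;
}

// ...so stepping back by the same constant recovers the header.
inline ToyEnvelope* toy_usr_to_env(void* usr) {
  return reinterpret_cast<ToyEnvelope*>(static_cast<char*>(usr) - kHeaderBytes);
}

// Only compares addresses; the buffer contents are never read as a ToyEnvelope.
inline bool round_trip_ok() {
  alignas(ToyEnvelope) unsigned char buffer[sizeof(ToyEnvelope) + 32] = {};
  ToyEnvelope* env = reinterpret_cast<ToyEnvelope*>(buffer);
  return toy_usr_to_env(toy_env_to_usr(env)) == env;
}

inline void check_round_trip() {
  const bool ok = round_trip_ok();
  assert(ok);
  (void)ok;  // avoid an unused-variable warning when NDEBUG is defined
}
```

A real header would additionally have to keep the user payload suitably aligned; this sketch deliberately ignores that.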

#5 Updated by Phil Miller over 5 years ago

  • Target version set to 6.7.0

#6 Updated by Nikhil Jain over 3 years ago

  • Target version changed from 6.7.0 to 6.8.0

#7 Updated by Sam White almost 3 years ago

Should we revisit compiler support for this now that 64-bit ID is just about ready to be merged?

#8 Updated by Phil Miller almost 3 years ago

We should revisit the size and layout of the envelope with this kind of change in mind. I'd suggest using the pahole utility ('poke a hole') that shows detailed structure layout including padding based on debug information.

#9 Updated by Phil Miller over 2 years ago

  • Target version changed from 6.8.0 to 6.9.0

#10 Updated by Phil Miller over 1 year ago

  • Target version deleted (6.9.0)

#11 Updated by Phil Miller over 1 year ago

  • Assignee deleted (Phil Miller)
  • Priority changed from Normal to Low

#12 Updated by Sam White about 1 year ago

Just by reordering one member I got the size of the Charm message envelope to shrink from 80 to 64 bytes. Details are here: https://charm.cs.illinois.edu/gerrit/#/c/charm/+/3872/

Before:

*** Dumping AST Record Layout
         0 | class envelope
         0 |   char [28] core
        32 |   union ck::impl::u_type type
        32 |     struct ck::impl::u_type::s_chare chare
        32 |       void * ptr
        40 |       UInt forAnyPe
        44 |       int bype
        32 |     struct ck::impl::u_type::s_group group
        32 |       struct _ckGroupID g
        32 |         int idx
        36 |       struct _ckGroupID rednMgr
        36 |         int idx
        40 |       int epoch
        44 |       UShort arrayEp
        32 |     struct ck::impl::u_type::s_array array
        32 |       CmiUInt8 id
        40 |       struct _ckGroupID arr
        40 |         int idx
        44 |       UChar hopCount
        45 |       UChar ifNotThere
        32 |     struct ck::impl::u_type::s_roData roData
        32 |       UInt count
        32 |     struct ck::impl::u_type::s_roMsg roMsg
        32 |       UInt roIdx
        48 |   UInt pe
        52 |   UInt totalsize
        56 |   unsigned short ref
        58 |   UShort priobits
        60 |   UShort groupDepNum
        62 |   UShort epIdx
        64 |   struct ck::impl::s_attribs attribs
        64 |     UChar msgIdx
        65 |     UChar mtype
    66:0-3 |     UChar queueing
    66:4-4 |     UChar isPacked
    66:5-5 |     UChar isUsed
    66:6-6 |     UChar isRdma
    66:7-7 |     UChar isVarSysMsg
        67 |   UChar [8] align
           | [sizeof=80, dsize=75, align=8,
           |  nvsize=75, nvalign=8]

After:

*** Dumping AST Record Layout
         0 | class envelope
         0 |   char [28] core
        28 |   UInt pe
        32 |   union ck::impl::u_type type
        32 |     struct ck::impl::u_type::s_chare chare
        32 |       void * ptr
        40 |       UInt forAnyPe
        44 |       int bype
        32 |     struct ck::impl::u_type::s_group group
        32 |       struct _ckGroupID g
        32 |         int idx
        36 |       struct _ckGroupID rednMgr
        36 |         int idx
        40 |       int epoch
        44 |       UShort arrayEp
        32 |     struct ck::impl::u_type::s_array array
        32 |       CmiUInt8 id
        40 |       struct _ckGroupID arr
        40 |         int idx
        44 |       UChar hopCount
        45 |       UChar ifNotThere
        32 |     struct ck::impl::u_type::s_roData roData
        32 |       UInt count
        32 |     struct ck::impl::u_type::s_roMsg roMsg
        32 |       UInt roIdx
        48 |   UInt totalsize
        52 |   unsigned short ref
        54 |   UShort priobits
        56 |   UShort groupDepNum
        58 |   UShort epIdx
        60 |   struct ck::impl::s_attribs attribs
        60 |     UChar msgIdx
        61 |     UChar mtype
    62:0-3 |     UChar queueing
    62:4-4 |     UChar isPacked
    62:5-5 |     UChar isUsed
    62:6-6 |     UChar isRdma
    62:7-7 |     UChar isVarSysMsg
        63 |   UChar [0] align
           | [sizeof=64, dsize=63, align=8,
           |  nvsize=63, nvalign=8]

`#pragma pack` could still be beneficial as a more general approach to this, and in other places in the runtime.
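The reordering idea from the dumps above can be sketched on a simplified model. Both structures below are hypothetical stand-ins (the 16-byte `type` array merely plays the role of the union), so the exact sizes differ from the real envelope; on a typical 64-bit ABI where `std::uint64_t` is 8-byte aligned, moving `pe` into the gap after `core` should shrink this model from 72 to 64 bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of the original ordering (NOT the real envelope).
struct BeforeReorder {
  char          core[28];
  // With 8-byte alignment for type, 4 bytes of padding appear here.
  std::uint64_t type[2];     // stands in for the 8-byte-aligned union
  std::uint32_t pe;
  std::uint32_t totalsize;
  std::uint16_t shorts[4];   // ref, priobits, groupDepNum, epIdx
  std::uint8_t  attribs[3];
};

// Same members, with pe moved up to occupy the gap after core.
struct AfterReorder {
  char          core[28];
  std::uint32_t pe;          // fills bytes 28..31, so type needs no padding
  std::uint64_t type[2];
  std::uint32_t totalsize;
  std::uint16_t shorts[4];
  std::uint8_t  attribs[3];
};

static_assert(offsetof(AfterReorder, pe) == 28,
              "pe sits directly after core");
static_assert(sizeof(AfterReorder) <= sizeof(BeforeReorder),
              "reordering must not grow the structure");
```

Unlike `#pragma pack`, this keeps every member naturally aligned, which is why it needs no special compiler support.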
