Project

General

Profile

Bug #1239

Cleanup reduction uses in the runtime

Added by Sam White almost 3 years ago. Updated over 1 year ago.

Status:
In Progress
Priority:
Normal
Category:
-
Target version:
-
Start date:
09/28/2016
Due date:
% Done:

40%


Description

Here are some best practices for Charm++ reductions that are not always followed in the runtime:

1. You should use built-in reducer types whenever possible, as in nop and tuple reducers.
- For reductions that have no contribution data and are only as synchronization points, prefer contribute(cb). Many places (see src/ck-ldb/ for examples) use contribute(0, NULL, CkReduction::sum_int, cb) instead. You can grep for "contribute(0" to find them.
- Look for custom reducer types (in src/ck-cp/, src/ck-perf/, and src/ck-ldb/) that could be expressed more simply as built-in reductions (most likely as tuple reductions).

2. The callback should be a [reductiontarget] entry method unless the result message is saved by the recv'er to be used later or the contribution is non-POD.
- This makes the code easier to read and takes away the responsibility of the recv'er to delete the message (plus [reductiontarget]'s are also [nokeep]).

3. Avoid having an extra copy on the sender side when possible. Reduction contributions are sometimes done via a temporary array. This incurs an extra dynamic memory allocation and copy on the sender, since the reduction framework needs to copy that array into a CkReductionMsg. Instead of allocating a temporary array, just allocate a CkReductionMsg and fill its data field with the contribution data directly and then contribute that. See src/ck-perf/trace-projections.C for a couple examples of extra copies.


Related issues

Related to Charm++ - Cleanup #1249: Deprecate setReductionClient and CkSetReductionClient Merged 10/05/2016

History

#1 Updated by Sam White almost 3 years ago

4. Do not use ckSetReductionClient, instead pass the callback to the contribute method. This makes the control flow easier to follow for readers of the code.

#2 Updated by Sam White almost 3 years ago

  • Status changed from New to In Progress

ck-ldb cleanup (except for the custom reduction type 'lbDataCollectionType' in MetaBalancer.C): https://charm.cs.illinois.edu/gerrit/#/c/1898/

#3 Updated by Sam White over 2 years ago

ck-perf/ cleanup (except for 2 custom reductions): https://charm.cs.illinois.edu/gerrit/#/c/1926/

#4 Updated by Karthik Senthil over 2 years ago

ck-perf cleanup update :
The 2 custom reductions in ck-perf/trace-projections.C which can be changed into tuple reductions are outlierReduction and minMaxReduction. These changes would also need some redesign in their corresponding reductiontarget functions i.e globalMetricRefinement and findNextMinMax.

#5 Updated by Phil Miller over 2 years ago

  • Status changed from In Progress to Merged
  • translation missing: en.field_closed_date set to 2016-11-30 11:32:09.824944

#6 Updated by Sam White over 2 years ago

  • % Done changed from 0 to 40
  • Status changed from Merged to In Progress
  • translation missing: en.field_closed_date deleted (2016-11-30 11:32:09.824944)

Only ck-ldb and ck-perf have been looked over and cleaned up. ck-cp, ck-pics, and ck-core remain.

#7 Updated by Sam White over 2 years ago

Another note for optimizing reductions:

If a custom reduction uses a fixed size message (the messages size doesn't grow with each contribution like for CkReduction::set or concat), it should be made 'streamable'. To do that, change the 'addReducer(refucerFn)' call to 'addReducer(reducerFn, true)'.

Also, all custom reductions with fixed sized messages should avoid intermediate memory allocation/copies. So if a reducer function looks like this:

CkReductionMsg* reducerFn(int nMsgs, CkReductionMsg** msgs) {
  double data[MSG_LEN]; // allocate temporary data array
  for (int i=0; i<MSG_LEN; i++) {  // copy the first message's data into temp
     data[i] = (double *)msgs[0]->getData()[i];
  }
  for (int i=1; i<nMsgs; i++) { // processes remaining messages
     for (int j=0; j<MSG_LEN; j++) {
       data[j] += (double *)msgs[i]->getData()[j];
     }
  }
  CkReductionMsg *retMsg = CkReductionMsg::buildNew(MSG_LEN * sizeof(double), data);
  return retMsg;
}

It can be rewritten to operate on the first message in place:

CkReductionMsg* reducerFn(int nMsgs, CkReductionMsg** msgs) {
  double *data = (double *)msgs[0]->getData(); // use msgs[0] directly instead of a temporary array
  for (int i=1; i<nMsgs; i++) { // processes remaining messages
     for (int j=0; j<MSG_LEN; j++) {
       data[j] += (double *)msgs[i]->getData()[j];
     }
  }
  return msgs[0];
}

#8 Updated by Sam White over 2 years ago

  • Target version changed from 6.8.0 to 6.8.1

#9 Updated by Justin Szaday about 2 years ago

I cleaned up all of the uses of `setReductionClient` in the tests except for those in `tests/charm++/megatest/reduction.C`, since they use unusual reductions that I was unsure how to port to a `CkCallback.` I also cleaned up all of the uses in the examples except for those in the `pencilFFT` and `msa` tests – as they were implied to be either deprecated or scheduled to be deprecated.

#10 Updated by Sam White almost 2 years ago

Can you add links to those gerrit patches here?

#11 Updated by Sam White almost 2 years ago

  • Target version changed from 6.8.1 to 6.9.0

#12 Updated by Karthik Senthil almost 2 years ago

  • Target version deleted (6.9.0)

Also available in: Atom PDF