Bug #1278

ReductionStarting overhead in SMP mode

Added by Phil Miller over 2 years ago. Updated over 2 years ago.

Status: Merged
Priority: Normal
Assignee: Eric Mikida
Category: SMP
Target version:
Start date: 11/02/2016
Due date:
% Done: 0%


Description

On a benchmark where every PE has array elements, I'm still seeing a lot of CkReductionMgr::ReductionStarting messages, which take up a large fraction of the runtime during the reduction phase. Given the changes a while back to quiet them when they are not needed, I don't think they should appear at all.

TimelineScreenshot.pdf (103 KB) Phil Miller, 11/02/2016 06:15 PM

UsageProfile.pdf (275 KB) Phil Miller, 11/02/2016 06:15 PM


Related issues

Related to Charm++ - Feature #29: Reduction Starting messages (Closed; 02/06/2013 to 02/28/2013)
Related to Charm++ - Cleanup #12: Factor out massive duplication in reductions (New; 02/05/2013)
Related to Charm++ - Bug #635: all trees should be pe/node/physnode/network topology aware (Merged; 12/19/2014)

History

#1 Updated by Phil Miller over 2 years ago

#2 Updated by Phil Miller over 2 years ago

The fix in #29 should have covered this.

#3 Updated by Phil Miller over 2 years ago

In projections traces, I noticed that each contribution was only triggering sends of ReductionStarting messages to other PEs in the same process. So, I tried a non-SMP build to see what would happen there. I saw no ReductionStarting messages at all, and performance was substantially improved.

Some optimization is missing in the node-level array reduction consolidation that would match the non-SMP behavior; a toy model of the resulting overhead follows.
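
As a rough illustration of the cost, consider a model where each contribution on a PE triggers one ReductionStarting send to every other PE in the same process. The sketch below is not Charm++ code, and the pesPerProcess and elementsPerPe values are made-up assumptions, not measurements from the benchmark:

  #include <cstdio>

  int main() {
    int pesPerProcess = 16;  // assumed PEs per OS process in the SMP build
    int elementsPerPe = 8;   // assumed contributing array elements per PE
    // One ReductionStarting send to each other PE in the process, per
    // contribution, gives elementsPerPe * (pesPerProcess - 1) sends per PE
    // per reduction.
    long sends = (long)elementsPerPe * (pesPerProcess - 1);
    printf("ReductionStarting sends per PE per reduction: %ld\n", sends);
    return 0;
  }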

#4 Updated by Phil Miller over 2 years ago

Harshitha seems to have started on code to fix this in branch harshitha/reductionstartingnode but not finished it.

#5 Updated by Phil Miller over 2 years ago

Of note: the code on that branch at least mildly conflicts with the more recent streaming reduction changes, and I don't have the will to dig into resolving those conflicts properly. So I don't know whether it would truly fix this case.

#6 Updated by Eric Bohm over 2 years ago

  • Assignee set to Eric Mikida

#7 Updated by Sam White over 2 years ago

  • Subject changed from Large fraction of time in CG solver taken by repetitive ReductionStarting to Large fraction of time in CG solver taken by repetitive ReductionStarting in SMP builds

#8 Updated by Phil Miller over 2 years ago

  • Subject changed from Large fraction of time in CG solver taken by repetitive ReductionStarting in SMP builds to SMP builds: Large fraction of time in CG solver taken by repetitive ReductionStarting

#9 Updated by Phil Miller over 2 years ago

Unconditionally defining GROUP_LEVEL_REDUCTION to 1 in ckreduction.h quiets this down, at the cost of some efficiency in the reduction itself: the tree becomes the usual 4-ary tree over PEs, rather than taking nodes into account.
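
For reference, here is a minimal sketch of the kind of k-ary PE spanning tree that fallback uses (branching factor 4, computed arithmetically over PE ranks, ignoring process and node boundaries). This is illustrative only, not the Charm++ internals:

  #include <cstdio>
  #include <vector>

  const int kBranch = 4;  // branching factor of the PE-level tree

  // Parent of a PE in the 4-ary tree rooted at PE 0.
  int parentOf(int pe) { return pe == 0 ? -1 : (pe - 1) / kBranch; }

  // Children of a PE, clipped to the total number of PEs.
  std::vector<int> childrenOf(int pe, int numPes) {
    std::vector<int> kids;
    for (int i = 1; i <= kBranch; ++i)
      if (pe * kBranch + i < numPes) kids.push_back(pe * kBranch + i);
    return kids;
  }

  int main() {
    for (int pe = 0; pe < 6; ++pe) {
      printf("PE %d -> parent %d, children:", pe, parentOf(pe));
      for (int c : childrenOf(pe, 21)) printf(" %d", c);
      printf("\n");
    }
    return 0;
  }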

#10 Updated by Sam White over 2 years ago

  • Subject changed from SMP builds: Large fraction of time in CG solver taken by repetitive ReductionStarting to ReductionStarting overhead in SMP mode

In Core we discussed the possibility of passing CkArrayOptions through to the reduction managers, just as is already done in CkArrayBroadcaster (where it is called 'stableLocations').
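
A minimal sketch of that idea, with hypothetical names (the flag, the setter, and notifyChildrenStart below are assumptions for illustration, not the real CkReductionMgr interface): the option would be carried on the array options at creation time and consulted before sending ReductionStarting, mirroring how CkArrayBroadcaster consults stableLocations.

  // Hypothetical sketch; names are illustrative, not the Charm++ API.
  struct ArrayOptionsSketch {
    bool disableNotifyChildrenStart = false;
  };

  class ReductionMgrSketch {
    bool disableNotifyChildrenStart = false;
   public:
    // Plumb the option through when the array's managers are created.
    void applyOptions(const ArrayOptionsSketch& opts) {
      disableNotifyChildrenStart = opts.disableNotifyChildrenStart;
    }
    void notifyChildrenStart() {
      if (disableNotifyChildrenStart) return;  // elide ReductionStarting sends
      // ... otherwise send ReductionStarting to child PEs in the tree ...
    }
  };

  int main() {
    ArrayOptionsSketch opts;
    opts.disableNotifyChildrenStart = true;  // e.g., when every PE contributes
    ReductionMgrSketch mgr;
    mgr.applyOptions(opts);
    mgr.notifyChildrenStart();  // returns immediately; no messages sent
    return 0;
  }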

#11 Updated by Sam White over 2 years ago

  • Category set to SMP

#12 Updated by Eric Mikida over 2 years ago

Implemented in https://charm.cs.illinois.edu/gerrit/2216 by making CkReductionMgr node aware when in SMP mode.
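
A rough sketch of what node awareness in the spanning tree can look like (illustrative only; not the actual change in gerrit/2216): PEs within a process report to their process's first PE, and only those first PEs participate in the inter-node k-ary tree, so the intra-process portion of the reduction can be handled locally.

  #include <cstdio>

  const int kBranch = 4;  // branching factor of the inter-node tree

  struct TopoSketch {
    int pesPerNode;
    int nodeOf(int pe) const { return pe / pesPerNode; }
    int firstPeOfNode(int node) const { return node * pesPerNode; }
  };

  // Parent of 'pe' in a node-aware tree: non-first PEs report within their
  // own process; first PEs follow a k-ary tree over nodes.
  int parentOf(int pe, const TopoSketch& t) {
    int node = t.nodeOf(pe);
    if (pe != t.firstPeOfNode(node)) return t.firstPeOfNode(node);
    if (node == 0) return -1;  // root of the whole tree
    return t.firstPeOfNode((node - 1) / kBranch);
  }

  int main() {
    TopoSketch t{4};  // assumed 4 PEs per process
    for (int pe = 0; pe < 12; ++pe)
      printf("PE %d -> parent %d\n", pe, parentOf(pe, t));
    return 0;
  }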

#13 Updated by Eric Mikida over 2 years ago

  • Status changed from New to Implemented


#14 Updated by Sam White over 2 years ago

The above commit leaves a TODO behind for the following:

  // TODO: This is currently gotten from the command-line...but it could
  // probably just be determined at runtime. I don't know if the CL option is
  // even documented anywhere.
  if(disableNotifyChildrenStart) return;

Documenting it here so it is not forgotten.

#15 Updated by Phil Miller over 2 years ago

  • Status changed from Implemented to Merged
  • Closed date set to 2017-02-15 13:41:03

And, merged. Finally.

#16 Updated by Phil Miller about 2 years ago

  • Related to Bug #635: all trees should be pe/node/physnode/network topology aware added
