Feature #1394

Node-level message aggregation for CkMulticast

Added by Juan Galvez almost 2 years ago. Updated about 1 year ago.

Status: In Progress
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 02/02/2017
Due date:
% Done: 0%


Description

Because CkMulticastMgr is a group, it sends multicast messages along a tree of PEs. The problem is that if one of the PEs in the tree is busy, it won't process multicast messages that other PEs in the same node could have processed instead.

The solution is to convert CkMulticastMgr to a nodegroup, so that the trees are built over logical nodes (processes) instead of PEs. Ideally, the spanning tree algorithm will also be physical-node aware when topology information is present.
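To illustrate the idea, here is a minimal C++ sketch of a spanning tree over logical nodes rather than PEs. It is not CkMulticastMgr code and uses no Charm++ API: the helper names (node_of, destination_nodes, tree_children, tree_parent), the contiguous PE numbering, the fixed number of PEs per node, and the k-ary tree shape are all assumptions made for the example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: logical node (process) that owns a PE, assuming PEs
// are numbered contiguously and every node has pes_per_node PEs.
int node_of(int pe, int pes_per_node) { return pe / pes_per_node; }

// Collapse a list of destination PEs into the sorted set of distinct
// logical nodes, so the tree has one vertex per node instead of one per PE.
std::vector<int> destination_nodes(const std::vector<int>& pes, int pes_per_node) {
    std::vector<int> nodes;
    for (int pe : pes) nodes.push_back(node_of(pe, pes_per_node));
    std::sort(nodes.begin(), nodes.end());
    nodes.erase(std::unique(nodes.begin(), nodes.end()), nodes.end());
    return nodes;
}

// Children of tree position `pos` in a k-ary spanning tree rooted at 0
// (assumed shape; not topology aware).
std::vector<int> tree_children(int pos, int num_positions, int branching) {
    std::vector<int> children;
    for (int i = 1; i <= branching; ++i) {
        int child = pos * branching + i;
        if (child >= num_positions) break;
        children.push_back(child);
    }
    return children;
}

// Parent of tree position `pos`, or -1 for the root.
int tree_parent(int pos, int branching) {
    return pos == 0 ? -1 : (pos - 1) / branching;
}
```

With 4 PEs per node, a multicast to PEs 0, 1, 5, 6 and 7 collapses to a tree over just two nodes (0 and 1). Any free PE in a node could then pick up the message, which is the behavior the nodegroup conversion is after.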

History

#1 Updated by Juan Galvez almost 2 years ago

  • Target version changed from 6.8.1 to 6.8.0

#2 Updated by Juan Galvez almost 2 years ago

  • Status changed from New to In Progress

#3 Updated by Sam White almost 2 years ago

  • Subject changed from Make CkMulticastMgr a nodegroup to Node-level message aggregation for CkMulticast

Core decided that a nodegroup should be added on top of the current group CkMulticastMgr.

#4 Updated by Phil Miller over 1 year ago

  • Target version changed from 6.8.0 to 6.8.1

This won't be an API change, AFAICT, so it could be done in a patch release.

#5 Updated by Sam White over 1 year ago

  • Target version changed from 6.8.1 to 6.9.0

Any update on this?

#6 Updated by Juan Galvez about 1 year ago

Currently debugging this on Blue Waters.

#7 Updated by Juan Galvez about 1 year ago

This is crashing on BW with 64 nodes.

The dependency chain for building the CkArray group is locMgr->mcastMgr->array. The crash appears to be caused by nodegroup dependencies not being supported (they are ignored). Because mcastMgr sits in the middle of the dependency chain, the end result is that no dependency at all is enforced during creation.
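A small C++ model of why the whole chain collapses, as a sketch only. It does not reflect the actual Charm++ runtime code: the Entity struct, enforced_edges and ckarray_chain are hypothetical names, and the model assumes that any dependency edge with a nodegroup on either end is silently dropped when nodegroup dependencies are unsupported.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one object to be created.
struct Entity {
    std::string name;
    bool is_nodegroup;
    std::string depends_on;  // name of the prerequisite, empty if none
};

// Returns the (prerequisite, dependent) pairs that would be enforced.
// Assumption: an edge touching a nodegroup is ignored unless nodegroup
// dependencies are supported.
std::vector<std::pair<std::string, std::string>>
enforced_edges(const std::vector<Entity>& entities, bool nodegroup_deps_supported) {
    std::vector<std::pair<std::string, std::string>> edges;
    for (const Entity& e : entities) {
        if (e.depends_on.empty()) continue;
        bool touches_nodegroup = e.is_nodegroup;
        for (const Entity& p : entities)
            if (p.name == e.depends_on && p.is_nodegroup) touches_nodegroup = true;
        if (touches_nodegroup && !nodegroup_deps_supported) continue;
        edges.emplace_back(e.depends_on, e.name);
    }
    return edges;
}

// The chain from this comment, with mcastMgr converted to a nodegroup.
std::vector<Entity> ckarray_chain() {
    return {{"locMgr", false, ""},
            {"mcastMgr", true, "locMgr"},
            {"array", false, "mcastMgr"}};
}
```

In this model, enforced_edges(ckarray_chain(), false) returns an empty list: both edges touch mcastMgr, so nothing orders locMgr before array. With nodegroup dependencies supported, both edges survive and the creation order is fully determined.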

#8 Updated by Juan Galvez about 1 year ago

  • Target version changed from 6.9.0 to Unscheduled

Respecting the dependencies during creation seems to solve the problems. Performance still needs to be tuned.

However, support for nodegroup dependencies does not yet exist in the main charm branch, and merging a good solution will probably take some time.
