Trace MPI_ functions in AMPI
If built with tracing enabled (or perhaps as a separate option), AMPI should insert user events into all MPI_ routines to mark their beginning and end. This could potentially be added to the AMPIAPI macro that appears at the start of every MPI_ function.
#1 Updated by Sam White about 3 years ago
- Subject changed from Trace MPI_ functions in AMPI with user events to Trace MPI_ functions in AMPI
Discussing this with Ronak, we realized we should use system events, as Charm does for entry methods, rather than user events, so that these stay separate from the application's own user events. Since we have deemed PMPI_ support impractical, this should be bumped up in priority.
We should be able to change AMPIAPI to include tracing calls at the beginning and end of each AMPI_ routine. AMPIAPI already takes the function name, so we shouldn't have to change anything other than the definition of AMPIAPI itself. The steps:
1. Add a routine that is called during AMPI's startup process (if tracing is enabled) to register every AMPI_ function with the tracing framework.
2. Add a call to start the trace for a particular function to TCharmAPIRoutine's constructor.
3. Add a call to stop the trace for a particular function to TCharmAPIRoutine's destructor.
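The three steps above can be sketched roughly as follows. This is only an illustration of the register-once, begin-in-constructor, end-in-destructor pattern: TraceLog, ApiRoutineTrace, registerTracedFunctions and tracedBarrier are all hypothetical names, and the in-memory log stands in for the real tracing framework rather than using any actual Charm++ tracing call.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the tracing framework (NOT the Charm++ API).
struct TraceLog {
  std::vector<std::string> names;    // registered function names; index is the event id
  std::vector<std::string> records;  // "begin <name>" / "end <name>" entries, in order

  int registerFunction(const std::string &name) {
    names.push_back(name);
    return static_cast<int>(names.size()) - 1;
  }
  void begin(int id) { records.push_back("begin " + names[id]); }
  void end(int id) { records.push_back("end " + names[id]); }
};

TraceLog &traceLog() {
  static TraceLog log;
  return log;
}

// Hypothetical analogue of TCharmAPIRoutine: starts the trace on entry (step 2)
// and stops it on scope exit (step 3), including early returns.
class ApiRoutineTrace {
  int id_;
public:
  explicit ApiRoutineTrace(int id) : id_(id) { traceLog().begin(id_); }
  ~ApiRoutineTrace() { traceLog().end(id_); }
  ApiRoutineTrace(const ApiRoutineTrace &) = delete;
  ApiRoutineTrace &operator=(const ApiRoutineTrace &) = delete;
};

static int barrierEventId = -1;

// Step 1: called once during startup, only when tracing is enabled.
void registerTracedFunctions() {
  barrierEventId = traceLog().registerFunction("AMPI_Barrier");
}

// What an API routine would look like once the entry macro expands to the guard.
int tracedBarrier() {
  ApiRoutineTrace trace(barrierEventId);
  // ... actual barrier work would go here ...
  return 0;
}
```

Because the guard is a scoped object, only the macro definition needs to change; the bodies of the individual routines stay untouched.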
We can do this with userBracketedEvents, but viewing them in Projections is not very helpful because the events span blocking calls. For example, if we run with virtualization and trace a call to MPI_Barrier, we see just one event, lasting from when the first rank on a PE reaches the barrier until the last rank on that PE exits it.
Instead, we may want to insert calls to stop tracing whenever we block inside AMPI, so that tracing is split phase for such routines.
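A minimal sketch of what split-phase tracing could look like, assuming a guard object that can be suspended before a blocking point and resumed afterwards. SplitPhaseTrace, splitTraceRecords and blockingBarrierSketch are hypothetical names, and the record vector again stands in for the real trace output.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory trace output.
std::vector<std::string> &splitTraceRecords() {
  static std::vector<std::string> records;
  return records;
}

// Hypothetical guard that closes the current event before the rank blocks
// and opens a new one when the rank is rescheduled.
class SplitPhaseTrace {
  std::string name_;
  bool active_;
public:
  explicit SplitPhaseTrace(const std::string &name) : name_(name), active_(false) {
    resume();
  }
  ~SplitPhaseTrace() { suspend(); }
  SplitPhaseTrace(const SplitPhaseTrace &) = delete;
  SplitPhaseTrace &operator=(const SplitPhaseTrace &) = delete;

  // Call immediately before blocking; a second call in a row is a no-op.
  void suspend() {
    if (active_) {
      splitTraceRecords().push_back("end " + name_);
      active_ = false;
    }
  }
  // Call immediately after being resumed; a second call in a row is a no-op.
  void resume() {
    if (!active_) {
      splitTraceRecords().push_back("begin " + name_);
      active_ = true;
    }
  }
};

void blockingBarrierSketch() {
  SplitPhaseTrace trace("AMPI_Barrier");
  // ... local work before blocking ...
  trace.suspend();
  // The rank would yield here while other ranks on the PE run.
  trace.resume();
  // ... local work after the barrier completes ...
}
```

With this shape, one blocking call produces two short events instead of one event that covers the time other ranks were running.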
Basically, look at some AMPI Projections traces in Timeline view and see how we can improve them incrementally.
#10 Updated by Sam White about 2 years ago
- Status changed from Implemented to Merged
2 things to follow up on:
1. Use unordered_map instead of map (use tr1::unordered_map if CMK_USING_XLC before 6.8.0)
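For item 1, the switch might look something like the sketch below. FuncMap and lookupEventId are hypothetical names, and the tr1 header path is an assumption that depends on which standard library the XLC toolchain uses; only CMK_USING_XLC comes from the note above.

```cpp
#include <string>

// Pick the hash map type based on the compiler. The <tr1/unordered_map>
// path is an assumption and may differ on a given XLC installation.
#if defined(CMK_USING_XLC) && CMK_USING_XLC
#include <tr1/unordered_map>
typedef std::tr1::unordered_map<std::string, int> FuncMap;
#else
#include <unordered_map>
typedef std::unordered_map<std::string, int> FuncMap;
#endif

// Returns the event id registered for a function name, or -1 if it is unknown.
int lookupEventId(const FuncMap &funcmap, const std::string &name) {
  FuncMap::const_iterator it = funcmap.find(name);
  return it == funcmap.end() ? -1 : it->second;
}
```

Since lookups happen on every traced call and iteration order does not matter, a hash map avoids the ordered tree's O(log n) string comparisons.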
2. From your progress report, why are the bracketed user events overlapping in time when it seems like they shouldn't be?
#12 Updated by Sam White about 2 years ago
Also, clean up heap memory allocated for the funcmap. I made a half-baked attempt at that here but it has issues noted here: https://charm.cs.illinois.edu/gerrit/#/c/2463/
To do this properly, the easiest thing might be to have the funcmap be owned by a Node Group. During the TCharm exit sequence, every thread would contribute to a reduction over that Node Group, with a callback that broadcasts to the Node Group and deletes that memory before continuing on to call CkExit().
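The ordering this relies on (free the map only after every thread has checked in, then exit) can be illustrated without any Charm++ machinery. The sketch below swaps in a plain single-process counter for the reduction and a flag for CkExit(); FuncMapOwner and its members are hypothetical and this is not how a Node Group is actually declared.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical owner of the funcmap. A plain counter stands in for the
// reduction; a real version would use a Node Group and a reduction callback.
class FuncMapOwner {
  std::map<std::string, int> *funcmap_;
  std::size_t expected_;     // number of threads that must contribute
  std::size_t contributed_;  // contributions received so far
  bool exited_;              // stands in for having called CkExit()
public:
  explicit FuncMapOwner(std::size_t nThreads)
      : funcmap_(new std::map<std::string, int>()),
        expected_(nThreads), contributed_(0), exited_(false) {}
  ~FuncMapOwner() { delete funcmap_; }
  FuncMapOwner(const FuncMapOwner &) = delete;
  FuncMapOwner &operator=(const FuncMapOwner &) = delete;

  // Each thread calls this once during its exit sequence.
  void contribute() {
    if (++contributed_ == expected_) onAllContributed();
  }
  bool mapFreed() const { return funcmap_ == nullptr; }
  bool exited() const { return exited_; }

private:
  // Runs only after the last contribution, so no thread can still be tracing.
  void onAllContributed() {
    delete funcmap_;
    funcmap_ = nullptr;
    exited_ = true;  // placeholder for continuing on to the real exit call
  }
};
```

The point of the design is that the delete happens strictly after the last contribution, so no thread can look up an event id in freed memory.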