Bug #159

Some CkCallback types are not valid across checkpoint/restart

Added by Phil Miller over 5 years ago. Updated 9 months ago.

Status: New
Priority: Low
Assignee:
Category: -
Target version:
Start date: 04/02/2013
Due date:
% Done: 0%


Description

Per #158, many types of CkCallback contain (possibly transitively, through other structures) raw pointers to objects in the system, like chares (via CkChareID) and functions. These callbacks cannot survive recovery from the kind of application-level checkpoints that Charm++ performs, because their targets may have changed in address from one execution to the next. In the chare case, we can potentially use a less transient identifier like chareIdx if that's stable and usable across restart. If chares get folded into the fixed-size global object ID work (#108), then that will apply to callbacks as well, and this will be fixed.
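To make the chare case concrete, here is a minimal sketch of a callback record that stores a stable ID instead of a raw pointer and resolves it through a table rebuilt after restart. Everything in it (`Counter`, `ObjectTable`, `ObjectCallbackRecord`, the ID value) is hypothetical and written for illustration; it does not use the real CkChareID, chareIdx, or CkCallback types, and it assumes the runtime can hand back the same ID for the same object after a restart.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a chare; not a Charm++ type.
struct Counter {
  int value = 0;
};

// Hypothetical table mapping stable IDs to live objects. It is rebuilt on
// every launch, so object addresses may differ between runs.
class ObjectTable {
 public:
  void place(std::uint64_t id, int initial) {
    assert(objects_.find(id) == objects_.end() && "ID already in use");
    auto obj = std::make_unique<Counter>();
    obj->value = initial;
    objects_[id] = std::move(obj);
  }

  Counter& resolve(std::uint64_t id) const {
    auto it = objects_.find(id);
    if (it == objects_.end()) throw std::out_of_range("unknown object ID");
    return *it->second;
  }

 private:
  std::unordered_map<std::uint64_t, std::unique_ptr<Counter>> objects_;
};

// Checkpointable callback record: holds only the ID, never an address.
struct ObjectCallbackRecord {
  std::uint64_t targetId;

  void deliver(const ObjectTable& table, int amount) const {
    table.resolve(targetId).value += amount;
  }
};
```

The record stays valid across a restart only as long as the ID-to-object mapping is restored before any callback is delivered, which is the stability property in question above.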

I'm less sure how to handle functions. It might be possible to have them registered explicitly and referenced by some ID instead of by pointer, but I'm uncertain whether that would actually work in the restart case either, unless the registration were in some very low-level code run at every process launch. If initnode calls happen even during restart, then that may suffice, but whoever works on this would have to check this pretty carefully.
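A rough sketch of the registration idea for functions, under the assumption that registration code runs at every process launch (including restart) and always registers the same functions in the same order. The names `CallbackFn`, `FnRegistry`, `FnCallbackRecord`, and `registerAll` are hypothetical and not part of Charm++; whether initnode calls can play the role of `registerAll` is exactly the open question above.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical signature for a function callback target.
using CallbackFn = void (*)(void* msg);

// Hypothetical registry: assigns small integer IDs in registration order.
// IDs match across runs only if every launch registers the same functions
// in the same order.
class FnRegistry {
 public:
  std::uint32_t add(CallbackFn fn) {
    assert(fn != nullptr && "cannot register a null function");
    table_.push_back(fn);
    return static_cast<std::uint32_t>(table_.size() - 1);
  }

  CallbackFn lookup(std::uint32_t id) const {
    if (id >= table_.size()) throw std::out_of_range("unknown function ID");
    return table_[id];
  }

 private:
  std::vector<CallbackFn> table_;
};

// Checkpointable callback record: stores the ID, not the function pointer.
struct FnCallbackRecord {
  std::uint32_t fnId;

  void invoke(const FnRegistry& reg, void* msg) const {
    reg.lookup(fnId)(msg);
  }
};

// Example targets, for illustration only.
inline void incrementCounter(void* msg) { ++*static_cast<int*>(msg); }
inline void doubleCounter(void* msg) { *static_cast<int*>(msg) *= 2; }

// Stand-in for low-level registration code run at every process launch.
inline FnRegistry registerAll() {
  FnRegistry reg;
  reg.add(incrementCounter);  // receives ID 0
  reg.add(doubleCounter);     // receives ID 1
  return reg;
}
```

Only `fnId` would be written to the checkpoint. After restart, a fresh call to `registerAll` rebuilds the table and the saved ID resolves to the function's new address.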

History

#1 Updated by Eric Bohm over 5 years ago

  • Assignee set to Jonathan Lifflander

Parcel out subcomponents of this task as needed.

#2 Updated by Phil Miller about 5 years ago

  • Target version changed from 6.6.0 to 6.7.0

#3 Updated by Eric Bohm almost 4 years ago

  • Priority changed from Normal to Low

#4 Updated by Phil Miller about 3 years ago

  • Target version changed from 6.7.0 to 6.8.0

#5 Updated by Phil Miller about 2 years ago

  • Assignee changed from Jonathan Lifflander to Phil Miller

#6 Updated by Phil Miller over 1 year ago

  • Target version changed from 6.8.0 to 6.8.1

This does not seem to affect any current applications, so deferring.

#7 Updated by Eric Bohm over 1 year ago

  • Target version changed from 6.8.1 to 6.9.0

#8 Updated by Phil Miller about 1 year ago

Eric, Ronak: what's the status of using 64-bit IDs to name plain chares?

#9 Updated by Eric Mikida about 1 year ago

I did some exploration toward integrating this. Fully converting singleton chare IDs to 64-bit IDs would take a lot of work, because many different chare IDs are already used in various places and they aren't always used as pure IDs. A quick-and-dirty fix to give every singleton chare a 64-bit ID is more doable, but I'm not sure how worthwhile it would be: it may require multiple API changes, first to add another ID to a chare as a temporary fix, and later to remove the other, obsolete IDs.

If this particular bug is critical, the plain IDs could be added quickly to (maybe) address it.

Ronak may have more input?

#10 Updated by Phil Miller 12 months ago

  • Assignee deleted (Phil Miller)

#11 Updated by Eric Bohm 11 months ago

  • Target version changed from 6.9.0 to 6.9.1

#12 Updated by Eric Bohm 9 months ago

  • Assignee set to Juan Galvez
