Bug #1830

ChaNGa deadlocks due to recent change to QD types

Added by Thomas Quinn 6 months ago. Updated 6 months ago.

Status: Merged
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 03/12/2018
Due date:
% Done: 0%


Description

I'm getting a deadlock when running ChaNGa with recent versions of Charm++. "git bisect" identifies the problem commit as:
commit 3137ae773a8af5bd476beafaeee4dabd168062bd
Change-Id: I2aed14aac27ea12b7f5680a602671f95bc78910d
QD: Convert to use more suitable types

Hence I suspect a race condition after moving from ints to CmiUInt8 for the QD counters.
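For illustration only, here is a minimal sketch of one generic way that moving counters from a signed type to an unsigned 64-bit type (such as CmiUInt8) can change behavior. It is not taken from the Charm++ source and is not a diagnosis of this bug; the struct and function names are hypothetical. It shows that a "created minus processed" difference that would be negative as a signed value wraps to a huge positive value when unsigned, so a check written for signed counters stops firing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-PE message counters (names are illustrative only).
struct Counters {
  std::uint64_t created;
  std::uint64_t processed;
};

// Signed difference: negative when more messages were processed than created.
std::int64_t signedDelta(const Counters& c) {
  return static_cast<std::int64_t>(c.created) -
         static_cast<std::int64_t>(c.processed);
}

// Unsigned difference: wraps modulo 2^64 instead of going negative.
std::uint64_t unsignedDelta(const Counters& c) {
  return c.created - c.processed;
}

// A check written with signed counters in mind.
bool looksIdleSigned(const Counters& c) { return signedDelta(c) <= 0; }

// The same check on an unsigned value: "<= 0" can only be true when the
// difference is exactly zero, so a transiently "negative" state is missed.
bool looksIdleUnsigned(const Counters& c) { return unsignedDelta(c) <= 0; }

// Small self-check of the behavior described above.
void checkExample() {
  const Counters c{3, 5};  // processed exceeds created on this PE
  assert(signedDelta(c) == -2);
  assert(looksIdleSigned(c));
  assert(!looksIdleUnsigned(c));
}
```

Whether anything like this applies to the QD counters is an assumption; the sketch only shows why a type change of this kind deserves a careful audit of every comparison and subtraction.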

Charm is built with:
./build ChaNGa gni-crayxc hugepages smp -j8 --with-production
using gcc 7.1.0.

History

#1 Updated by Sam White 6 months ago

Does this patch fix the problem? https://charm.cs.illinois.edu/gerrit/#/c/3844/

#2 Updated by Thomas Quinn 6 months ago

I tried changeset 3844, and I still get the hang.

#3 Updated by Sam White 6 months ago

But reverting 2897 makes the problem go away?

https://charm.cs.illinois.edu/gerrit/#/c/2897/

#4 Updated by Thomas Quinn 6 months ago

The problem goes away if I run with commit 8de6719c613f34a3d6b1baccb9aa4aa8f78c12db, the commit before 2897.

#5 Updated by Sam White 6 months ago

  • Target version set to 6.9.0

OK, we can just revert that change for now, since it's an easy one to revert and I'm not sure how it is causing the problem.

#7 Updated by Sam White 6 months ago

Could you provide the run command for ChaNGa that reproduces the deadlock? In case we want to pursue the datatype change in the future, we'll need to be able to run this case.

#8 Updated by Sam White 6 months ago

Actually, we found another issue in the original patch, and I updated this patch to fix it. Can you try ChaNGa on the updated version of the patch? https://charm.cs.illinois.edu/gerrit/#/c/3844/

#9 Updated by Thomas Quinn 6 months ago

Commit 6b1b6d708 fixes the problem.

#10 Updated by Sam White 6 months ago

  • Assignee set to Sam White
  • Status changed from New to Implemented
  • Subject changed from "Deadlock: possible race in QD" to "ChaNGa deadlocks due to recent change to QD types"

#11 Updated by Sam White 6 months ago

  • Status changed from Implemented to Merged
