Bug #1470

Investigate broken load balancers in mini-apps

Added by Michael Robson 10 days ago. Updated 5 days ago.

Status: New
Priority: Normal
Category: Load Balancing
Target version: -
Start date: 03/16/2017
Due date:
% Done: 0%


Description

Excerpt from an external email sent by Debashis Ganguly:

I am able to run the leanmd mini-app with 5 different load balancers in SMP mode. However, I am unable to run any other mini-app with any load balancer.

The Charm++ website mentions that AMR has an automatic feature to support load balancers. However, when I try to run AMR, it aborts with the error "Cannot insert array element twice!". The same happens with jacobi2d under the AMR directory within examples, which led me to believe the problem lies in the AMR library under ck-libs. By contrast, jacobi2D under the examples folder throws a memory corruption error when run.

I also downloaded LULESH from the LLNL website. Unfortunately, it is compatible only with an earlier version of Charm++. It doesn't print any debug messages when run with the +LBDebug option, so there is no way to tell whether the load balancer is working. Moreover, the performance is the same with and without a load balancer.

I have also tried running wave2d, which gives a segmentation fault after running for a while.

History

#1 Updated by Michael Robson 10 days ago

From Kavitha:

For the amr/jacobi2d example, I get the same error as Debashis on the current charm branch. It looks like it could be an old error: https://lists.cs.illinois.edu/lists/arc/charm/2010-05/msg00034.html

#2 Updated by Sam White 10 days ago

I think we agreed the AMR library could be removed from mainline charm entirely. If someone wants to do LB with AMR they should check out the AMR mini-app, not the library.

LULESH and wave2d should both be investigated and fixed.

#3 Updated by Sam White 6 days ago

  • Assignee set to Kavitha Chandrasekar
  • Category set to Load Balancing

#4 Updated by Kavitha Chandrasekar 5 days ago

LULESH can be run with load balancing after a few minor changes, such as updating uses of CmiTrue and atomic. The AtSync() calls are commented out by default, so to invoke the load balancer we need to add the AtSync() calls back in.

For wave2d, it might be better to invoke load balancing in AtSync mode instead of using periodic LB. It would also be good to investigate the load imbalance in the example.
