Activity

From 02/16/2017 to 03/17/2017

03/16/2017

05:23 PM Bug #1471 (New): Parallel Prefix No Barrier Example in Charm Tutorial Hangs on MPI Layer
As pointed out on the charm mailing list, the parallel prefix no barrier example hangs when run on the mpi layer. I w... Michael Robson
04:52 PM Bug #1470: Investigate broken load balancers in mini-apps
I think we agreed the AMR library could be removed from mainline charm entirely. If someone wants to do LB with AMR t... Sam White
04:39 PM Bug #1470: Investigate broken load balancers in mini-apps
From Kavitha:
For the amr/jacobi2d example, I get the same error as Debashis, for current charm branch. It seems l...
Michael Robson
04:37 PM Bug #1470 (Closed): Investigate broken load balancers in mini-apps
Excerpt from an external email sent by Debashis Ganguly:
I am able to run leanmd mini-app with 5 different load b...
Michael Robson
02:34 PM Cleanup #1203: AMPI forces builds to be serial for ROMIO
Another annoyance with our build of ROMIO is if you build AMPI+ROMIO, then edit AMPI and do 'make AMPI' in the build... Sam White
02:15 PM Feature #1469 (Merged): Don't require migration constructors for all array objects at compile time
Phil Miller
02:14 PM Bug #484 (Merged): Topology aware spanning trees broken on XE6
Phil Miller

03/15/2017

04:41 PM Feature #1469: Don't require migration constructors for all array objects at compile time
BGQ XLC is happy enough with the fall-back option
BGQ GCC handles the new version cleanly
Manual updated to ref...
Phil Miller
04:27 PM Feature #1469: Don't require migration constructors for all array objects at compile time
BGQ XLC test *failed*. It will not accept the @#include <type_traits>@ as currently configured, because GCC's libstdc... Phil Miller
03:58 PM Feature #1469: Don't require migration constructors for all array objects at compile time
https://charm.cs.illinois.edu/gerrit/2313 Phil Miller
03:57 PM Feature #1469 (Merged): Don't require migration constructors for all array objects at compile time
Currently, every chare array element type is required to have a @CkMigrateMessage*@ constructor that would be used du... Phil Miller
02:17 PM Bug #1447 (Merged): AMPI_Reduce is broken for derived datatypes in user-defined reductions
Sam White
12:12 PM Bug #1341: AMPI fails to link on mpi-win64
Bump. With the following merged, these builds should be the only ones that are persistently failing on AMPI now: http... Sam White
12:10 PM Feature #1457 (Merged): Default option to choose between isomalloc and os-isomalloc
Sam White
11:25 AM Feature #1457 (Implemented): Default option to choose between isomalloc and os-isomalloc
Make charmc use os-isomalloc on Clang non-SMP, with a warning to the user: https://charm.cs.illinois.edu/gerrit/#/c/2... Sam White

03/14/2017

06:40 PM Documentation #1265: Document LLVM OpenMP runtime integration
The existing @-openmp@ flag should backend to the integrated support when the runtime has been built with the @omp@ o... Phil Miller
03:55 PM Feature #1237 (In Progress): Onesided sender side implementation for GNI layer
Phil Miller
03:41 PM Feature #1237: Onesided sender side implementation for GNI layer
Since there are still several things to fix up here, and we want to get the beta out the door, we're deferring this to ... Phil Miller
03:55 PM Feature #1234 (Merged): Avoid sender-side copy for large contiguous messages. API for charm and c...
API and some machine layer implementations merged. Other machine layers are still pending, but we want to check the b... Phil Miller
03:39 PM Bug #1148 (Merged): Define 'thisIndex' for Groups
Phil Miller
07:41 PM Bug #1148 (In Progress): Define 'thisIndex' for Groups
Karthik Senthil
03:38 PM Bug #1430 (Merged): CthThread tracing broken - threads show up as black, dummy_thread_chare assig...
Phil Miller
01:48 PM Bug #1453 (Merged): Branching factor for group sections is not handled correctly.
Fix was merged over a week ago Phil Miller
12:36 PM Feature #1238 (Merged): Onesided sender side implementation for Verbs layer
Phil Miller
12:12 PM Bug #901 (In Progress): Threads awoken by CthAwaken don't let Projections trace back to the event...
Phil Miller

03/13/2017

11:48 AM Bug #1430 (Implemented): CthThread tracing broken - threads show up as black, dummy_thread_chare ...
https://charm.cs.illinois.edu/gerrit/#/c/2302/
Full revert for now, since the affected functionality is more critica...
Phil Miller
01:39 AM Feature #1458 (Implemented): Zero-copy send support for the MPI machine layer
Gerrit Link: https://charm.cs.illinois.edu/gerrit/#/c/2299/3 Nitin Bhat

03/11/2017

07:15 PM Bug #1148 (Implemented): Define 'thisIndex' for Groups
Patch on Gerrit: https://charm.cs.illinois.edu/gerrit/#/c/2298/ Karthik Senthil

03/10/2017

01:12 PM Feature #1468 (Merged): Enable pre-pinning memory for the zero-copy message sends through the Ent...
The cost of memory pinning on Verbs and GNI is high, so we'd like to enable users to pre-pin memory for use in later ... Sam White
11:25 AM Bug #1148: Define 'thisIndex' for Groups
The idea behind making it static was that we could save on memory overhead by having one instance of it. We can discu... Sam White
08:12 PM Bug #1148 (In Progress): Define 'thisIndex' for Groups
Why should @thisIndex@ be defined as a static variable?
I have a current implementation which adds @thisIndex@ as ...
Karthik Senthil
11:16 AM Feature #1467 (Rejected): Avoid memory pinning overhead for RDMA sends within a process
If an RDMA message is being sent to another object in the same process, we already do a direct memcpy rather than an ... Sam White
11:12 AM Feature #969: AMPI support for collectives on inter-communicators
https://charm.cs.illinois.edu/gerrit/#/c/2084/7 contains a consolidated lump of
> - bcast, ibcast
> - barrier, i...
Phil Miller
11:08 AM Feature #969: AMPI support for collectives on inter-communicators
Gather/Igather in progress:
https://charm.cs.illinois.edu/gerrit/2183
Phil Miller
10:59 AM Feature #969: AMPI support for collectives on inter-communicators
Bcast/Ibcast
https://charm.cs.illinois.edu/gerrit/#/c/2127/9
Phil Miller
10:58 AM Feature #969: AMPI support for collectives on inter-communicators
Scatter/Iscatter:
https://charm.cs.illinois.edu/gerrit/#/c/2285/2
Phil Miller
10:56 AM Feature #969: AMPI support for collectives on inter-communicators
Turn enumeration into a checklist, to test out the plugin and track completion mechanically. Phil Miller

03/09/2017

02:49 PM Bug #1462: Programs hang at startup with CUDA build
There is a mempool initialization step that does a bunch of @cudaMallocHost@ calls (which I believe has nothing to do... Jaemin Choi
09:20 PM Bug #1462: Programs hang at startup with CUDA build
You noticed this line, right?... Jim Phillips
11:49 AM Feature #921: Entry tag [inline] is unable to optimize away most of the overhead
Comment 18's definition of the rvalue conversion operator was missing @t = space;@ in the non-owned case Phil Miller
09:20 AM Documentation #1466 (Merged): Update list of available load balancing strategies in the manual
The LB section of the Charm++ manual has a list of available load balancing strategies, which is missing some.
In pa...
Sam White
12:57 AM Feature #1465 (Implemented): Spanning Tree implementation for scatterv
Patch: https://charm.cs.illinois.edu/gerrit/#/c/2296/ Vipul Harsh
09:30 PM Feature #1465 (Implemented): Spanning Tree implementation for scatterv
Use a spanning tree implementation for optimised scatterv Vipul Harsh

03/08/2017

04:36 PM Support #1079: Remove deprecated machine layers and retired machines from Autobuild
Here are some machine layers that are still around that I think we might be able to remove completely:... Sam White
04:21 PM Bug #1464: CUDA example programs hang when run with 1 PE
Thought it might be because of the handler functions indices, so wrapped and moved out @CmiRegisterHandler()@ calls t... Jaemin Choi
12:34 AM Bug #1464 (Closed): CUDA example programs hang when run with 1 PE
CUDA example programs (@overlapTest@, @concurrentKernels@, @callbacks@, etc.) hang when they are run with only 1 PE.
...
Jaemin Choi
03:49 PM Bug #815: Makefile for hybrid API is not using the system OPTS
Reverted for now, working on a long-term solution Michael Robson
03:20 PM Bug #1462 (In Progress): Programs hang at startup with CUDA build
The exact same code (@examples/charm++/hello/1darray@) with 4 PEs crashes on @nano6@ but runs fine on @nano7@.
GPUMana...
Jaemin Choi
08:21 AM Bug #1462: Programs hang at startup with CUDA build
There is no way cudaStreamCreate should hang due to GPU load so this would be good to track down. All I can think is... Jim Phillips
12:04 AM Bug #1462 (Closed): Programs hang at startup with CUDA build
This was due to the GPU being heavily used by other processes.
@nvidia-smi -q@ shows the current usage.
When tested...
Jaemin Choi
06:19 PM Bug #1462: Programs hang at startup with CUDA build
The culprit is @cudaStreamCreate()@ in @GPUManager::initHybridAPIHelper()@. No idea why this is the cause yet, becaus... Jaemin Choi

03/07/2017

05:45 PM Bug #1462 (In Progress): Programs hang at startup with CUDA build
Problem seems to be in @initHybridAPI()@ in @ck-core/init.C@, because programs run fine if I comment this out along ... Jaemin Choi
05:20 PM Bug #1462 (Closed): Programs hang at startup with CUDA build
When using the CUDA build of Charm++, example programs located both under @examples/charm++/cuda@ hang at startup.
...
Jaemin Choi
04:06 PM Bug #1440: smp pes sending messages still block due to other send activity
Re-assigning it to Karthik since he's going to do the projections/performance tests. Bilge Acun

03/06/2017

05:35 PM Feature #1303 (In Progress): Implement MPI-R debugging hooks to support Allinea DDT and Rogue Wav...
Phil Miller
01:15 PM Bug #1410: Tuple reducer leaks memory when using set/concat/custom reducers
Fix for tuple/set reductions over node groups: https://charm.cs.illinois.edu/gerrit/#/c/2291/ Sam White
12:01 PM Feature #1235 (Merged): Onesided sender side implementation for PAMI layer
Phil Miller
11:53 AM Feature #1459 (New): Zero-copy send support for the netlrts machine layer
In the netlrts machine layer, it's pretty easy to stream data from an arbitrary address to a remote recipient on requ... Phil Miller
11:51 AM Feature #1458 (Merged): Zero-copy send support for the MPI machine layer
As Jim noted, it's straightforward for MPI to do in-place sends, and newer versions actually allow RMA, so we should ... Phil Miller
08:25 AM Bug #1445 (Merged): mpi-crayxc failure in megampi
Fixed by making context threads the default on mpi-crayxc: https://charm.cs.illinois.edu/gerrit/#/c/2282/ Sam White

03/05/2017

01:06 PM Documentation #1265: Document LLVM OpenMP runtime integration
bump Sam White
10:07 AM Bug #1447 (Implemented): AMPI_Reduce is broken for derived datatypes in user-defined reductions
Oops, not yet Sam White
10:06 AM Bug #1447 (Merged): AMPI_Reduce is broken for derived datatypes in user-defined reductions
Sam White
08:27 PM Bug #1447 (Implemented): AMPI_Reduce is broken for derived datatypes in user-defined reductions
Implemented user-defined reductions on non-contiguous derived datatypes using the non-commutative reduction code path... Sam White
10:06 AM Feature #1457 (Merged): Default option to choose between isomalloc and os-isomalloc
-memory isomalloc does not work on Clang non-SMP builds. -memory os-isomalloc does not work on SMP builds, so add a n... Sam White

03/04/2017

06:07 PM Feature #1456 (Feedback): Add more stream callbacks for use after HToD transfer and kernel execution
Patch uploaded to gerrit for review.
[[https://charm.cs.illinois.edu/gerrit/#/c/2288/]]
Jaemin Choi

03/03/2017

04:25 PM Bug #1453 (Implemented): Branching factor for group sections is not handled correctly.
https://charm.cs.illinois.edu/gerrit/#/c/2287 Vipul Harsh
06:37 PM Bug #1453 (Merged): Branching factor for group sections is not handled correctly.
Vipul Harsh
02:34 PM Cleanup #1454: GPUManager API change
Buffer ID (-1) should be last param and set to -1 by default
Also, is there a way to mark copy both ways?
ints ...
Michael Robson
12:48 PM Cleanup #1454 (Feedback): GPUManager API change
Change pushed to gerrit for review.
[[https://charm.cs.illinois.edu/gerrit/#/c/2283/]]
Jaemin Choi
12:06 PM Cleanup #1454 (Merged): GPUManager API change
Making changes to current GPUManager API to provide a more uniform & segregated API (function calls now start with ha... Jaemin Choi
01:30 PM Feature #1456 (Merged): Add more stream callbacks for use after HToD transfer and kernel execution
The user may have use for callbacks that occur after completion of host-to-device data transfer and kernel execution,... Jaemin Choi
12:54 PM Bug #1424 (Merged): Improve performance of randomized message queueing
Phil Miller
11:38 AM Bug #1288 (Rejected): net-cygwin: megacon crashes when blkinhand and posixth are enabled
net-cygwin is deprecated Sam White
11:38 AM Bug #1287 (Rejected): net-cygwin crashes in megatest/{queens,tempotest,synctest} under --with-pro...
net-cygwin is deprecated Sam White
11:37 AM Bug #1291 (Rejected): net-cygwin crashes in examples/bigsim/sdag
net-cygwin is deprecated Sam White
11:37 AM Bug #1290 (Rejected): net-cygwin crashes in examples/bigsim/emulator
net-cygwin is deprecated Sam White
09:46 AM Feature #1234: Avoid sender-side copy for large contiguous messages. API for charm and converse l...
API patch merged:
https://charm.cs.illinois.edu/gerrit/1273
As expected, no breakage in autobuild resulted.
Phil Miller
06:18 PM Bug #1452: verbs-linux-ppc64le xlC
Do any PPLers have access to summit-dev? If so assign to one of them. Sam White

03/02/2017

05:20 PM Bug #1452 (Merged): verbs-linux-ppc64le xlC
Need to support verbs (vs net-ibverbs) with xlC on Summit-dev.
Hard to tell people to try verbs because net-ibverbs ...
Jim Phillips
01:40 PM Bug #1447: AMPI_Reduce is broken for derived datatypes in user-defined reductions
Abort: https://charm.cs.illinois.edu/gerrit/#/c/2280/
I can probably get to the 'gather' implementation tomorrow o...
Sam White
01:31 PM Bug #1447: AMPI_Reduce is broken for derived datatypes in user-defined reductions
As noted in meeting, there are two potential quick fixes to at least get codes hitting this case running or failing c... Phil Miller
01:26 PM Bug #1329 (Merged): Hang in exit in TRAM test on gni-crayxc-smp
Karthik Senthil

03/01/2017

03:18 PM Feature #109: Test and merge section ID and manager work
What got merged as a result of this? Is it actually moot due to other changes? Phil Miller
10:29 AM Bug #1430: CthThread tracing broken - threads show up as black, dummy_thread_chare assigned to fo...
Tested 2269, threads trace back to where they were created rather than where they were awakened. Jim Phillips
06:46 PM Feature #1451 (Implemented): NVTX integration for profiling
Uploaded to gerrit for review.
[[https://charm.cs.illinois.edu/gerrit/#/c/2276/]]
Jaemin Choi
06:03 PM Feature #1451 (Merged): NVTX integration for profiling
Integrating NVIDIA Tools Extension Library (NVTX) in GPUManager for profiling CUDA behavior. Jaemin Choi
06:20 PM Bug #1341: AMPI fails to link on mpi-win64
Can you look at this this week, and if you can't fix it easily, then implement a workaround so that we don't test A... Sam White
06:07 PM Feature #1393: Redesign of Hybrid API (GPU Manager) to support concurrent kernel execution
Previous gerrit commit split into multiple smaller ones.
New gerrit commit: [[https://charm.cs.illinois.edu/gerrit/#...
Jaemin Choi
06:06 PM Feature #1450 (Resolved): Clean up and add CUDA example programs
On gerrit for review.
[[https://charm.cs.illinois.edu/gerrit/#/c/2275/]]
Jaemin Choi

02/28/2017

02:29 PM Feature #1450 (Merged): Clean up and add CUDA example programs
CUDA example programs under @examples/charm++/cuda@ need to be refactored with the new design of GPUManager.
Especia...
Jaemin Choi
01:31 PM Bug #1445: mpi-crayxc failure in megampi
The stack trace looks like a failure during a context switch but is pretty opaque. What is weird is that the other AM... Sam White
01:26 PM Bug #1445: mpi-crayxc failure in megampi
The following is a trace of the crash on Edison:... Karthik Senthil
01:28 PM Feature #1449 (New): AMPI support for MPI_Win_allocate_shared
We already have an application of interest that uses this. Supporting its basic functionality is trivial: just alloca... Sam White

02/27/2017

04:28 PM Bug #1448 (Merged): Potential buffer overflows in fscanf()
There is a potential buffer overflow in mylogin() when having a long username ... Matthias Diener
03:22 PM Bug #1430 (In Progress): CthThread tracing broken - threads show up as black, dummy_thread_chare ...
https://charm.cs.illinois.edu/gerrit/2269 should clear things up for the non-[threaded] entry method case. I'll still... Phil Miller
03:11 PM Bug #1430: CthThread tracing broken - threads show up as black, dummy_thread_chare assigned to fo...
Actually, while potentially nice, I realize that solution won't be acceptable for AMPI, because apparently thread lis... Phil Miller
03:08 PM Bug #1430: CthThread tracing broken - threads show up as black, dummy_thread_chare assigned to fo...
Jim, could you try things out with mainline charm (without that change reverted) with added calls to traceAddThreadLi... Phil Miller
11:21 AM Bug #1447 (Merged): AMPI_Reduce is broken for derived datatypes in user-defined reductions
Users are allowed to use derived datatypes in custom reduction functions, but AMPI does not support that and might si... Sam White

02/26/2017

02:28 PM Feature #1446 (Merged): AMPI support for generalized requests
The newest version of ROMIO uses grequests to implement NBC IO routines.
This will require changes throughout all of...
Sam White
11:37 AM Bug #1312: Deleting an array disables reclamation for all arrays bound to that location manager
I think this should be prioritized for merge before the release. Sam White
11:23 AM Bug #1329: Hang in exit in TRAM test on gni-crayxc-smp
Debugged the test case with various setups. Following are some notes from the experiments:
* The hang is not relate...
Karthik Senthil
10:14 AM Bug #1445 (Merged): mpi-crayxc failure in megampi
megampi fails on mpi-crayxc build on Edison on the first run with multiple PEs, with +p2 +vp2. Possibly due to migrat... Sam White

02/24/2017

05:15 PM Bug #1205: AMPI's -tlsglobals option is only supported by GCC
Nope, never mind. No matter what TLS model is in use, it generates instructions that access an offset from the segmen... Phil Miller
05:07 PM Bug #1205: AMPI's -tlsglobals option is only supported by GCC
So, first note is that on LLVM (as of today) the flag in question is only relevant for the "initial exec" and "local ... Phil Miller
01:37 PM Bug #1205: AMPI's -tlsglobals option is only supported by GCC
On a related note, Clang's -femulated-tls option may be useful for us. Phil Miller
09:42 AM Bug #1429 (Merged): AMPI failure after Isomalloc migration on multicore/SMP darwin builds
Phil Miller
07:12 PM Bug #1429 (Implemented): AMPI failure after Isomalloc migration on multicore/SMP darwin builds
Sam White

02/23/2017

03:36 PM Bug #1429: AMPI failure after Isomalloc migration on multicore/SMP darwin builds
I took a first stab at the problem by explicitly providing new/delete functions on darwin.
https://charm.cs.illino...
Matthias Diener
06:31 PM Bug #1429: AMPI failure after Isomalloc migration on multicore/SMP darwin builds
After some more debugging, it seems libc++ of clang is calling malloc after all, but this call does not get redirecte... Matthias Diener
12:06 PM Feature #1444: General Implementation for Serializing Enums
Thanks for your patch, Nils. If you'd like, you can directly submit it for code review and merge at https://charm.cs... Ronak Buch
08:01 AM Feature #1433: lambda syntax for CkLoop
Prototype implementation attached.
Missing reductions (more complex reductions should be supported anyway).
Doe...
Jim Phillips
07:47 AM Feature #1234: Avoid sender-side copy for large contiguous messages. API for charm and converse l...
It would be good to have an MPI implementation as well. Should be easy, since MPI 1.0 supported sending data in place. Jim Phillips

02/22/2017

05:14 PM Feature #1444: General Implementation for Serializing Enums
Yes, that makes sense. I'm really glad to hear full C++11 support will be required! We are currently using C++11 and ... Nils Deppe
05:11 PM Feature #1444: General Implementation for Serializing Enums
Thanks for pointing this out. After the release of Charm++ 6.8.0 (in the coming weeks), we will begin requiring full ... Sam White
03:02 PM Feature #1444 (Merged): General Implementation for Serializing Enums
Currently it's not possible to serialize enums in general. However, by using C++11's @std::is_enum@ this can be done ... Nils Deppe
05:13 PM Bug #1429: AMPI failure after Isomalloc migration on multicore/SMP darwin builds
The above is seemingly true only on darwin. The workaround fix for jacobi.C (use malloc instead of new) would be good... Sam White
04:31 PM Bug #1429: AMPI failure after Isomalloc migration on multicore/SMP darwin builds
Clang's new operator seems to not use malloc(). Therefore, the malloc is not intercepted by isomalloc, resulting in ... Matthias Diener
12:16 PM Bug #1429: AMPI failure after Isomalloc migration on multicore/SMP darwin builds
It crashes with a segfault when running with +p2 instead of +p3. Matthias Diener
02:56 PM Bug #1443 (Merged): Serialization for std::unique_ptr Fails With Abstract Base Class
Attempting to serialize a std::unique_ptr fails when serializing an abstract base class with the error:... Nils Deppe
02:17 PM Bug #1442: CkLoop fixed tree limits helper recruitment
Proof-of-concept patch attached. With this change I see the expected behavior, with all PEs able to steal work.
Iss...
Jim Phillips
12:00 PM Bug #1442 (Merged): CkLoop fixed tree limits helper recruitment
CkLoop uses a tree of branching 4 when CkMyNodeSize() >= 8. This means that if you have ppn=9 and rank 0 calls CkLoo... Jim Phillips
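Feature #1444 above mentions that C++11's @std::is_enum@ makes generic enum serialization possible. A minimal sketch of that idea follows. It is not the submitted patch and not the Charm++ PUP API: @ByteBuffer@, @pack@, @unpack@, and @Color@ are hypothetical names introduced only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a serializer; NOT the Charm++ PUP::er interface.
struct ByteBuffer {
    std::vector<unsigned char> bytes;
    std::size_t read_pos = 0;
};

// Enabled only for enum types: writes the enum as its underlying integer type.
template <typename T>
typename std::enable_if<std::is_enum<T>::value>::type
pack(ByteBuffer& buf, T value) {
    using U = typename std::underlying_type<T>::type;
    const U raw = static_cast<U>(value);
    unsigned char tmp[sizeof(U)];
    std::memcpy(tmp, &raw, sizeof(U));
    buf.bytes.insert(buf.bytes.end(), tmp, tmp + sizeof(U));
}

// Enabled only for enum types: reads the underlying integer and casts back.
// No bounds checking is done here; a real implementation would need it.
template <typename T>
typename std::enable_if<std::is_enum<T>::value, T>::type
unpack(ByteBuffer& buf) {
    using U = typename std::underlying_type<T>::type;
    U raw;
    std::memcpy(&raw, buf.bytes.data() + buf.read_pos, sizeof(U));
    buf.read_pos += sizeof(U);
    return static_cast<T>(raw);
}

// Example enum with an explicit one-byte underlying type.
enum class Color : std::uint8_t { Red = 1, Green = 2 };
```

Because the overloads are constrained with @std::enable_if@, they participate in overload resolution only for enum types, so they can coexist with serializers for other types.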

02/21/2017

04:03 PM Bug #1441: Lies at startup about "The comm. thread both sends and receives messages"
More specifically, verbs and net-ibverbs only use the comm thread for receives. Jim Phillips
02:58 PM Bug #1441 (Merged): Lies at startup about "The comm. thread both sends and receives messages"
There is logic for setting and reporting Cmi_smp_mode_setting at startup, but this only reflects the behavior of the ... Jim Phillips
02:52 PM Bug #1440: smp pes sending messages still block due to other send activity
On further research it appears that verbs/net-ibverbs only uses the comm thread for receiving so there is clearly som... Jim Phillips
02:40 PM Bug #1438: off-node messages show created by receiving node comm thread
Yes, Timeline.
Hovering shows "Created by: CommP (N0)".
Details window shows "CREATED BY: Processor 220" (for a 220...
Jim Phillips
12:25 PM Bug #1438: off-node messages show created by receiving node comm thread
I assume this is about what's seen in Projections Timeline, and not some other tool.
Does the point on the comm th...
Phil Miller
12:39 PM Bug #484: Topology aware spanning trees broken on XE6
Fix here:
https://charm.cs.illinois.edu/gerrit/#/c/2078/
But I'll just note that this fix is probably irrelevant ...
Juan Galvez
12:29 PM Bug #1149 (Merged): Cray CC builds are broken
Corresponding fixes for XE systems in bug #1404 here:
https://charm.cs.illinois.edu/gerrit/#/c/2252/
Seems like a...
Phil Miller
12:23 PM Bug #1329: Hang in exit in TRAM test on gni-crayxc-smp
Fix in progress here:
https://charm.cs.illinois.edu/gerrit/#/c/2236/
Phil Miller
12:18 PM Bug #1397 (Rejected): Document that array creation must occur on PE0
Unless we hear otherwise, this wasn't actually a problem that was occurring. Something else was conflated with this e... Phil Miller
10:24 PM Bug #1430: CthThread tracing broken - threads show up as black, dummy_thread_chare assigned to fo...
This also renders AMPI Projections traces almost useless Sam White

02/20/2017

02:01 PM Bug #1439: net-linux-x86_64-ibverbs-smp-iccstatic with tracing or debug enabled segfaults with ++...
Patch attached. Jim Phillips
01:28 PM Bug #1439: net-linux-x86_64-ibverbs-smp-iccstatic with tracing or debug enabled segfaults with ++...
Cannot build with "--disable-ccs" as a workaround because charmrun fails to compile:... Jim Phillips
12:57 PM Bug #1439: net-linux-x86_64-ibverbs-smp-iccstatic with tracing or debug enabled segfaults with ++...
This issue was known and fixed in lrts last spring:
https://charm.cs.illinois.edu/gerrit/gitweb?p=charm.git;a=commi...
Jim Phillips
12:33 PM Bug #1376 (Merged): AMPI_Ireduce only creates a request at the root
Phil Miller
10:23 AM Bug #1440 (New): smp pes sending messages still block due to other send activity
Despite #642 being merged I still see pes that are sending messages blocked waiting on some shared resource in smp bu... Jim Phillips

02/19/2017

03:48 PM Bug #1439 (Merged): net-linux-x86_64-ibverbs-smp-iccstatic with tracing or debug enabled segfault...
verbs-linux-x86_64-iccstatic works fine. Trying to do a performance comparison.
net-linux-x86_64-ibverbs-smp-iccs...
Jim Phillips

02/18/2017

02:19 PM Feature #1321: multiple communication threads per process
Multiple communication threads per process would be good for distributing Charm++ internal work, but will not help PS... Jim Phillips
02:08 PM Feature #873: send messages from non-PE threads
I observe that #1393 seems to have solved this issue for GPUManager. Jim Phillips
02:04 PM Projections Feature #698: trace messages forward through communication threads
Can this get a target version please? Jim Phillips
01:26 PM Feature #1234: Avoid sender-side copy for large contiguous messages. API for charm and converse l...
Assuming pup methods are available for the data, it should be possible to run the pup in a Cthread with the data goin... Jim Phillips
01:12 PM Feature #1040: support multiple InfiniBand cards per node
Confirming that both HCAs connect to a single network (rather than two parallel networks):... Jim Phillips
12:47 PM Feature #1040: support multiple InfiniBand cards per node
Target platform is now Summitdev:
https://www.olcf.ornl.gov/kb_articles/summitdev-quickstart/#Hardware
If we end ...
Jim Phillips
07:34 PM Feature #1393 (Implemented): Redesign of Hybrid API (GPU Manager) to support concurrent kernel ex...
Up for gerrit review. Jaemin Choi

02/17/2017

03:05 PM Bug #901: Threads awoken by CthAwaken don't let Projections trace back to the event that woke them
With change 989 reverted I now see correct tracing for Cthreads. Jim Phillips
12:35 PM Bug #901: Threads awoken by CthAwaken don't let Projections trace back to the event that woke them
Jim: Could you try things out with the above change reverted? Looking at the code again, I realize I may have actuall... Phil Miller
12:21 AM Bug #901: Threads awoken by CthAwaken don't let Projections trace back to the event that woke them
Also, threads are created by CthCreate(), not as [threaded] entries. Jim Phillips
11:59 PM Bug #901: Threads awoken by CthAwaken don't let Projections trace back to the event that woke them
This does not appear to work for threads as used in NAMD where threads are awoken from regular entries rather than ot... Jim Phillips
02:39 PM Bug #1438 (Merged): off-node messages show created by receiving node comm thread
Entries for messages sent by other nodes show that they are created by the communication thread of the receiving node ... Jim Phillips
02:35 PM Bug #1437 (Merged): CkLoop worker traces to previous entry on pe rather than caller
CkLoop work shows up as ckloop_converse_chare::CkLoop with "created by" and "executed on" both correct, but tracing f... Jim Phillips
02:23 PM Feature #1436 (In Progress): trace CcdCallFnAfter() causality
NAMD uses CcdCallFnAfter() to poll for GPU completion, which breaks the causality chain. It should be possible to tr... Jim Phillips
01:42 PM Feature #1435 (New): Collapse prioritized msg buckets into priority queue
The msgq keeps a mapping between priority values and msg buckets, which would be unnecessary if we collapsed the bucke... Sam White
12:51 PM Feature #1434 (Merged): optimize degenerate CkLoop cases
When calling CkLoop_Parallelize with numChunks == 1 and no caller function the calling PE should do the work itself.
...
Jim Phillips
12:43 PM Feature #1433 (Merged): lambda syntax for CkLoop
Support syntax similar to the following:... Jim Phillips
12:20 PM Bug #1424 (Implemented): Improve performance of randomized message queueing
https://charm.cs.illinois.edu/gerrit/2258 Phil Miller
12:08 PM Documentation #1432 (Merged): Document CkLoop caller function
This API change is not reflected in the manual:
https://charm.cs.illinois.edu/gerrit/#/c/947/
Jim Phillips
10:30 AM Bug #1431 (New): Charmxi runs out of memory
There was an email on the charm mailing list not too long ago from an ADHydro developer here: https://lists.cs.illino... Sam White
10:21 AM Bug #1430 (Merged): CthThread tracing broken - threads show up as black, dummy_thread_chare assig...
Running NAMD with three patches per pe, each of which has a Cthread (created by CthCreate, managed by CthSuspend and ... Jim Phillips
09:14 AM Bug #1429 (Merged): AMPI failure after Isomalloc migration on multicore/SMP darwin builds
... Sam White

02/16/2017

03:44 PM Feature #1394: Node-level message aggregation for CkMulticast
Core decided that a Node Group should be added on top of the current Group CkMulticastMgr Sam White
03:43 PM Bug #1404 (Merged): Support Cray CC on {mpi,gni}-crayxe
Sam White
03:42 PM Bug #1412 (Merged): AMPI collectives on COMM_SELF using derived datatypes are broken
Sam White
03:40 PM Bug #1390 (Merged): AMPI_Alltoall crashes for short messages
Patch link : https://charm.cs.illinois.edu/gerrit/#/c/2254/ Karthik Senthil
02:58 PM Feature #1428 (New): AMPI TLS privatization support for IBM POWER
AMPI lacks support for -tlsglobals on IBM POWER systems, which we will want to have in place before Summit and Sierra... Sam White
02:56 PM Feature #1427 (New): Virtualize handling of Fortran IO units
This is an issue for AMPI in SMP mode. Fortran IO is dangerous in that it can be used in a non-thread-safe manner. Fi... Sam White
02:55 PM Feature #1426 (New): AMPI F08 bindings
The MPI-3.0 standard added Fortran 2008 bindings. There have already been proposals to MPI-4.0 to deprecate the Fortr... Sam White
02:55 PM Feature #1425 (In Progress): Virtualization-aware AMPI collectives
AMPI collectives currently use Charm++ equivalents where possible, but AMPI's runtime doesn't optimize beyond that. W... Sam White
12:06 PM Bug #1424 (Merged): Improve performance of randomized message queueing
Randomized queue ordering is a great debugging trick to expose race conditions in program logic. However, its impleme... Phil Miller
 
