Activity

From 02/27/2017 to 03/28/2017

03/27/2017

10:51 AM Bug #1341 (Merged): AMPI fails to link on mpi-win64
Sam White
10:48 AM Bug #1474 (New): mpi-win-x86_64 fails in collidethread example
mpi-win-x86_64 has failed in this example the last 2 days (since AMPI was fixed on mpi-win):... Sam White

03/26/2017

09:16 PM Bug #1341 (Resolved): AMPI fails to link on mpi-win64
Matthias Diener

03/25/2017

01:17 PM Bug #1341 (Implemented): AMPI fails to link on mpi-win64
Matthias Diener
08:21 PM Bug #1341: AMPI fails to link on mpi-win64
Committed a fix: https://charm.cs.illinois.edu/gerrit/2339 Matthias Diener

03/23/2017

11:32 AM Feature #1070: Migrate lagging 'net' builds to 'netlrts'
Sam White wrote:
> net-linux-amd64 and net-linux-cell are the only net builds left without netlrts equivalents. Do w...
Jim Phillips
10:45 AM Feature #1070: Migrate lagging 'net' builds to 'netlrts'
net-linux-amd64 and net-linux-cell are the only net builds left without netlrts equivalents. Do we even want to have ... Sam White
10:44 AM Cleanup #1363: Remove/deprecate dead machine layers
Remaining builds we could potentially remove?... Sam White
10:41 AM Bug #1201: SMP builds segfault on NULL lock in tests/charm++/chkpt
That failure was happening inside CkNodeReductionMgr. That code has been removed entirely now, right? Sam White
10:34 AM Feature #1074 (Implemented): Migrate net-linux-ppc to netlrts
https://charm.cs.illinois.edu/gerrit/#/c/2336/ Sam White
10:29 AM Feature #1072 (Implemented): Migrate net-linux-arm7 to netlrts
https://charm.cs.illinois.edu/gerrit/#/c/2336/
Next we would like to start having autobuild test this...
Sam White
10:26 AM Feature #1187: Automatically delegate section work to CkMulticastMgr
This has been fixed, right? If so, please post links to the patches that fixed it and mark the issue merged. Sam White

03/22/2017

11:29 AM Bug #1473 (Implemented): verbs build hangs in tests/charm++/communication_overhead
The test is indeed broken; a commit disabling it has been pushed for now. Phil Miller
11:27 AM Bug #1397 (New): Document that array creation must occur on PE0
Indeed, this is intended behavior, though it does limit usage that was previously allowed. As you note, there's a fairly e... Phil Miller
10:47 AM Bug #1397: Document that array creation must occur on PE0
I finally got around to reproducing this problem with the attached minimal example. This mimics what I'm trying to do: ... Jozsef Bakosi

03/21/2017

02:45 PM Bug #664: charm++/communication_overhead test fails with randomized queues
So, there seems to be the possibility of a race between operationFinished to end one cycle and getting a message from... Phil Miller
01:10 PM Bug #1470: Investigate broken load balancers in mini-apps
Lulesh can be run with load balancing with a few minor changes like updating uses of CmiTrue and atomic. AtSync() cal... Kavitha Chandrasekar
12:52 PM Bug #1473: verbs build hangs in tests/charm++/communication_overhead
Observed what seems like a memory leak in operationFinished, in that the message it receives is sometimes not deleted... Phil Miller
12:28 PM Bug #1473: verbs build hangs in tests/charm++/communication_overhead
Tried the same test back on 6.7.1. It also hangs, but somewhat later - in the 1D array cases, instead of the group ca... Phil Miller
12:19 PM Bug #1473 (Implemented): verbs build hangs in tests/charm++/communication_overhead
... Phil Miller
10:24 AM Bug #1341: AMPI fails to link on mpi-win64
It looks like ampicc is not properly linking in AMPI's libraries/headers, and from a quick look at src/libs/ck-lib/am... Sam White

03/20/2017

11:41 AM Bug #1471: Parallel Prefix No Barrier Example in Charm Tutorial Hangs on MPI Layer
It works for me on beauty with that same build command (on 1 or multiple PEs)... If you can replicate it, where does ... Sam White

03/16/2017

05:23 PM Bug #1471 (New): Parallel Prefix No Barrier Example in Charm Tutorial Hangs on MPI Layer
As pointed out on the charm mailing list, the parallel prefix no barrier example hangs when run on the mpi layer. I w... Michael Robson
04:52 PM Bug #1470: Investigate broken load balancers in mini-apps
I think we agreed the AMR library could be removed from mainline charm entirely. If someone wants to do LB with AMR t... Sam White
04:39 PM Bug #1470: Investigate broken load balancers in mini-apps
From Kavitha:
For the amr/jacobi2d example, I get the same error as Debashis, for current charm branch. It seems l...
Michael Robson
04:37 PM Bug #1470 (New): Investigate broken load balancers in mini-apps
Excerpt from an external email sent by Debashis Ganguly:
I am able to run leanmd mini-app with 5 different load b...
Michael Robson
02:34 PM Cleanup #1203: AMPI forces builds to be serial for ROMIO
Another annoyance with our build of ROMIO is that if you build AMPI+ROMIO, then edit AMPI and do 'make AMPI' in the build... Sam White
02:15 PM Feature #1469 (Merged): Don't require migration constructors for all array objects at compile time
Phil Miller
02:14 PM Bug #484 (Merged): Topology aware spanning trees broken on XE6
Phil Miller

03/15/2017

04:41 PM Feature #1469: Don't require migration constructors for all array objects at compile time
BGQ XLC is happy enough with the fall-back option
BGQ GCC handles the new version cleanly
Manual updated to ref...
Phil Miller
04:27 PM Feature #1469: Don't require migration constructors for all array objects at compile time
BGQ XLC test *failed*. It will not accept the @#include <type_traits>@ as currently configured, because GCC's libstdc... Phil Miller
03:58 PM Feature #1469: Don't require migration constructors for all array objects at compile time
https://charm.cs.illinois.edu/gerrit/2313 Phil Miller
03:57 PM Feature #1469 (Merged): Don't require migration constructors for all array objects at compile time
Currently, every chare array element type is required to have a @CkMigrateMessage*@ constructor that would be used du... Phil Miller
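One way to picture the compile-time requirement being relaxed here is the sketch below. It is not the patch on Gerrit: `MigrateMessage`, `has_migration_ctor`, `make_for_migration`, and the two example element types are hypothetical names, and `MigrateMessage` only stands in for `CkMigrateMessage`. The sketch uses `std::is_constructible` from `<type_traits>` to detect whether a type offers a migration constructor, and defers the failure to run time when it does not.

```cpp
#include <type_traits>

// Hypothetical stand-in for CkMigrateMessage; not the Charm++ class.
struct MigrateMessage {};

// True when T can be constructed from a MigrateMessage*.
template <typename T>
struct has_migration_ctor : std::is_constructible<T, MigrateMessage*> {};

// Selected when T has a migration constructor: build the object with it.
template <typename T>
typename std::enable_if<has_migration_ctor<T>::value, T*>::type
make_for_migration(MigrateMessage* msg) {
  return new T(msg);
}

// Selected otherwise: compiles for any T and reports failure at run time.
// A real runtime would presumably abort with a diagnostic here (assumption).
template <typename T>
typename std::enable_if<!has_migration_ctor<T>::value, T*>::type
make_for_migration(MigrateMessage*) {
  return nullptr;
}

// Illustrative element types (hypothetical).
struct Migratable {
  explicit Migratable(MigrateMessage*) {}
};
struct NonMigratable {
  NonMigratable() {}
};
```

With this shape, `make_for_migration<NonMigratable>(nullptr)` compiles and returns a null pointer, so a type that is never migrated does not need the extra constructor just to build.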
02:17 PM Bug #1447 (Merged): AMPI_Reduce is broken for derived datatypes in user-defined reductions
Sam White
12:12 PM Bug #1341: AMPI fails to link on mpi-win64
Bump. With the following merged, these builds should be the only ones that are persistently failing on AMPI now: http... Sam White
12:10 PM Feature #1457 (Merged): Default option to choose between isomalloc and os-isomalloc
Sam White
11:25 AM Feature #1457 (Implemented): Default option to choose between isomalloc and os-isomalloc
Make charmc use os-isomalloc on Clang non-SMP, with a warning to the user: https://charm.cs.illinois.edu/gerrit/#/c/2... Sam White

03/14/2017

06:40 PM Cleanup #1265: Document LLVM OpenMP runtime integration
The existing @-openmp@ flag should backend to the integrated support when the runtime has been built with the @omp@ o... Phil Miller
03:55 PM Feature #1237 (In Progress): Onesided sender side implementation for GNI layer
Phil Miller
03:41 PM Feature #1237: Onesided sender side implementation for GNI layer
Since there are still several things to fix up here, and we want to get the beta out the door, we're deferring this to ... Phil Miller
03:55 PM Feature #1234 (Merged): Avoid sender-side copy for large contiguous messages. API for charm and c...
API and some machine layer implementations merged. Other machine layers are still pending, but we want to check the b... Phil Miller
03:39 PM Bug #1148 (Merged): Define 'thisIndex' for Groups
Phil Miller
07:41 PM Bug #1148 (In Progress): Define 'thisIndex' for Groups
Karthik Senthil
03:38 PM Bug #1430 (Merged): CthThread tracing broken - threads show up as black, dummy_thread_chare assig...
Phil Miller
01:48 PM Bug #1453 (Merged): Branching factor for group sections is not handled correctly.
Fix was merged over a week ago Phil Miller
12:36 PM Feature #1238 (Merged): Onesided sender side implementation for Verbs layer
Phil Miller
12:12 PM Bug #901 (In Progress): Threads awoken by CthAwaken don't let Projections trace back to the event...
Phil Miller

03/13/2017

11:48 AM Bug #1430 (Implemented): CthThread tracing broken - threads show up as black, dummy_thread_chare ...
https://charm.cs.illinois.edu/gerrit/#/c/2302/
Full revert for now, since the affected functionality is more critica...
Phil Miller
01:39 AM Feature #1458 (Implemented): Zero-copy send support for the MPI machine layer
Gerrit Link: https://charm.cs.illinois.edu/gerrit/#/c/2299/3 Nitin Bhat

03/11/2017

07:15 PM Bug #1148 (Implemented): Define 'thisIndex' for Groups
Patch on Gerrit: https://charm.cs.illinois.edu/gerrit/#/c/2298/ Karthik Senthil

03/10/2017

01:12 PM Feature #1468 (New): Enable pre-pinning memory for RDMA message sends
The cost of memory pinning on Verbs and GNI is high, so we'd like to enable users to pre-pin memory for use in later ... Sam White
11:25 AM Bug #1148: Define 'thisIndex' for Groups
The idea behind making it static was that we could save on memory overhead by having one instance of it. We can discu... Sam White
08:12 PM Bug #1148 (In Progress): Define 'thisIndex' for Groups
Why should @thisIndex@ be defined as a static variable?
I have a current implementation which adds @thisIndex@ as ...
Karthik Senthil
11:16 AM Feature #1467 (New): Avoid memory pinning overhead for RDMA sends within a process
If an RDMA message is being sent to another object in the same process, we already do a direct memcpy rather than an ... Sam White
11:12 AM Feature #969: AMPI support for collectives on inter-communicators
https://charm.cs.illinois.edu/gerrit/#/c/2084/7 contains a consolidated lump of
> - bcast, ibcast
> - barrier, i...
Phil Miller
11:08 AM Feature #969: AMPI support for collectives on inter-communicators
Gather/Igather in progress:
https://charm.cs.illinois.edu/gerrit/2183
Phil Miller
10:59 AM Feature #969: AMPI support for collectives on inter-communicators
Bcast/Ibcast
https://charm.cs.illinois.edu/gerrit/#/c/2127/9
Phil Miller
10:58 AM Feature #969: AMPI support for collectives on inter-communicators
Scatter/Iscatter:
https://charm.cs.illinois.edu/gerrit/#/c/2285/2
Phil Miller
10:56 AM Feature #969: AMPI support for collectives on inter-communicators
Turn enumeration into a checklist, to test out the plugin and track completion mechanically. Phil Miller

03/09/2017

02:49 PM Bug #1462: Programs hang at startup with CUDA build
There is a mempool initialization step that does a bunch of @cudaMallocHost@ calls (which I believe has nothing to do... Jaemin Choi
09:20 PM Bug #1462: Programs hang at startup with CUDA build
You noticed this line, right?... Jim Phillips
11:49 AM Bug #921: Entry tag [inline] is unable to optimize away most of the overhead
Comment 18's definition of the rvalue conversion operator was missing @t = space;@ in the non-owned case Phil Miller
09:20 AM Feature #1466 (New): Update list of available load balancing strategies in the manual
The LB section of the Charm++ manual has a list of available load balancing strategies, which is missing some.
In pa...
Sam White
12:57 AM Feature #1465 (Implemented): Spanning Tree implementation for scatterv
Patch: https://charm.cs.illinois.edu/gerrit/#/c/2296/ Vipul Harsh
09:30 PM Feature #1465 (Implemented): Spanning Tree implementation for scatterv
Use a spanning tree implementation for optimised scatterv Vipul Harsh

03/08/2017

04:36 PM Support #1079: Remove deprecated machine layers and retired machines from Autobuild
Here are some machine layers that are still around that I think we might be able to remove completely:... Sam White
04:21 PM Bug #1464: CUDA example programs hang when run with 1 PE
Thought it might be because of the handler functions indices, so wrapped and moved out @CmiRegisterHandler()@ calls t... Jaemin Choi
12:34 AM Bug #1464 (In Progress): CUDA example programs hang when run with 1 PE
CUDA example programs (@overlapTest@, @concurrentKernels@, @callbacks@, etc.) hang when they are run with only 1 PE.
...
Jaemin Choi
03:49 PM Bug #815: Makefile for hybrid API is not using the system OPTS
Reverted for now, working on a long-term solution Michael Robson
03:20 PM Bug #1462 (In Progress): Programs hang at startup with CUDA build
The exact same code (@examples/charm++/hello/1darray@) with 4 PEs crashes on @nano6@ but runs fine on @nano7@.
GPUMana...
Jaemin Choi
08:21 AM Bug #1462: Programs hang at startup with CUDA build
There is no way cudaStreamCreate should hang due to GPU load so this would be good to track down. All I can think is... Jim Phillips
12:04 AM Bug #1462 (Closed): Programs hang at startup with CUDA build
This was due to the GPU being heavily used by other processes.
@nvidia-smi -q@ shows the current usage.
When tested...
Jaemin Choi
06:19 PM Bug #1462: Programs hang at startup with CUDA build
The culprit is @cudaStreamCreate()@ in @GPUManager::initHybridAPIHelper()@. No idea why this is the cause yet, becaus... Jaemin Choi

03/07/2017

05:45 PM Bug #1462 (In Progress): Programs hang at startup with CUDA build
Problem seems to be in @initHybridAPI()@ in @ck-core/init.C@, because programs run fine if I comment this out along ... Jaemin Choi
05:20 PM Bug #1462 (In Progress): Programs hang at startup with CUDA build
When using the CUDA build of Charm++, example programs located under @examples/charm++/cuda@ hang at startup.
...
Jaemin Choi
04:06 PM Bug #1440: smp pes sending messages still block due to other send activity
Re-assigning it to Karthik since he's going to do the projections/performance tests. Bilge Acun

03/06/2017

05:35 PM Feature #1303 (In Progress): Implement MPI-R debugging hooks to support Allinea DDT and Rogue Wav...
Phil Miller
01:15 PM Bug #1410: Investigate completeness of Tuple reduction support
Fix for tuple/set reductions over node groups: https://charm.cs.illinois.edu/gerrit/#/c/2291/ Sam White
12:01 PM Feature #1235 (Merged): Onesided sender side implementation for PAMI layer
Phil Miller
11:53 AM Feature #1459 (New): Zero-copy send support for the netlrts machine layer
In the netlrts machine layer, it's pretty easy to stream data from an arbitrary address to a remote recipient on requ... Phil Miller
11:51 AM Feature #1458 (Implemented): Zero-copy send support for the MPI machine layer
As Jim noted, it's straightforward for MPI to do in-place sends, and newer versions actually allow RMA, so we should ... Phil Miller
08:25 AM Bug #1445 (Merged): mpi-crayxc failure in megampi
Fixed by making context threads the default on mpi-crayxc: https://charm.cs.illinois.edu/gerrit/#/c/2282/ Sam White

03/05/2017

01:06 PM Cleanup #1265: Document LLVM OpenMP runtime integration
bump Sam White
10:07 AM Bug #1447 (Implemented): AMPI_Reduce is broken for derived datatypes in user-defined reductions
Oops, not yet Sam White
10:06 AM Bug #1447 (Merged): AMPI_Reduce is broken for derived datatypes in user-defined reductions
Sam White
08:27 PM Bug #1447 (Implemented): AMPI_Reduce is broken for derived datatypes in user-defined reductions
Implemented user-defined reductions on non-contiguous derived datatypes using the non-commutative reduction code path... Sam White
10:06 AM Feature #1457 (Merged): Default option to choose between isomalloc and os-isomalloc
-memory isomalloc does not work on Clang non-SMP builds. -memory os-isomalloc does not work on SMP builds, so add a n... Sam White

03/04/2017

06:07 PM Support #1456 (Feedback): Add more stream callbacks for use after HToD transfer and kernel execution
Patch uploaded to gerrit for review.
[[https://charm.cs.illinois.edu/gerrit/#/c/2288/]]
Jaemin Choi

03/03/2017

04:25 PM Bug #1453 (Implemented): Branching factor for group sections is not handled correctly.
https://charm.cs.illinois.edu/gerrit/#/c/2287 Vipul Harsh
06:37 PM Bug #1453 (Merged): Branching factor for group sections is not handled correctly.
Vipul Harsh
02:34 PM Support #1454: GPUManager API change
Buffer ID (-1) should be last param and set to -1 by default
Also, is there a way to mark copy both ways?
ints ...
Michael Robson
12:48 PM Support #1454 (Feedback): GPUManager API change
Change pushed to gerrit for review.
[[https://charm.cs.illinois.edu/gerrit/#/c/2283/]]
Jaemin Choi
12:06 PM Support #1454 (Feedback): GPUManager API change
Making changes to current GPUManager API to provide a more uniform & segregated API (function calls now start with ha... Jaemin Choi
01:30 PM Support #1456 (Feedback): Add more stream callbacks for use after HToD transfer and kernel execution
The user may have use for callbacks that occur after completion of host-to-device data transfer and kernel execution,... Jaemin Choi
12:54 PM Bug #1424 (Merged): Improve performance of randomized message queueing
Phil Miller
11:38 AM Bug #1288 (Rejected): net-cygwin: megacon crashes when blkinhand and posixth are enabled
net-cygwin is deprecated Sam White
11:38 AM Bug #1287 (Rejected): net-cygwin crashes in megatest/{queens,tempotest,synctest} under --with-pro...
net-cygwin is deprecated Sam White
11:37 AM Bug #1291 (Rejected): net-cygwin crashes in examples/bigsim/sdag
net-cygwin is deprecated Sam White
11:37 AM Bug #1290 (Rejected): net-cygwin crashes in examples/bigsim/emulator
net-cygwin is deprecated Sam White
09:46 AM Feature #1234: Avoid sender-side copy for large contiguous messages. API for charm and converse l...
API patch merged:
https://charm.cs.illinois.edu/gerrit/1273
As expected, no breakage in autobuild resulted.
Phil Miller
06:18 PM Bug #1452: verbs-linux-ppc64le xlC
Do any PPLers have access to summit-dev? If so assign to one of them. Sam White

03/02/2017

05:20 PM Bug #1452 (New): verbs-linux-ppc64le xlC
Need to support verbs (vs net-ibverbs) with xlC on Summit-dev.
Hard to tell people to try verbs because net-ibverbs ...
Jim Phillips
01:40 PM Bug #1447: AMPI_Reduce is broken for derived datatypes in user-defined reductions
Abort: https://charm.cs.illinois.edu/gerrit/#/c/2280/
I can probably get to the 'gather' implementation tomorrow o...
Sam White
01:31 PM Bug #1447: AMPI_Reduce is broken for derived datatypes in user-defined reductions
As noted in meeting, there are two potential quick fixes to at least get codes hitting this case running or failing c... Phil Miller
01:26 PM Bug #1329 (Merged): Hang in exit in TRAM test on gni-crayxc-smp
Karthik Senthil

03/01/2017

03:18 PM Feature #109: Test and merge section ID and manager work
What got merged as a result of this? Is it actually moot due to other changes? Phil Miller
10:29 AM Bug #1430: CthThread tracing broken - threads show up as black, dummy_thread_chare assigned to fo...
Tested 2269, threads trace back to where they were created rather than where they were awakened. Jim Phillips
06:46 PM Feature #1451 (Implemented): NVTX integration for profiling
Uploaded to gerrit for review.
[[https://charm.cs.illinois.edu/gerrit/#/c/2276/]]
Jaemin Choi
06:03 PM Feature #1451 (Implemented): NVTX integration for profiling
Integrating NVIDIA Tools Extension Library (NVTX) in GPUManager for profiling CUDA behavior. Jaemin Choi
06:20 PM Bug #1341: AMPI fails to link on mpi-win64
Can you look at this this week, and if you can't fix it easily, then implement a workaround so that we don't test A... Sam White
06:07 PM Feature #1393: Redesign of GPUManager to utilize concurrent kernel execution and stream callbacks
Previous gerrit commit split into multiple smaller ones.
New gerrit commit: [[https://charm.cs.illinois.edu/gerrit/#...
Jaemin Choi
06:06 PM Support #1450 (Resolved): Refactor CUDA example programs to fit new GPUManager design
On gerrit for review.
[[https://charm.cs.illinois.edu/gerrit/#/c/2275/]]
Jaemin Choi

02/28/2017

02:29 PM Support #1450 (Resolved): Refactor CUDA example programs to fit new GPUManager design
CUDA example programs under @examples/charm++/cuda@ need to be refactored with the new design of GPUManager.
Especia...
Jaemin Choi
01:31 PM Bug #1445: mpi-crayxc failure in megampi
The stack trace looks like a failure during a context switch but is pretty opaque. What is weird is that the other AM... Sam White
01:26 PM Bug #1445: mpi-crayxc failure in megampi
The following is a trace of the crash on Edison:... Karthik Senthil
01:28 PM Feature #1449 (New): AMPI support for MPI_Win_allocate_shared
We already have an application of interest that uses this. Supporting its basic functionality is trivial: just alloca... Sam White

02/27/2017

04:28 PM Bug #1448 (New): Potential buffer overflow in mylogin()
There is a potential buffer overflow in mylogin() when the username is long ... Matthias Diener
03:22 PM Bug #1430 (In Progress): CthThread tracing broken - threads show up as black, dummy_thread_chare ...
https://charm.cs.illinois.edu/gerrit/2269 should clear things up for the non-[threaded] entry method case. I'll still... Phil Miller
03:11 PM Bug #1430: CthThread tracing broken - threads show up as black, dummy_thread_chare assigned to fo...
Actually, while potentially nice, I realize that solution won't be acceptable for AMPI, because apparently thread lis... Phil Miller
03:08 PM Bug #1430: CthThread tracing broken - threads show up as black, dummy_thread_chare assigned to fo...
Jim, could you try things out with mainline charm (without that change reverted) with added calls to traceAddThreadLi... Phil Miller
11:21 AM Bug #1447 (Merged): AMPI_Reduce is broken for derived datatypes in user-defined reductions
Users are allowed to use derived datatypes in custom reduction functions, but AMPI does not support that and might si... Sam White
 
