Activity

From 03/06/2017 to 04/04/2017

04/04/2017

05:13 PM Feature #1480 (Merged): API to control whether a PE helps other threads that generate CkLoop/Open...
Jim brought up an issue that he doesn't want the PEs tasked with highly-critical PME work to participate in helping o... Phil Miller
05:04 PM Bug #677 (Closed): MPI Wrappers on BG/Q supersede include path's from command line, breaking AMPI...
I asked ANL staff about this way back when, and they opened a PMR with IBM. The outcome seems to be that the DEPRECAT... Phil Miller
04:01 PM Bug #1397: Document that array creation must occur on PE0
Here's the new API that can be used in the case of creation off PE 0: https://charm.cs.illinois.edu/gerrit/#/c/736/ Sam White
03:02 PM Bug #1397: Document that array creation must occur on PE0
Setting priority higher because these are easy-to-fix things that act as paper cuts to users Phil Miller
03:58 PM Bug #1197 (Rejected): Charmrun fails to connect to nodes on Taub in netlrts build.
Since this issue occurs specifically when loading the gcc/4.9 module, the environment variables need to be passed to th... Kavitha Chandrasekar
03:53 PM Bug #833: mpi smp build is locked to one core per node by default
Since we won't have hwloc integration for the release, I think it would be good to get this into 6.8.0 if it is ready? Sam White
03:52 PM Bug #1035: Idle PEs compete with comm thread for node queue lock
We are not planning on merging the lockless queue before the 6.8.0 release, since it is high risk this close to the r... Sam White
03:50 PM Bug #1002 (Merged): Changa's Final CkWaitQD() hangs after AtSync deletion-counting changes
Closing unless there are any objections... Sam White
03:43 PM Bug #1452: verbs-linux-ppc64le xlC
This should be quick/easy Sam White
03:43 PM Bug #522: static linking breaks on multicore builds with 'undefined reference to `get_myaddress''
It would seem that the thing to do about this would be to statically link everything with the exception of @libc@. Th... Phil Miller
03:43 PM Bug #999 (Merged): netlrts writeableDgrams is never reset to 0
The part of the above change that wasn't directly addressing this bug got reverted, but the fix for this is solidly i... Phil Miller
03:42 PM Feature #693 (New): add CcdCallBacksReset() to header file (or improve callback frequency)
Michael Robson
03:41 PM Feature #693 (In Progress): add CcdCallBacksReset() to header file (or improve callback frequency)
Michael Robson
03:41 PM Bug #159: Some CkCallback types are not valid across checkpoint/restart
Not seeming to affect any current applications, so deferring. Phil Miller
03:09 PM Documentation #994 (Implemented): The Projections interface function 'traceUserSuppliedData()' is...
https://charm.cs.illinois.edu/gerrit/#/c/2354/ Ronak Buch
03:02 PM Documentation #994: The Projections interface function 'traceUserSuppliedData()' is undocumented ...
Setting priority higher because these are easy-to-fix things that act as paper cuts to users Phil Miller
03:04 PM Bug #1470: Investigate broken load balancers in mini-apps
Since the mini-apps would work with minimal changes, should we follow up on the email with suggestions? Kavitha Chandrasekar
03:02 PM Bug #1475: Define equality operators for proxies
Setting priority higher because these are easy-to-fix things that act as paper cuts to users Phil Miller
03:02 PM Bug #1441: Lies at startup about "The comm. thread both sends and receives messages"
Setting priority higher because these are easy-to-fix things that act as paper cuts to users Phil Miller
03:02 PM Bug #1408: Improve visibility and usability of flushTraceLog()
Setting priority higher because these are easy-to-fix things that act as paper cuts to users Phil Miller
03:02 PM Documentation #1225: Document TRAM [aggregate] entry method attribute
Setting priority higher because these are easy-to-fix things that act as paper cuts to users Phil Miller
02:46 PM Bug #1479 (In Progress): Charm++ Fails to Compile on Arch Linux
https://charm.cs.illinois.edu/gerrit/2352
The @cpp@ part of this issue is addressed in the patch linked.
Phil Miller
01:13 PM Bug #1479: Charm++ Fails to Compile on Arch Linux
Hi Nils,
Thanks for your report. Could you say what shell Antergos uses as /bin/sh by default, or what you have in...
Phil Miller
01:35 PM Bug #1162: tracing runs segfault while writing logs
I thought I had added a log here before, but I guess not. I had tried to replicate this when the bug was originally ... Ronak Buch
01:10 PM Bug #1162: tracing runs segfault while writing logs
Has there been any investigation or follow-up on this? This could be pretty crippling for large-scale performance work. Phil Miller
01:25 PM Cleanup #473: Licensing of library code in Data Transfer library
www.magic-software.com is no longer a valid URL. Wayback machine doesn't have a copy due to robots.txt.
Magic So...
Eric Bohm
01:17 PM Bug #1392 (Resolved): Stampede test script fails during autobuild (verbs)
This would seem to have been dealt with, given that autobuild runs on Stampede pass. Please close if that's correct, ... Phil Miller
12:42 PM Feature #1039: reject pemap/commap with duplicate or too few cpus
Validation of the old command lines should be reconsidered when the binding substrate changes to hwloc. Eric Bohm
12:34 PM Feature #1386: ckDestroy for Groups and NodeGroups
Retargetting this to a later version until someone makes a case for it being urgently required in 6.8.0. Eric Bohm

04/03/2017

06:31 PM Bug #965: ampicc -swapglobals is broken for ld v2.24+
Hacked ld in repo
@git clone charmgit:users/phil/binutils-gdb -b phil/ampi-swapglobals-hack@
Phil Miller
06:20 PM Bug #965 (In Progress): ampicc -swapglobals is broken for ld v2.24+
I have confirmed that with a modified ld to disable this optimization/conversion, and calling it by setting its path ... Phil Miller
05:30 PM Bug #965: ampicc -swapglobals is broken for ld v2.24+
The modified code to apply this conversion in modern ld is unconditional. Local references always get smashed. If we ... Phil Miller
01:39 PM Bug #965: ampicc -swapglobals is broken for ld v2.24+
Moving to PIE compilation/linking on the newer LD doesn't fix things. It still sees a 'local' reference to a global v... Phil Miller
01:09 PM Bug #965: ampicc -swapglobals is broken for ld v2.24+
And indeed, when compiled on Might, swapglobals appears to operate correctly. Phil Miller
01:06 PM Bug #965: ampicc -swapglobals is broken for ld v2.24+
Here's the patch that implemented that optimization in GNU LD:
https://sourceware.org/ml/binutils/2012-08/msg00498.h...
Phil Miller
12:18 PM Bug #965: ampicc -swapglobals is broken for ld v2.24+
OK, so first issue on this is that the relocations in question are actually being optimized out at link time, so they... Phil Miller
11:22 AM Bug #965: ampicc -swapglobals is broken for ld v2.24+
I just tried out tests/ampi/jacobi3d on Courage, Ubuntu 14.04 with GCC 4.8.4 and ld 2.24, and it doesn't output the e... Phil Miller

04/02/2017

08:46 PM Bug #1479 (Merged): Charm++ Fails to Compile on Arch Linux
The default configuration of charm++ (at least v6.7.1) fails to build on the Arch Linux distro Antergos (I haven't tr... Nils Deppe

03/31/2017

04:22 PM Feature #1478 (Closed): Investigate use of pxshm in CmiAlloc
Currently, when the runtime is built with pxshm support, we use an extra copy into a posix shared memory buffer when ... Sam White
11:20 AM Bug #1477 (New): All Load Balancing Strategies should be CPU frequency (rate) aware
Under an assumption of full or near-full CPU load when applications and the RTS are running well, we've found that Tu... Phil Miller

03/30/2017

04:55 PM Cleanup #1476 (New): Fix Make.depends for libraries
Make.depends for libraries no longer works. It appears that when the main Make.depends was updated to use the correct... Eric Mikida
02:53 PM Bug #1475 (Merged): Define equality operators for proxies
We can overload the equality operator for proxies. This stems from John Mitchell's email to the charm mailing list. Sam White
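A minimal sketch of what overloading equality for a proxy-like type could look like. @DemoProxy@ and its @collectionId@ field are hypothetical stand-ins invented for illustration; they are not the real Charm++ proxy classes, and the merged patch may compare different state.

```cpp
#include <cassert>

// Hypothetical stand-in for a proxy: a lightweight handle that names a
// collection by an integer ID. Not the actual Charm++ proxy layout.
struct DemoProxy {
    int collectionId;
};

// Two handles are equal when they refer to the same collection.
inline bool operator==(const DemoProxy& a, const DemoProxy& b) {
    return a.collectionId == b.collectionId;
}

// Define != in terms of == so the two can never disagree.
inline bool operator!=(const DemoProxy& a, const DemoProxy& b) {
    return !(a == b);
}

// Small self-check showing the intended usage.
inline void demoProxyComparison() {
    DemoProxy first{7}, alias{7}, other{9};
    assert(first == alias);
    assert(first != other);
}
```

Defining the operators as free functions keeps both operands symmetric, which is the usual choice for value-like handle types.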

03/28/2017

12:45 PM Bug #1474: mpi-win-x86_64 fails in collidethread example
So, in other words, we should compile/link/use *only* AMPI, not the system MPI, in the @charmc -language ampi@ case? Matthias Diener
12:40 PM Bug #1474: mpi-win-x86_64 fails in collidethread example
Note that even when removing the fix for AMPI on mpi-win (Bug #1341), the crash is still the same. Matthias Diener
12:11 PM Bug #1474: mpi-win-x86_64 fails in collidethread example
The problem may be with the fact that this is using charmc's '-language ampi' option directly instead of using 'ampic... Sam White
12:04 PM Bug #1474: mpi-win-x86_64 fails in collidethread example
Crashes with a segmentation fault even when run sequentially before starting main():... Matthias Diener
09:07 AM Bug #1474: mpi-win-x86_64 fails in collidethread example
mpi-win-x86_64-smp also fails in this same test Sam White

03/27/2017

10:51 AM Bug #1341 (Merged): AMPI fails to link on mpi-win64
Sam White
10:48 AM Bug #1474 (Merged): mpi-win-x86_64 fails in collidethread example
mpi-win-x86_64 has failed in this example the last 2 days (since AMPI was fixed on mpi-win):... Sam White

03/26/2017

09:16 PM Bug #1341 (Resolved): AMPI fails to link on mpi-win64
Matthias Diener

03/25/2017

01:17 PM Bug #1341 (Implemented): AMPI fails to link on mpi-win64
Matthias Diener
08:21 PM Bug #1341: AMPI fails to link on mpi-win64
Committed a fix: https://charm.cs.illinois.edu/gerrit/2339 Matthias Diener

03/23/2017

11:32 AM Feature #1070: Migrate lagging 'net' builds to 'netlrts'
Sam White wrote:
> net-linux-amd64 and net-linux-cell are the only net builds left without netlrts equivalents. Do w...
Jim Phillips
10:45 AM Feature #1070: Migrate lagging 'net' builds to 'netlrts'
net-linux-amd64 and net-linux-cell are the only net builds left without netlrts equivalents. Do we even want to have ... Sam White
10:44 AM Cleanup #1363: Remove/deprecate dead machine layers
Remaining builds we could potentially remove?... Sam White
10:41 AM Bug #1201: SMP builds segfault on NULL lock in tests/charm++/chkpt
That failure was happening inside CkNodeReductionMgr; that code has been removed entirely now, right? Sam White
10:34 AM Feature #1074 (Implemented): Migrate net-linux-ppc to netlrts
https://charm.cs.illinois.edu/gerrit/#/c/2336/ Sam White
10:29 AM Feature #1072 (Implemented): Migrate net-linux-arm7 to netlrts
https://charm.cs.illinois.edu/gerrit/#/c/2336/
Next we would like to start having autobuild test this...
Sam White
10:26 AM Feature #1187: Automatic delegation of section work to CkMulticastMgr
This has been fixed, right? If so, please post links to the patches that fixed it and mark the issue merged. Sam White

03/22/2017

11:29 AM Bug #1473 (Implemented): verbs build hangs in tests/charm++/communication_overhead
Test is indeed broken, disablement commit pushed for now. Phil Miller
11:27 AM Bug #1397 (New): Document that array creation must occur on PE0
Indeed, this is intended behavior, that does limit usage that was previously allowed. As you note, there's a fairly e... Phil Miller
10:47 AM Bug #1397: Document that array creation must occur on PE0
I finally got around to reproducing this problem with the attached minimal example. This mimics what I'm trying to do: ... Jozsef Bakosi

03/21/2017

02:45 PM Bug #664: charm++/communication_overhead test fails with randomized queues
So, there seems to be the possibility of a race between operationFinished to end one cycle and getting a message from... Phil Miller
01:10 PM Bug #1470: Investigate broken load balancers in mini-apps
Lulesh can be run with load balancing with a few minor changes like updating uses of CmiTrue and atomic. AtSync() cal... Kavitha Chandrasekar
12:52 PM Bug #1473: verbs build hangs in tests/charm++/communication_overhead
Observed what seems like a memory leak in operationFinished, in that the message it receives is sometimes not deleted... Phil Miller
12:28 PM Bug #1473: verbs build hangs in tests/charm++/communication_overhead
Tried the same test back on 6.7.1. It also hangs, but somewhat later - in the 1D array cases, instead of the group ca... Phil Miller
12:19 PM Bug #1473 (Implemented): verbs build hangs in tests/charm++/communication_overhead
... Phil Miller
10:24 AM Bug #1341: AMPI fails to link on mpi-win64
It looks like ampicc is not properly linking in AMPI's libraries/headers, and from a quick look at src/libs/ck-lib/am... Sam White

03/20/2017

11:41 AM Bug #1471: Parallel Prefix No Barrier Example in Charm Tutorial Hangs on MPI Layer
It works for me on beauty with that same build command (on 1 or multiple PEs)... If you can replicate it, where does ... Sam White

03/16/2017

05:23 PM Bug #1471 (New): Parallel Prefix No Barrier Example in Charm Tutorial Hangs on MPI Layer
As pointed out on the charm mailing list, the parallel prefix no barrier example hangs when run on the mpi layer. I w... Michael Robson
04:52 PM Bug #1470: Investigate broken load balancers in mini-apps
I think we agreed the AMR library could be removed from mainline charm entirely. If someone wants to do LB with AMR t... Sam White
04:39 PM Bug #1470: Investigate broken load balancers in mini-apps
From Kavitha:
For the amr/jacobi2d example, I get the same error as Debashis, for current charm branch. It seems l...
Michael Robson
04:37 PM Bug #1470 (Closed): Investigate broken load balancers in mini-apps
Excerpt from an external email sent by Debashis Ganguly:
I am able to run leanmd mini-app with 5 different load b...
Michael Robson
02:34 PM Cleanup #1203: AMPI forces builds to be serial for ROMIO
Another annoyance with our build of ROMIO is that if you build AMPI+ROMIO, then edit AMPI and do 'make AMPI' in the build... Sam White
02:15 PM Feature #1469 (Merged): Don't require migration constructors for all array objects at compile time
Phil Miller
02:14 PM Bug #484 (Merged): Topology aware spanning trees broken on XE6
Phil Miller

03/15/2017

04:41 PM Feature #1469: Don't require migration constructors for all array objects at compile time
BGQ XLC is happy enough with the fall-back option
BGQ GCC handles the new version cleanly
Manual updated to ref...
Phil Miller
04:27 PM Feature #1469: Don't require migration constructors for all array objects at compile time
BGQ XLC test *failed*. It will not accept the @#include <type_traits>@ as currently configured, because GCC's libstdc... Phil Miller
03:58 PM Feature #1469: Don't require migration constructors for all array objects at compile time
https://charm.cs.illinois.edu/gerrit/2313 Phil Miller
03:57 PM Feature #1469 (Merged): Don't require migration constructors for all array objects at compile time
Currently, every chare array element type is required to have a @CkMigrateMessage*@ constructor that would be used du... Phil Miller
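A rough sketch of one way such a compile-time requirement can be relaxed: detect with @<type_traits>@ whether a type offers the constructor, and fall back otherwise. This is an assumption about the general technique, not the code in the linked patch. @MigrateMsg@, @Migratable@, @Plain@, and @makeForMigration@ are hypothetical names; @MigrateMsg@ merely stands in for @CkMigrateMessage@.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct MigrateMsg {};  // hypothetical stand-in for CkMigrateMessage

// A type that provides a migration constructor.
struct Migratable {
    bool fromMigration = false;
    Migratable() = default;
    explicit Migratable(MigrateMsg*) : fromMigration(true) {}
};

// A type that does not provide one.
struct Plain {
    Plain() = default;
};

// Chosen when T(MigrateMsg*) is a valid construction.
template <typename T>
typename std::enable_if<std::is_constructible<T, MigrateMsg*>::value,
                        std::unique_ptr<T>>::type
makeForMigration(MigrateMsg* m) {
    return std::unique_ptr<T>(new T(m));
}

// Fallback when T has no such constructor: this still compiles, and the
// caller sees a null pointer at run time instead of a build failure.
template <typename T>
typename std::enable_if<!std::is_constructible<T, MigrateMsg*>::value,
                        std::unique_ptr<T>>::type
makeForMigration(MigrateMsg*) {
    return std::unique_ptr<T>();
}

// Small self-check showing both paths.
inline void demoMigrationDetection() {
    MigrateMsg msg;
    assert(makeForMigration<Migratable>(&msg)->fromMigration);
    assert(!makeForMigration<Plain>(&msg));
}
```

The sketch sticks to C++11 @enable_if@ overloads so that it does not depend on newer language features.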
02:17 PM Bug #1447 (Merged): AMPI_Reduce is broken for derived datatypes in user-defined reductions
Sam White
12:12 PM Bug #1341: AMPI fails to link on mpi-win64
Bump. With the following merged, these builds should be the only ones that are persistently failing on AMPI now: http... Sam White
12:10 PM Feature #1457 (Merged): Default option to choose between isomalloc and os-isomalloc
Sam White
11:25 AM Feature #1457 (Implemented): Default option to choose between isomalloc and os-isomalloc
Make charmc use os-isomalloc on Clang non-SMP, with a warning to the user: https://charm.cs.illinois.edu/gerrit/#/c/2... Sam White

03/14/2017

06:40 PM Documentation #1265: Document LLVM OpenMP runtime integration
The existing @-openmp@ flag should backend to the integrated support when the runtime has been built with the @omp@ o... Phil Miller
03:55 PM Feature #1237 (In Progress): Onesided sender side implementation for GNI layer
Phil Miller
03:41 PM Feature #1237: Onesided sender side implementation for GNI layer
Since there are still several things to fix up here, and we want to get the beta out the door, we're deferring this to ... Phil Miller
03:55 PM Feature #1234 (Merged): Avoid sender-side copy for large contiguous messages. API for charm and c...
API and some machine layer implementations merged. Other machine layers are still pending, but we want to check the b... Phil Miller
03:39 PM Bug #1148 (Merged): Define 'thisIndex' for Groups
Phil Miller
07:41 PM Bug #1148 (In Progress): Define 'thisIndex' for Groups
Karthik Senthil
03:38 PM Bug #1430 (Merged): CthThread tracing broken - threads show up as black, dummy_thread_chare assig...
Phil Miller
01:48 PM Bug #1453 (Merged): Branching factor for group sections is not handled correctly.
Fix was merged over a week ago Phil Miller
12:36 PM Feature #1238 (Merged): Onesided sender side implementation for Verbs layer
Phil Miller
12:12 PM Bug #901 (In Progress): Threads awoken by CthAwaken don't let Projections trace back to the event...
Phil Miller

03/13/2017

11:48 AM Bug #1430 (Implemented): CthThread tracing broken - threads show up as black, dummy_thread_chare ...
https://charm.cs.illinois.edu/gerrit/#/c/2302/
Full revert for now, since the affected functionality is more critica...
Phil Miller
01:39 AM Feature #1458 (Implemented): Zero-copy send support for the MPI machine layer
Gerrit Link: https://charm.cs.illinois.edu/gerrit/#/c/2299/3 Nitin Bhat

03/11/2017

07:15 PM Bug #1148 (Implemented): Define 'thisIndex' for Groups
Patch on Gerrit: https://charm.cs.illinois.edu/gerrit/#/c/2298/ Karthik Senthil

03/10/2017

01:12 PM Feature #1468 (New): Enable pre-pinning memory for the zero-copy message sends through the Entry ...
The cost of memory pinning on Verbs and GNI is high, so we'd like to enable users to pre-pin memory for use in later ... Sam White
11:25 AM Bug #1148: Define 'thisIndex' for Groups
The idea behind making it static was that we could save on memory overhead by having one instance of it. We can discu... Sam White
08:12 PM Bug #1148 (In Progress): Define 'thisIndex' for Groups
Why should @thisIndex@ be defined as a static variable?
I have a current implementation which adds @thisIndex@ as ...
Karthik Senthil
11:16 AM Feature #1467 (Rejected): Avoid memory pinning overhead for RDMA sends within a process
If an RDMA message is being sent to another object in the same process, we already do a direct memcpy rather than an ... Sam White
11:12 AM Feature #969: AMPI support for collectives on inter-communicators
https://charm.cs.illinois.edu/gerrit/#/c/2084/7 contains a consolidated lump of
> - bcast, ibcast
> - barrier, i...
Phil Miller
11:08 AM Feature #969: AMPI support for collectives on inter-communicators
Gather/Igather in progress:
https://charm.cs.illinois.edu/gerrit/2183
Phil Miller
10:59 AM Feature #969: AMPI support for collectives on inter-communicators
Bcast/Ibcast
https://charm.cs.illinois.edu/gerrit/#/c/2127/9
Phil Miller
10:58 AM Feature #969: AMPI support for collectives on inter-communicators
Scatter/Iscatter:
https://charm.cs.illinois.edu/gerrit/#/c/2285/2
Phil Miller
10:56 AM Feature #969: AMPI support for collectives on inter-communicators
Turn enumeration into a checklist, to test out the plugin and track completion mechanically. Phil Miller

03/09/2017

02:49 PM Bug #1462: Programs hang at startup with CUDA build
There is a mempool initialization step that does a bunch of @cudaMallocHost@ calls (which I believe has nothing to do... Jaemin Choi
09:20 PM Bug #1462: Programs hang at startup with CUDA build
You noticed this line, right?... Jim Phillips
11:49 AM Feature #921: Entry tag [inline] is unable to optimize away most of the overhead
Comment 18's definition of the rvalue conversion operator was missing @t = space;@ in the non-owned case Phil Miller
09:20 AM Documentation #1466 (Merged): Update list of available load balancing strategies in the manual
The LB section of the Charm++ manual has a list of available load balancing strategies, which is missing some.
In pa...
Sam White
12:57 AM Feature #1465 (Implemented): Spanning Tree implementation for scatterv
Patch: https://charm.cs.illinois.edu/gerrit/#/c/2296/ Vipul Harsh
09:30 PM Feature #1465 (Implemented): Spanning Tree implementation for scatterv
Use a spanning tree implementation for optimised scatterv Vipul Harsh

03/08/2017

04:36 PM Support #1079: Remove deprecated machine layers and retired machines from Autobuild
Here are some machine layers that are still around that I think we might be able to remove completely:... Sam White
04:21 PM Bug #1464: CUDA example programs hang when run with 1 PE
Thought it might be because of the handler function indices, so wrapped and moved out @CmiRegisterHandler()@ calls t... Jaemin Choi
12:34 AM Bug #1464 (In Progress): CUDA example programs hang when run with 1 PE
CUDA example programs (@overlapTest@, @concurrentKernels@, @callbacks@, etc.) hang when they are run with only 1 PE.
...
Jaemin Choi
03:49 PM Bug #815: Makefile for hybrid API is not using the system OPTS
Reverted for now, working on a long-term solution Michael Robson
03:20 PM Bug #1462 (In Progress): Programs hang at startup with CUDA build
The exact same code (@examples/charm++/hello/1darray@) with 4 PEs crashes on @nano6@ but runs fine on @nano7@.
GPUMana...
Jaemin Choi
08:21 AM Bug #1462: Programs hang at startup with CUDA build
There is no way cudaStreamCreate should hang due to GPU load, so this would be good to track down. All I can think is... Jim Phillips
12:04 AM Bug #1462 (Closed): Programs hang at startup with CUDA build
This was due to the GPU being heavily used by other processes.
@nvidia-smi -q@ shows the current usage.
When tested...
Jaemin Choi
06:19 PM Bug #1462: Programs hang at startup with CUDA build
The culprit is @cudaStreamCreate()@ in @GPUManager::initHybridAPIHelper()@. No idea why this is the cause yet, becaus... Jaemin Choi

03/07/2017

05:45 PM Bug #1462 (In Progress): Programs hang at startup with CUDA build
Problem seems to be in @initHybridAPI()@ in @ck-core/init.C@, because programs run fine if I comment this out along ... Jaemin Choi
05:20 PM Bug #1462 (In Progress): Programs hang at startup with CUDA build
When using the CUDA build of Charm++, example programs located both under @examples/charm++/cuda@ hang at startup.
...
Jaemin Choi
04:06 PM Bug #1440: smp pes sending messages still block due to other send activity
Re-assigning it to Karthik since he's going to do the projections/performance tests. Bilge Acun

03/06/2017

05:35 PM Feature #1303 (In Progress): Implement MPI-R debugging hooks to support Allinea DDT and Rogue Wav...
Phil Miller
01:15 PM Bug #1410: Tuple reducer leaks memory when using set/concat/custom reducers
Fix for tuple/set reductions over node groups: https://charm.cs.illinois.edu/gerrit/#/c/2291/ Sam White
12:01 PM Feature #1235 (Merged): Onesided sender side implementation for PAMI layer
Phil Miller
11:53 AM Feature #1459 (In Progress): Zero-copy send support for the netlrts machine layer
In the netlrts machine layer, it's pretty easy to stream data from an arbitrary address to a remote recipient on requ... Phil Miller
11:51 AM Feature #1458 (Merged): Zero-copy send support for the MPI machine layer
As Jim noted, it's straightforward for MPI to do in-place sends, and newer versions actually allow RMA, so we should ... Phil Miller
08:25 AM Bug #1445 (Merged): mpi-crayxc failure in megampi
Fixed by making context threads the default on mpi-crayxc: https://charm.cs.illinois.edu/gerrit/#/c/2282/ Sam White
 
