Activity

From 07/25/2017 to 08/23/2017

Today

05:03 PM Bug #1665 (New): DDT needs to reference count its type objects
Currently AMPI_Type_free is a no-op b/c DDT doesn't implement reference counting of its type objects. Sam White
01:13 PM Bug #1544: CMK_TIMER_USE_PPC64 inaccurate with variable clock speeds
Can you link to the change that fixes this? Jim Phillips
12:15 PM Bug #1545: Serialize std::vector with Custom Allocator
I've thought about this more over the last few months and have a few things to share:
1) I'm not exactly sure what...
Nils Deppe
07:52 PM Bug #1664 (New): Tune pami-linux short/eager communication thresholds
src/arch/pami/machine.c contains the following:... Sam White
07:47 PM Bug #1663 (New): Fix CMK_PAGESIZE definitions
For some reason we have the pagesize as 8192 instead of 4096 on all kinds of Linux-based machine layers:... Sam White
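
For Bug #1663 above, one option is to ask the operating system for the page size at run time rather than hard-coding a constant. The following is a minimal sketch, assuming a POSIX system; it does not reflect how CMK_PAGESIZE is actually defined or consumed in Charm++, and the names `query_page_size`, `round_up_to_page`, and `kFallbackPageSize` are hypothetical.

```cpp
#include <unistd.h>   // sysconf, _SC_PAGESIZE (POSIX)
#include <cstddef>    // std::size_t

// Hypothetical fallback, used only if the runtime query fails.
// The value 4096 is an assumption, not taken from the Charm++ sources.
constexpr std::size_t kFallbackPageSize = 4096;

// Returns the page size reported by the OS, or the fallback if sysconf fails.
inline std::size_t query_page_size() {
    const long reported = sysconf(_SC_PAGESIZE);
    return reported > 0 ? static_cast<std::size_t>(reported) : kFallbackPageSize;
}

// Rounds `bytes` up to the next multiple of `page`; `page` must be nonzero.
inline std::size_t round_up_to_page(std::size_t bytes, std::size_t page) {
    return ((bytes + page - 1) / page) * page;
}
```

A caller could compute `round_up_to_page(n, query_page_size())` once at startup to size page-aligned buffers. Whether a run-time query fits the places that currently rely on a compile-time constant is a separate question that this sketch does not address.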

08/22/2017

11:07 AM Support #1662 (Closed): Show error message about adding "--enable-tracing" when charm is built wi...
This bug will be handled as a part of https://charm.cs.illinois.edu/redmine/issues/1661 Nitin Bhat
11:05 AM Support #1662 (Closed): Show error message about adding "--enable-tracing" when charm is built wi...
Building charm++ with papi support, but without the --enable-tracing option (./build charm++ netlrts-linux-x86_64 pap... Nitin Bhat
11:00 AM Bug #1661: Building charm with papi support is successful even when papi is not found (-lpapi)
Also add an abort when building papi without --enable-tracing, and update the PAPI Projections documentation. Sam White
09:34 AM Bug #1661 (New): Building charm with papi support is successful even when papi is not found (-lpapi)
When I try to build charm with papi support like ... Nitin Bhat
03:19 AM Bug #1544 (Implemented): CMK_TIMER_USE_PPC64 inaccurate with variable clock speeds
Ronak Buch
03:14 AM Projections Feature #997 (Implemented): Color by user supplied parameter (e.g. timestep) in Communication ove...
Ronak Buch
03:14 AM Projections Feature #996 (Implemented): Color by user supplied parameter (e.g. timestep) in Time Profile
Ronak Buch
03:14 AM Projections Feature #995 (Implemented): Color by user supplied parameter (e.g. timestep) in non-timeline tools
Ronak Buch

08/21/2017

04:01 PM Bug #1660 (New): AMPI tests /tests/migration and tests/megaampi fail on Stampede2 with MPI and OF...
The tests - tests/migration and tests/megampi fail when run on two processors on Stampede2 with the build mpi-linux-x... Nitin Bhat
03:24 PM Bug #1639 (Merged): AMPI MPI_IN_PLACE support is broken
Sam White

08/18/2017

09:23 AM Bug #1659: Cleanup DDT memory leaks / valgrind output
All 3 of these commits seem to make no difference in Valgrind output, though they make DDT more readable:
Use std:...
Sam White

08/17/2017

02:26 PM Bug #1659: Cleanup DDT memory leaks / valgrind output
Here's the valgrind output from PE 1 after migrating a single AMPI rank from PE 0 to PE 1. Sam White
02:03 PM Bug #1659: Cleanup DDT memory leaks / valgrind output
I'm using DDT after all dynamic memory allocation inside the CkDDT_DataType classes has been replaced with std::vecto... Sam White
02:01 PM Bug #1659: Cleanup DDT memory leaks / valgrind output
When we migrate only 1 rank from PE 0 to PE 1, only the valgrind output from PE 1 shows the DDT leaks. PE 0 is clean. Sam White
01:59 PM Bug #1659 (New): Cleanup DDT memory leaks / valgrind output
DDT causes a lot of noise in Valgrind output when migrating AMPI ranks. I haven't been able to find the source of its... Sam White

08/16/2017

12:49 PM Feature #975: OFI Layer
Other than pingpong, I haven't done any other synthetic performance tests.
I have tested it on NAMD, ChaNGa and Ope...
Nitin Bhat
12:39 PM Feature #975: OFI Layer
Do you have any synthetic performance tests besides ping-pong?
If you can make this compatible with the 6.8.0 head I...
Jim Phillips
10:02 AM Feature #975 (Implemented): OFI Layer
The current patch shows decent performance improvements over the MPI build on both Stampede2 and Bridges.
This pa...
Nitin Bhat

08/14/2017

11:43 AM Bug #1507 (Merged): ckio test failure on gni-crayxc
Thank you, done. Sorry for the trouble, and thank you for the fixes. Phil Miller
11:37 AM Bug #1507: ckio test failure on gni-crayxc
With the recent fix, I can no longer reproduce this bug. I recommend this issue be closed. Thomas Quinn
10:04 AM Support #222: Port miniFE to Charm++
Well, I think there was an unfixed issue with reproducibility with this. When running on multicore builds it seems to... Justin Szaday
09:59 AM Support #222 (Closed): Port miniFE to Charm++
This is now done, after having sat idle for a long time.
charmgit:benchmarks/mantevo/miniFE-2.0
Phil Miller

08/12/2017

04:49 PM Bug #1658: Premature detection of Quiescence when TRAM is being used
Confirmed that the patch fixes the test, even when modified to run itself in multiple iterations in a single job. Phil Miller
04:48 PM Bug #1658: Premature detection of Quiescence when TRAM is being used
https://charm.cs.illinois.edu/gerrit/2907 Phil Miller
04:27 PM Bug #1658 (Implemented): Premature detection of Quiescence when TRAM is being used
Confirmed that the attached test code fails consistently on netlrts-darwin-x86_64, with 2 PEs running on a single hos... Phil Miller
02:56 PM Bug #1658: Premature detection of Quiescence when TRAM is being used
The patch that fixes it is attached. I suggest, after due scrutiny and testing, we merge this, so that users have a b... Laxmikant "Sanjay" Kale
02:50 PM Bug #1658: Premature detection of Quiescence when TRAM is being used
A simple test program I wrote (after fixing the bug) demonstrates the problem consistently. It's a variation on hello,... Laxmikant "Sanjay" Kale
02:45 PM Bug #1658: Premature detection of Quiescence when TRAM is being used
The bug is (I am reasonably sure) due to a faulty quiescence detection algorithm in qd.C. It employs 2 phases. Once t... Laxmikant "Sanjay" Kale
02:25 PM Bug #1658 (Implemented): Premature detection of Quiescence when TRAM is being used
This was identified in Charm++ version of Quicksilver by Karthik and Nikhil, in an LLNL internship project. There are... Laxmikant "Sanjay" Kale
07:09 PM Documentation #1482 (Merged): Update Charm++ FAQ
Sam White
07:09 PM Documentation #1602 (Merged): FAQ: 5 . 0 . 3 Can I use TotalView?
Sam White
07:09 PM Documentation #1605 (Merged): FAQ: 6 . 0 . 8 Which C++ language features cause porting problems?
Sam White

08/11/2017

12:12 PM Documentation #1605 (Implemented): FAQ: 6 . 0 . 8 Which C++ language features cause porting prob...
https://charm.cs.illinois.edu/gerrit/2900 Phil Miller
11:15 AM Documentation #1602 (Implemented): FAQ: 5 . 0 . 3 Can I use TotalView?
https://charm.cs.illinois.edu/gerrit/2895 Phil Miller

08/10/2017

12:44 PM Bug #1634: HDF5 issues in AMPI
I have a patch to update romio to 1.2.6 (shipped with last version of mpich1) that compiles successfully with the cur... Matthias Diener
12:09 PM Bug #1634: HDF5 issues in AMPI
I'd like to know if it is building on AMPI yet, or if it requires any MPI-2 or MPI-3 features we don't have i... Sam White

08/09/2017

11:33 AM Feature #1655: Enable use of pxshm/xpmem on mpi and verbs builds
Also, './build charm++ gni-crayxe xpmem' fails to build because it tries to build pxshm and xpmem both for some reason. Sam White
09:23 AM Feature #1657 (New): pxshm/xpmem support for nocopy sends across processes on the same host
It should be straightforward to implement this at least for the transfer of the nocopy payload: the small metadata me... Sam White
09:17 AM Bug #1634: HDF5 issues in AMPI
What's the status of updating ROMIO to get shared library support? Sam White

08/08/2017

05:09 PM Bug #1507: ckio test failure on gni-crayxc
The fix I pushed is a direct result of digging in to the crash here. I'm almost certain that this is fixed, but I am... Thomas Quinn
04:45 PM Bug #1507 (Feedback): ckio test failure on gni-crayxc
Tom, with the other fix that you recently pushed, could you test that this still reproduces, and potentially open a n... Phil Miller
04:29 PM Bug #1647 (Closed): ckNew(): CkReductionMgr not constructed on all PEs
Redmine #1652 is believed to be the root cause of this issue, and its fix has been merged into 6.8.0. Sam White
04:29 PM Bug #1652 (Merged): CkArray::ckDestroy() does not delete CkMulticastMgr
Sam White
03:40 PM Bug #1652 (Implemented): CkArray::ckDestroy() does not delete CkMulticastMgr
We'll merge this into 6.8.0 Sam White
03:45 PM Documentation #1656 (New): Update manual entries on Load Balancing strategies
The section describing the built-in strategies should be more descriptive of the trade-offs in strategies and should ... Sam White
02:46 PM Documentation #1082 (Merged): Improve SMP mode documentation
Matthias Diener
12:43 PM Feature #1655 (New): Enable use of pxshm/xpmem on mpi and verbs builds
pxshm is currently only supported on the GNI and NetLRTS layers. It should be trivial to allow its use on MPI and Ver... Sam White
07:59 PM Bug #1653: NeighborLB segfaults during startup in SMP/multicore builds
The same failure is seen in SMP mode. Sam White
07:17 PM Bug #1653 (New): NeighborLB segfaults during startup in SMP/multicore builds
Running with NeighborLB causes a failure during initialization on multicore builds. I haven't tried SMP builds, but n... Sam White

08/07/2017

08:44 AM Documentation #1602: FAQ: 5 . 0 . 3 Can I use TotalView?
This and 1605 are the last open issues for 6.8.0 Sam White

08/05/2017

09:57 AM Bug #1652: CkArray::ckDestroy() does not delete CkMulticastMgr
A proposed fix:
https://charm.cs.illinois.edu/gerrit/2871
Thomas Quinn
11:56 PM Bug #1652 (Merged): CkArray::ckDestroy() does not delete CkMulticastMgr
I think this is the fundamental cause of bug #1647 but I'm resubmitting to give it a more accurate name.
ckNew() c...
Thomas Quinn
07:02 PM Bug #1647: ckNew(): CkReductionMgr not constructed on all PEs
I've now learned about _lookupGroupAndBufferIfNotThere(). It checks if there is a NULL entry in _groupTable, and que... Thomas Quinn

08/04/2017

05:46 PM Documentation #1596 (Merged): FAQ: 2 . 0 . 7 How do I specify the processors I want to use?
Matthias Diener
02:26 PM Bug #1649 (Implemented): NullLB shouldn't wait for LB period
https://charm.cs.illinois.edu/gerrit/#/c/2868/ Sam White
01:44 PM Bug #1649: NullLB shouldn't wait for LB period
Discussed this issue, Juan suggested one solution could be that we call ResumeFromSync from CkMigratable::AtSync when... Kavitha Chandrasekar

08/03/2017

04:06 PM Bug #1651 (Implemented): AMPI Persistent send/recv requests are broken
https://charm.cs.illinois.edu/gerrit/#/c/2865/ Sam White
03:26 PM Bug #1651 (In Progress): AMPI Persistent send/recv requests are broken
Sam White
02:50 PM Bug #1651 (Implemented): AMPI Persistent send/recv requests are broken
AMPI's support for all kinds of persistent requests is broken. Also the current implementation is kludgy, and could u... Sam White
03:48 PM Bug #1220: AMPI: Support tlsglobals with dynamically linked objects
In charm/src/util/cmitls.c, the routine getTLSPhdrEntry() iterates over all the entries in ELF program header and che... Sam White
03:26 PM Bug #1239: Cleanup reduction uses in the runtime
Can you add links to those gerrit patches here? Sam White
01:45 PM Bug #1650 (Merged): Race condition in CentralLB concurrent mode can cause crash
Sam White
11:52 AM Bug #1650 (Implemented): Race condition in CentralLB concurrent mode can cause crash
Juan Galvez
11:19 AM Bug #1650: Race condition in CentralLB concurrent mode can cause crash
Issue is that PEs should wait until all other PEs have received counts for object and comm data before sending their ... Juan Galvez
11:08 AM Bug #1650 (Merged): Race condition in CentralLB concurrent mode can cause crash
Right now this affects only GreedyRefine (only load balancer that uses concurrent mode).
Doesn't seem to happen very...
Juan Galvez
11:59 AM Bug #1647: ckNew(): CkReductionMgr not constructed on all PEs
A CkPrint in the RDMA code loop indicates that, as Sam suspected, the CMK_ONESIDED_IMPL is never executed.
It seem...
Thomas Quinn
12:36 AM Feature #1609: User-level thread implementation based on Boost context library
With this library, the OpenMP integration works well on MacOSX with GCC and barrier-related directives also work on ... Seonmyeong Bak
12:35 AM Feature #1609: User-level thread implementation based on Boost context library
Currently, I have implemented uFcontext using assembly code from the Boost context library.
And changed the build script to...
Seonmyeong Bak
12:28 AM Feature #1609: User-level thread implementation based on Boost context library
https://charm.cs.illinois.edu/gerrit/#/c/2860/ Seonmyeong Bak
12:23 AM Feature #1609 (Implemented): User-level thread implementation based on Boost context library
Seonmyeong Bak
12:23 AM Bug #1577 (Implemented): User-level thread based OpenMP integration support on Mac
Seonmyeong Bak

08/02/2017

12:18 PM Cleanup #1311: Align XL-specific conditional compilation (TRAM, std::unordered_map) to relevant v...
If we're requiring C++11 support in 6.9.0 does this issue go away? Sam White
11:33 AM Bug #1647: ckNew(): CkReductionMgr not constructed on all PEs
If ChaNGa isn't explicitly using the new (in v6.8.0) Zero-copy Send API, then no CMK_ONESIDED_IMPL code should ever e... Sam White
09:37 PM Bug #1647: ckNew(): CkReductionMgr not constructed on all PEs
Could ONESIDED be the problem? I stuck a CkPrintf in front of the CMK_ONESIDED_IMPL code in _processHandler() and the... Thomas Quinn
11:27 AM Bug #1649 (Implemented): NullLB shouldn't wait for LB period
Currently NullLB respects the LBPeriod. This means that if your application is calling AtSync more frequently than th... Sam White
07:32 PM Bug #1640 (In Progress): Segfault during migration for AMPI in SMP mode with "-tracemode projecti...
The above workaround was merged for 6.8.0, but we still need to fix the underlying issue. Sam White
07:25 PM Documentation #1588 (Merged): FAQ : How do I use GPGPUs in Charm++?
Sam White

08/01/2017

06:05 PM Bug #1647: ckNew(): CkReductionMgr not constructed on all PEs
That patch did not help. It looks like _processNodeBocInitMsg() doesn't get called at all for the relevant group. Re... Thomas Quinn
03:52 PM Bug #1647: ckNew(): CkReductionMgr not constructed on all PEs
The following patch might affect this?
https://charm.cs.illinois.edu/gerrit/#/c/2538/
Sam White
01:33 PM Bug #1647 (Closed): ckNew(): CkReductionMgr not constructed on all PEs
After restarting from a checkpoint, and during CkIO operations, I am getting errors like:... Thomas Quinn
04:26 PM Documentation #1645 (Resolved): Review Standalone Build/Directives for GPU Manager Documentation
I don't think they're useful, they were left over from when I was initially writing the manual and wanted to just loo... Michael Robson
04:16 PM CharmDebug Feature #1485: CharmDebug in SMP mode does not work
Changing this into a Feature since CharmDebug never worked/implemented for SMP. Bilge Acun
04:01 PM Feature #975: OFI Layer
The newest version of the patch from Intel is here: https://charm.cs.illinois.edu/gerrit/#/c/2759/
Previous patches ...
Bilge Acun
03:57 PM Documentation #1588 (Implemented): FAQ : How do I use GPGPUs in Charm++?
https://charm.cs.illinois.edu/gerrit/#/c/2852/ Michael Robson
03:20 PM Documentation #1596 (Implemented): FAQ: 2 . 0 . 7 How do I specify the processors I want to use?
https://charm.cs.illinois.edu/gerrit/#/c/2850/ Michael Robson
02:58 PM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
https://charm.cs.illinois.edu/gerrit/#/c/2849/ works around this issue. Matthias Diener
02:58 PM Bug #1640 (Implemented): Segfault during migration for AMPI in SMP mode with "-tracemode projecti...
Matthias Diener
12:06 PM Bug #1640 (In Progress): Segfault during migration for AMPI in SMP mode with "-tracemode projecti...
Eric Bohm
10:14 PM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
Edit:
Applying https://charm.cs.illinois.edu/gerrit/#/c/2849/ and compiling with -DAMPI_LOCAL_IMPL=0 indeed seems ...
Matthias Diener
01:11 PM Feature #363: Investigate implementation of CCS on BG/Q
BG/Q is dying, so lowering priority and assigning away from Bilge. Sam White
12:05 PM Documentation #1482 (In Progress): Update Charm++ FAQ
Eric Bohm
10:15 AM Bug #1646 (New): Support use of std::array in .ci files
Use of std::array in .ci files conflicts with the .ci keyword 'array'. Sam White

07/31/2017

06:28 PM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
... Sam White
06:20 PM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
I also tried with netlrts-darwin-x86_64-smp. Could you post your full build line? Maybe there is an issue with argume... Matthias Diener
06:16 PM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
Hmm, it fixed the issue for me on netlrts-darwin-x86_64-smp. What build are you running? Sam White
04:04 PM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
Does disabling inline messaging fix this bug? If yes, should we disable inline messaging also when CMK_TRACE_ENABLED ... Matthias Diener
02:43 PM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
This gives users a way to build with inline messaging disabled as a workaround: https://charm.cs.illinois.edu/gerrit/... Sam White
11:55 AM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
I bisected this bug to the following commit:
22ac66875b1b90c52c54b1327efdddf5816abfcd
AMPI: execute local sends ...
Matthias Diener
11:27 AM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
6.7.1 shipped with a bug in MPI_Info's handling of strings, which would show up in AMPI_Migrate(MPI_Info) calls. That... Sam White
11:13 AM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
Hmm, testing it again, the crash does seem somewhat different (no mention of isomalloc), and happens only rarely, so ... Matthias Diener
10:50 AM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
Hmm when I had tested it with 6.7.1 on netlrts-darwin-x86_64-smp, I think it passed, but maybe I didn't run it enough... Sam White
10:44 AM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
I tested this with 6.7.1, which crashes as well. So it is definitely not a regression.
Looking at Phil's suggestion...
Matthias Diener
10:01 AM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
We're not actually using [inline] entry methods, we're just calling the C++ methods directly on the object returned b... Sam White
09:52 AM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
Without digging into the code, I'm guessing the issue is that the tracing code allocates stack objects to track funct... Phil Miller
02:17 PM Documentation #1645 (Resolved): Review Standalone Build/Directives for GPU Manager Documentation
As fixed here: https://charm.cs.illinois.edu/gerrit/#/c/2838/, the stand alone checks for the GPU Manager were messin... Ronak Buch
12:39 PM Feature #1420 (Implemented): Lockless Queues
Bilge Acun
09:49 AM Bug #1642 (Closed): bgclang builds broken
Phil Miller

07/30/2017

09:27 AM Bug #1640: Segfault during migration for AMPI in SMP mode with "-tracemode projections"
It looks like this only happens when a message that is for a recipient VP on the same PE as the sender is sent inline... Sam White

07/28/2017

07:33 AM Documentation #1605: FAQ: 6 . 0 . 8 Which C++ language features cause porting problems?
Bump, the release is expected to be out next week. Sam White
07:33 AM Documentation #1602: FAQ: 5 . 0 . 3 Can I use TotalView?
Bump, the release is expected to be out next week. Sam White
07:32 AM Documentation #1082: Improve SMP mode documentation
Bump, the release is expected to be out next week. Sam White
07:32 AM Documentation #1588: FAQ : How do I use GPGPUs in Charm++?
Bump, the release is expected to be out next week. Sam White
07:32 AM Documentation #1596: FAQ: 2 . 0 . 7 How do I specify the processors I want to use?
Bump, the release is expected to be out next week. Sam White

07/26/2017

05:59 PM Bug #1642 (Resolved): bgclang builds broken
charmconfig.out showed that bgclang was not found:
bgclang++ -Wno-deprecated-declarations -I../include -I. -I/bgs...
Nitin Bhat
07:42 AM Bug #1642: bgclang builds broken
Assigning to Nitin because he's in charge of the autobuild on BGQ. Sam White
07:38 AM Bug #1642 (Closed): bgclang builds broken
The bgclang builds in autobuild have been broken since 7/21/17. Seems like an upgrade to the bgclang compilers might ... Sam White
 
