Activity

From 03/06/2019 to 04/04/2019

04/05/2019

10:29 PM Bug #2046 (Merged): TRAM doesn't appear to support higher dimensional chare arrays
Evan Ramos
10:29 PM Bug #1957 (Merged): Out of bounds std::vector accesses in NDMeshStreamer
Evan Ramos

04/04/2019

04:13 PM Cleanup #2057 (Implemented): Revise agreement to cover github distribution and license to address...
Eric Bohm
04:13 PM Documentation #1927 (Resolved): Evaluate other documentation possibilities
Eric Bohm
03:17 PM Feature #177 (In Progress): objid_t: load balancing infrastructure should use objid_t
Reopening and reassigning this because the patch may not have fully addressed the issue. It uses a combination of 64bit... Eric Mikida
03:14 PM Bug #2066 (New): Use 64bit IDs for tracing
Tracing uses its own weird ID system that should maybe be replaced by 64bit IDs once all runtime system objects can b... Eric Mikida
03:12 PM Cleanup #2065 (New): Use 64bit ID for groups/nodegroups
This should be a fairly trivial change that will help unify IDs a bit, and potentially clean up parts of the charm me... Eric Mikida
03:11 PM Cleanup #2064 (New): Use 64bit ID for singleton chares
Singleton chare IDs are completely different from CkArray IDs, and store direct pointers to the chares. I think it sh... Eric Mikida
02:36 PM Bug #159: Some CkCallback types are not valid across checkpoint/restart
Regarding callbacks that use function pointers, we concluded in core meeting that we will disallow this use of callba... Juan Galvez
02:20 PM Cleanup #1314: Replace widespread dynamic allocated arrays with std::vector
https://charm.cs.illinois.edu/gerrit/c/charm/+/4922 Evan Ramos
02:20 PM Cleanup #1982: Rewrite conv-conds.c in C++
https://charm.cs.illinois.edu/gerrit/c/charm/+/4932 Evan Ramos
02:15 PM Feature #1436 (In Progress): trace CcdCallFnAfter() causality
I think this is doable, but the detail of functions coming from a CcdCallFnAfter may be a bit suspect (since it just ... Ronak Buch
01:41 PM Feature #1987 (In Progress): Take advantage of streamable reductions inside CkMulticast
Raghavendra Kanakagiri
01:24 PM Bug #2056 (Merged): AMPI ROMIO fails to build when MPI_LIB is set
Matthias Diener
12:25 PM Bug #2056 (Implemented): AMPI ROMIO fails to build when MPI_LIB is set
Evan Ramos
12:13 PM Bug #2056: AMPI ROMIO fails to build when MPI_LIB is set
https://charm.cs.illinois.edu/gerrit/c/charm/+/5059 implements a simple fix that unsets MPI_LIB before configuring RO... Matthias Diener
11:51 AM Bug #2056: AMPI ROMIO fails to build when MPI_LIB is set
The user decided to build with --without-romio so we don't know if MPI_LIB ended up being the problem. We should stil... Evan Ramos
11:42 AM Bug #2056: AMPI ROMIO fails to build when MPI_LIB is set
Has there been any feedback from the user? Matthias Diener

04/03/2019

04:32 PM Bug #1887 (Implemented): Custom array indices segfault in CkVec inside of LB framework
The original problem ended up being just a matter of poor documentation/examples. That is addressed in this patch: http... Eric Mikida
03:49 PM Feature #69: Get large messages sent via the ZC API to be received directly using a GET from the ...
This feature is only relevant for use cases of the Zero copy Entry Method API and not the regular API.
This API b...
Nitin Bhat
03:05 PM Feature #69: Get large messages sent via the ZC API to be received directly using a GET from the ...
Juan and I had a discussion about this feature on slack.
We finally understood this feature to some extent.
Le...
Nitin Bhat
03:39 PM Bug #828 (Closed): Chare array construction semantics differ wrt readonly array proxy on PE 0
Eric Mikida
03:38 PM Bug #1500: Entry Methods Always Take lvalue References (feature/bug)
Perfect forwarding changes have all been merged. Should this be marked as closed then? I think we covered all the mai... Eric Mikida
11:40 AM Documentation #1638 (Implemented): Manual has incorrect populateInitial API
https://charm.cs.illinois.edu/gerrit/c/charm/+/5056 Eric Mikida
10:46 AM Feature #1040: support multiple InfiniBand cards per node
After browsing the code of the PAMI{lrts} machine layers, I couldn't determine if PAMI uses multiple InfiniBand cards inte... Nitin Bhat
10:22 AM Bug #1671: Verbs memory pool may leak pinned memory when message is deleted on a PE different fro...
Added a test case: https://charm.cs.illinois.edu/gerrit/c/charm/+/5055, but haven't been able to reproduce the bug so... Nitin Bhat

04/02/2019

02:27 PM Bug #828: Chare array construction semantics differ wrt readonly array proxy on PE 0
No, it can be closed Sam White
01:46 PM Bug #828: Chare array construction semantics differ wrt readonly array proxy on PE 0
Is this required for something important? As Nikhil mentioned above, I think inline array creation is fairly ingraine... Eric Mikida
01:49 PM Cleanup #1872: Move performance tests and benchmarks from "make test" to a new "make benchmark"
We should merge the 6.10-targeted Zero Copy patches, and as many other patches under review that add new tests/exampl... Evan Ramos
01:38 PM Bug #2050 (Closed): Inconsistent AtSync behavior with bound arrays
This was an issue with the application itself. There was inconsistent usage of usesAtSync / ResumeFromSync. The main ... Eric Mikida
01:32 PM Support #2031: Add a new target (potentially benchmarks) for Vesta autobuilds to avoid maximum ex...
Yes, I think it'll be good to get back to this after the reorganization of tests, examples, and benchmarks.
This ...
Nitin Bhat

03/28/2019

05:10 PM Bug #2063 (Merged): CkCallback::ckExit does not work with interop mode
Instead of calling CkExit, which is redefined by mpi-interoperate.h to only exit the library and stop the charm sched... Eric Mikida
04:25 PM Bug #1940: Singleton chare and nodegroup creation hangs with randomized queues in SMP mode
With some further testing, this actually appears to be an error due to the combination of SMP mode and non-bitvec/fix... Michael Robson
12:41 PM Bug #1940: Singleton chare and nodegroup creation hangs with randomized queues in SMP mode
In both cases, charm failed to build. In the first (netlrts) case adding a fixed width priority (e.g. int) enabled ch... Michael Robson
12:30 PM Bug #1940: Singleton chare and nodegroup creation hangs with randomized queues in SMP mode
What do you mean by it wouldn't build for netlrts and mpi? Charm didn't build, or the example didn't build? Can you p... Sam White
12:25 PM Bug #1940 (In Progress): Singleton chare and nodegroup creation hangs with randomized queues in S...
Tried to replicate using fib on various platforms and machines:
* netlrts-darwin on local machine with ++local - wor...
Michael Robson
02:21 PM Bug #1688: Core Dump file not available unless `--disable-charmdebug` is used while building.
Based on discussion in core, we can generate core dump files for when CmiNumNodes() <= 32 or if charm is built withou... Kavitha Chandrasekar
02:09 PM Support #1732 (Resolved): Add CUDA as an option in the build script
Merged: https://charm.cs.illinois.edu/gerrit/#/c/charm/+/5024/ Jaemin Choi
01:46 PM Bug #903: ckexit with interop hangs sometimes
user-driven-interop does not reproduce this bug, it was actually just an error in the test that prevents it from runn... Eric Mikida
11:11 PM Bug #2062 (New): Don't force -lcudart just because cuda.h was found during configure
If CUDA support is detected by configure, it not only builds the CUDA features, but also modifies the link options fo... Jim Phillips

03/26/2019

05:16 PM Bug #1671 (In Progress): Verbs memory pool may leak pinned memory when message is deleted on a PE...
Nitin Bhat
02:10 PM Support #2031: Add a new target (potentially benchmarks) for Vesta autobuilds to avoid maximum ex...
My task to separate benchmarks from tests might help here. Evan Ramos
02:06 PM Support #2031: Add a new target (potentially benchmarks) for Vesta autobuilds to avoid maximum ex...
On examining the job logs for the reported failures, I found out that all of these are due to exceeding the maximum e... Nitin Bhat
07:39 PM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
The performance gets progressively worse not because this becomes more likely to happen, but ... Juan Galvez

03/25/2019

06:58 PM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
Was able to confirm with a small 2 chare example. Migrating chares leave behind a chain of location "pointers" in som... Eric Mikida
05:20 PM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
Ah that's interesting. It is definitely possible that `CK_GLOBAL_LOCATION_UPDATE` is broken since it is AFAIK unteste... Eric Mikida
04:34 PM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
Here is what I think is happening:
When an object migrates, the old PE updates the location to point to the PE whe...
Juan Galvez

03/22/2019

12:42 PM Cleanup #1314 (In Progress): Replace widespread dynamic allocated arrays with std::vector
Evan Ramos
12:42 PM Cleanup #1982 (In Progress): Rewrite conv-conds.c in C++
Evan Ramos

03/21/2019

06:21 PM Cleanup #2054: Improve error message for unregistered chares/entry methods for when a module isn'...
https://charm.cs.illinois.edu/gerrit/c/charm/+/5033 Evan Ramos
06:20 PM Cleanup #2054 (Merged): Improve error message for unregistered chares/entry methods for when a mo...
Evan Ramos
03:34 PM Cleanup #2054 (Resolved): Improve error message for unregistered chares/entry methods for when a ...
Venkatasubrahmanian Narayanan
12:05 PM Cleanup #2054 (Implemented): Improve error message for unregistered chares/entry methods for when...
Venkatasubrahmanian Narayanan
06:20 PM Documentation #1908 (Merged): Document PUP::able and associated macros
Evan Ramos
02:12 PM Documentation #1908 (Implemented): Document PUP::able and associated macros
Eric Mikida
02:12 PM Documentation #1908: Document PUP::able and associated macros
Documentation added in this patch: https://charm.cs.illinois.edu/gerrit/c/charm/+/5034 Eric Mikida
12:08 PM Bug #2046 (In Progress): TRAM doesn't appear to support higher dimensional chare arrays
Venkatasubrahmanian Narayanan
12:08 PM Bug #2052 (Rejected): Build script does not support thread sanitizer
It turns out that the problem is with the GCC version used to build Charm++ - GCC 4.x does not properly support the t... Venkatasubrahmanian Narayanan
11:05 PM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
Just saw what you said about GLOBAL_LOCATION_UPDATE. That is surprising. After looking at the code, I had a few more ... Eric Mikida

03/20/2019

04:41 PM Feature #69: Get large messages sent via the ZC API to be received directly using a GET from the ...
I think Phil is trying to basically describe a “zero-copy” based transfer scheme for a large message (whose target is... Nitin Bhat
04:41 PM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
Another experiment to try that would be particularly useful would be to only call LB once. We should see an initial s... Eric Mikida
02:24 PM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
The issue happens also with CMK_GLOBAL_LOCATION_UPDATE.
Also, I told this to Eric Mikida offline, but for the reco...
Juan Galvez
01:37 PM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
Eric Mikida wrote:
> How many chares are you using in this case? I know this debate has come up before, as to whethe...
Juan Galvez
01:19 PM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
Laxmikant "Sanjay" Kale wrote:
> What happens with rotateLB? That should clarify some issues with a more controlled s...
Eric Mikida
01:18 PM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
How many chares are you using in this case? I know this debate has come up before, as to whether we should proactivel... Eric Mikida
01:05 PM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
What happens with rotateLB? That should clarify some issues with a more controlled scenario. Laxmikant "Sanjay" Kale
10:28 AM Bug #2060: Communication performance degrades after successive load balancing when migrating many ...
I should add that in the test program, all chares have the same load, so the performance after load balancing should ... Juan Galvez
10:13 AM Bug #2060 (New): Communication performance degrades after successive load balancing when migrating...
This was observed on Blue Waters, running on 128 physical nodes, with chares communicating in a stencil-like pattern ... Juan Galvez
04:17 PM Feature #2061 (New): Support parameter marshaling for section broadcast
Currently, section broadcasts are only supported with recipient entry methods having Charm++ messages as input parame... Nitin Bhat
02:37 PM Bug #2030: tests/ampi/megampi crashes in MPI_Comm_free
These global variables in TCharm and AMPI are potential candidates for causing this issue due to data races:... Evan Ramos
11:34 AM Support #2031 (In Progress): Add a new target (potentially benchmarks) for Vesta autobuilds to av...
Since the BGQ autobuilds began successfully running just yesterday, I am going to wait for about 10 days to ensure t... Nitin Bhat

03/19/2019

11:40 AM Bug #1957: Out of bounds std::vector accesses in NDMeshStreamer
Here is a patch for debugging that adds some printf tracing of the index value that goes out of bounds. Evan Ramos

03/18/2019

04:53 PM Documentation #1777 (Merged): Document message types in Charm's message handler (_processHandler i...
Nitin Bhat
11:59 AM Feature #2053 (Implemented): Avoid receiver side copy for zero-copy broadcast by allowing receive...
Gerrit: https://charm.cs.illinois.edu/gerrit/c/charm/+/4943/ Nitin Bhat
11:22 AM Bug #1957: Out of bounds std::vector accesses in NDMeshStreamer
Reassigning this since it appears to be a logic error in NDMeshStreamer. Evan Ramos

03/16/2019

09:15 PM Bug #2059: Inconsistent CPU affinity options for running SMP programs
The verbs builds should *not* report running mpi. Jim Phillips
09:04 PM Bug #2059: Inconsistent CPU affinity options for running SMP programs
You need to at least show the options with which you submitted the job, since mpirun often picks up environment varia... Jim Phillips

03/15/2019

05:59 PM Bug #2030 (In Progress): tests/ampi/megampi crashes in MPI_Comm_free
Evan Ramos
05:58 PM Bug #2030: tests/ampi/megampi crashes in MPI_Comm_free
I ran megampi on Windows with a Microsoft tool called "Application Verifier":https://docs.microsoft.com/en-us/windows... Evan Ramos
03:00 PM Bug #2030: tests/ampi/megampi crashes in MPI_Comm_free
Can you post the tsan output here? Sam White
02:32 PM Bug #2030: tests/ampi/megampi crashes in MPI_Comm_free
I tried running megampi on Linux with ThreadSanitizer and the list of data races was substantial. Some of them look l... Evan Ramos
03:45 PM Bug #2059: Inconsistent CPU affinity options for running SMP programs
I'm wondering if this is a regression from the introduction of hwloc. Evan Ramos
11:27 AM Bug #2059 (New): Inconsistent CPU affinity options for running SMP programs
It seems to me that on different machines (and different allocations/partitions), to get Charm++ SMP threads to be pi... Nitin Bhat
03:36 PM Bug #1949: Ensure that 'End of Program' message is printed consistently for every charm program e...
This sounds similar to https://charm.cs.illinois.edu/gerrit/c/charm/+/913 Evan Ramos
12:45 PM Bug #1957: Out of bounds std::vector accesses in NDMeshStreamer
Updating this issue to reflect its general nature as opposed to being Windows-specific.
The bug can be reproduced ...
Evan Ramos
09:59 AM Bug #1671: Verbs memory pool may leak pinned memory when message is deleted on a PE different fro...
I will try to reproduce this issue with a simple example. Nitin Bhat
09:53 AM Documentation #1777 (Implemented): Document message types in Charm's message handler (_processHand...
Gerrit: https://charm.cs.illinois.edu/gerrit/c/charm/+/5022 Nitin Bhat

03/14/2019

04:27 PM Bug #1688: Core Dump file not available unless `--disable-charmdebug` is used while building.
I agree, it's not clear why Cmi_truecrash is not enabled by default. It could have been for terminating gracefully be... Kavitha Chandrasekar
02:24 PM Bug #1688: Core Dump file not available unless `--disable-charmdebug` is used while building.
Does anyone know why @Cmi_truecrash = 0;@ is the default -- or implemented at all? Why should crashes be hidden from ... Evan Ramos
02:47 PM Feature #1974: nocopy accelerated section multicast
From Core Meeting on 14th March:
Raghavendra, Nitin, and Juan should discuss this feature and estimate the effort i...
Nitin Bhat
02:44 PM Feature #1040: support multiple InfiniBand cards per node
Jim Phillips wrote:
> Charm++ currently attaches to the first interface of the first card it finds.
After this ch...
Evan Ramos
02:40 PM Feature #1987: Take advantage of streamable reductions inside CkMulticast
Raghavendra, please pay attention to this during this week. Laxmikant "Sanjay" Kale
02:30 PM Bug #1940: Singleton chare and nodegroup creation hangs with randomized queues in SMP mode
Michael, make at least some update (test with a simple program on a couple of machines) by next week. Laxmikant "Sanjay" Kale
02:18 PM Bug #1957 (In Progress): Out of bounds std::vector accesses in NDMeshStreamer
Evan Ramos
02:15 PM Support #2031: Add a new target (potentially benchmarks) for Vesta autobuilds to avoid maximum ex...
This bug should be reassigned to a PPL member with BG/Q access Evan Ramos
02:08 PM Feature #2058 (In Progress): Finish work on newLB branch and merge with master
Juan Galvez
01:24 PM Feature #2058 (In Progress): Finish work on newLB branch and merge with master
Currently tracking tasks here:
https://docs.google.com/document/d/1Amt4JvkfqMAerUk2w4HdvAbB4ySi2lq9mXvhaqQ4ChI/edit?...
Juan Galvez
11:52 AM Cleanup #2057 (Implemented): Revise agreement to cover github distribution and license to address...
Usage of Charm++ in a commercial context in a non-distributed, or purely internal, use case should be explicitly cons... Eric Bohm

03/06/2019

03:20 PM Bug #1957: Out of bounds std::vector accesses in NDMeshStreamer
Running the @streamingAllToAll@ test case on Linux with ASan also exposes these issues, and printf traces of the valu... Evan Ramos
02:31 PM Bug #1957: Out of bounds std::vector accesses in NDMeshStreamer
This problem is caused by indexing -1 into a vector.... Evan Ramos
11:11 AM Bug #2030: tests/ampi/megampi crashes in MPI_Comm_free
I managed to catch this crash in Visual Studio's debugger. @ampi::getRank()@ is called with @this@ pointing to garbag... Evan Ramos
 
