Cleanup #535: Errors reported by ThreadSanitizer
Data race in ConverseExit
SUMMARY: ThreadSanitizer: data race ./machine-common-core.c:1350 ConverseExit // comm thread reads without lock
#2 Updated by Phil Miller over 4 years ago
Any case in which a variable is read in one thread and written in another, with at least one of the accesses unsynchronized, is a data race according to the language standards, and so undefined behavior. Nikhil was concerned that locking on this variable in the comm thread risks hurting performance severely. Phil expects that the performance impact will actually be negligible, because the lock should remain uncontended, and resident in the cache of the comm thread's core, until the runtime exits. If benchmarks show no measurable slowdown with locking, then we should just do that.
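For reference, the locking approach could look roughly like the sketch below. It is not the code in machine-common-core.c: std::mutex stands in for a CmiNodeLock, and exit_lock, exit_requested, request_exit, exit_was_requested, comm_thread_poll_until_exit and run_locked_exit_demo are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins: exit_lock plays the role of a CmiNodeLock,
// exit_requested the role of the shared exit variable.
static std::mutex exit_lock;
static int exit_requested = 0;

// Writer side (for example a worker thread entering the exit path).
void request_exit() {
    std::lock_guard<std::mutex> guard(exit_lock);
    exit_requested = 1;
}

// Reader side (the comm thread): every read takes the same lock,
// so the read and the write are ordered and the race goes away.
bool exit_was_requested() {
    std::lock_guard<std::mutex> guard(exit_lock);
    return exit_requested != 0;
}

// Stand-in for the comm thread's progress loop: poll until exit is requested.
long comm_thread_poll_until_exit() {
    long polls = 0;
    while (!exit_was_requested()) {
        ++polls;
        std::this_thread::yield();
    }
    return polls;
}

// Small driver: one comm thread polling, the calling thread requests exit.
bool run_locked_exit_demo() {
    std::thread comm(comm_thread_poll_until_exit);
    request_exit();
    comm.join();
    assert(exit_was_requested());
    return exit_was_requested();
}
```

Until request_exit runs, only the comm thread touches the lock, which is the uncontended case the comment above argues should be cheap. Whether that holds in practice is what the benchmarks would need to show.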
#3 Updated by Phil Miller over 4 years ago
The solution we came up with in the last core meeting was to have the comm thread allocate and send an 'exiting now' message to each of its worker threads, and have the handler for that message set each thread's local flag to 'exiting'. This eliminates the shared variables entirely.
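A minimal sketch of that message-based scheme follows, under stated assumptions. The Mailbox class is a hypothetical stand-in for whatever per-thread message queues the runtime provides, and Message, MsgKind, worker_loop and run_exit_message_demo are names invented for illustration. The queue still needs its own synchronization, but the 'exiting' flag itself is only ever touched by the thread that owns it.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

enum class MsgKind { ExitingNow };
struct Message { MsgKind kind; };

// Hypothetical per-worker inbox; stands in for the runtime's message queue.
class Mailbox {
public:
    void send(std::unique_ptr<Message> msg) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            queue_.push(std::move(msg));
        }
        ready_.notify_one();
    }
    std::unique_ptr<Message> receive() {
        std::unique_lock<std::mutex> guard(mutex_);
        ready_.wait(guard, [this] { return !queue_.empty(); });
        std::unique_ptr<Message> msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::unique_ptr<Message>> queue_;
};

// Worker scheduler loop: 'exiting' is local to this thread, never shared.
bool worker_loop(Mailbox& inbox) {
    bool exiting = false;
    while (!exiting) {
        std::unique_ptr<Message> msg = inbox.receive();
        if (msg->kind == MsgKind::ExitingNow) {
            exiting = true;  // the handler for the 'exiting now' message
        }
    }
    return exiting;
}

// The calling thread plays the comm thread: it allocates one message per
// worker and sends it. Returns how many workers saw the message and exited.
int run_exit_message_demo(std::size_t num_workers) {
    std::vector<Mailbox> inboxes(num_workers);
    std::vector<int> exited(num_workers, 0);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < num_workers; ++i) {
        workers.emplace_back([&inboxes, &exited, i] {
            exited[i] = worker_loop(inboxes[i]) ? 1 : 0;
        });
    }
    for (Mailbox& inbox : inboxes) {
        inbox.send(std::unique_ptr<Message>(new Message{MsgKind::ExitingNow}));
    }
    int count = 0;
    for (std::size_t i = 0; i < num_workers; ++i) {
        workers[i].join();
        count += exited[i];
    }
    return count;
}
```

The cost moves from a per-iteration check of a shared variable to one small allocation and send per worker at shutdown, which only happens once.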
This can probably also be deferred past 6.6.1.
#16 Updated by Sam White about 1 year ago
- Status changed from New to In Progress
Converted the volatile int + CmiNodeLock to a std::atomic<int>, but this may well hurt performance. So we need to benchmark it: https://charm.cs.illinois.edu/gerrit/#/c/charm/+/4103/
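For readers who have not opened the Gerrit change, the general shape of such a conversion is sketched below. This is not the code from that change: exit_state, signal_exit, comm_poll_atomic and run_atomic_exit_demo are hypothetical names, and the acquire/release orderings are an assumption (the actual change may simply use the default sequentially consistent operations).

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical replacement for a 'volatile int' guarded by a CmiNodeLock.
static std::atomic<int> exit_state{0};

// Writer side: publish the exit request without taking a lock.
void signal_exit() {
    exit_state.store(1, std::memory_order_release);
}

// Reader side (comm thread): an atomic load is a synchronized access,
// so ThreadSanitizer does not report it as a data race.
long comm_poll_atomic() {
    long polls = 0;
    while (exit_state.load(std::memory_order_acquire) == 0) {
        ++polls;
        std::this_thread::yield();
    }
    return polls;
}

// Small driver: one polling thread, the calling thread signals exit.
int run_atomic_exit_demo() {
    std::thread comm(comm_poll_atomic);
    signal_exit();
    comm.join();
    int final_state = exit_state.load();
    assert(final_state == 1);
    return final_state;
}
```

On x86 an acquire load of an int typically compiles to a plain load, so the read side may cost little. Other architectures and orderings can differ, which is why the benchmark is still needed.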