Feature #1575

The OpenMP integration modified to run on Converse user-level threads

Added by Seonmyeong Bak 4 months ago. Updated 4 months ago.




In the original implementation, the OpenMP integration is built on Converse messages, which act as a kind of stackless user-level thread.

However, to support all OpenMP features, we need full user-level threads so that OpenMP work can be suspended and resumed on Converse.
The OpenMP integration is therefore modified to run on Converse user-level threads. Because they can be suspended and resumed, more OpenMP features can be implemented.

The user-level threads are maintained in the OpenMP runtime and are reused, because they can be suspended and resumed.

Each user-level thread suspends itself inside a while loop when it finishes its work in an OpenMP region, so the thread can later be assigned to any OpenMP team and resumed.
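
The suspend-in-a-loop pattern described above can be sketched with plain ucontext_t on Linux. This is only an illustration, not the code in the patch: ult_t, ult_init, ult_run, ult_destroy, count_region and ult_reuse_demo are hypothetical names, and the real runtime goes through the Converse thread API rather than calling swapcontext directly.

```c
#include <stdlib.h>
#include <ucontext.h>

#define ULT_STACK_SIZE (64 * 1024)

typedef void (*omp_work_fn)(void *arg);

/* Hypothetical reusable user-level thread (not a Converse type). */
typedef struct {
    ucontext_t ctx;        /* saved context of the ULT itself */
    ucontext_t sched_ctx;  /* context to return to when the ULT suspends */
    omp_work_fn work;      /* work assigned for the next resume */
    void *arg;
    int stop;              /* set to 1 to let the ULT body return */
    void *stack;
} ult_t;

/* Handed to the ULT on its first entry only. */
static ult_t *current_ult;

static void ult_main(void)
{
    ult_t *self = current_ult;
    while (!self->stop) {
        if (self->work) {
            self->work(self->arg);  /* run the work of one OpenMP region */
            self->work = NULL;
        }
        /* Suspend here; the next ult_run() resumes right after this call. */
        swapcontext(&self->ctx, &self->sched_ctx);
    }
}

int ult_init(ult_t *t)
{
    t->stack = malloc(ULT_STACK_SIZE);
    if (!t->stack) return -1;
    if (getcontext(&t->ctx) != 0) { free(t->stack); return -1; }
    t->ctx.uc_stack.ss_sp = t->stack;
    t->ctx.uc_stack.ss_size = ULT_STACK_SIZE;
    t->ctx.uc_link = &t->sched_ctx;  /* where to go if ult_main returns */
    t->work = NULL;
    t->arg = NULL;
    t->stop = 0;
    makecontext(&t->ctx, ult_main, 0);
    return 0;
}

/* Assign work and resume the ULT; returns once it has suspended again. */
void ult_run(ult_t *t, omp_work_fn work, void *arg)
{
    t->work = work;
    t->arg = arg;
    current_ult = t;
    swapcontext(&t->sched_ctx, &t->ctx);
}

/* Resume once more with stop set so the loop exits, then free the stack. */
void ult_destroy(ult_t *t)
{
    t->stop = 1;
    t->work = NULL;
    current_ult = t;
    swapcontext(&t->sched_ctx, &t->ctx);
    free(t->stack);
}

/* Example work item: counts how many regions ran on the ULT. */
void count_region(void *arg) { ++*(int *)arg; }

/* Runs two "regions" on the same ULT and returns the number executed. */
int ult_reuse_demo(void)
{
    ult_t t;
    int regions = 0;
    if (ult_init(&t) != 0) return -1;
    ult_run(&t, count_region, &regions);  /* first team */
    ult_run(&t, count_region, &regions);  /* same ULT, reused */
    ult_destroy(&t);
    return regions;
}
```

The second ult_run call does not create a new thread or stack; it resumes the same ULT at the point where it suspended, which is the reuse the description refers to.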

This patch is based on the ucontext_t-based Converse threads. I think this is the only implementation used these days.


#1 Updated by Seonmyeong Bak 4 months ago

  • Assignee set to Seonmyeong Bak

#2 Updated by Sam White 4 months ago

src/arch/gni/conv-mach-smp.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/gni-crayxc/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/gni-crayxe/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/mpi/conv-mach-pthreads.h:#define CMK_THREADS_USE_PTHREADS                           1
src/arch/mpi-bluegeneq/conv-mach.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/mpi-crayxc/conv-mach-smp.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/mpi-crayxc/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/mpi-crayxe/conv-mach-smp.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/mpi-crayxe/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/mpi-darwin-x86_64/conv-mach-smp.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/mpi-darwin-x86_64/conv-mach.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/mpi-linux/conv-mach-gm.h:#define CMK_THREADS_USE_CONTEXT                   1
src/arch/mpi-linux/conv-mach-gm2.h:#define CMK_THREADS_USE_PTHREADS               1
src/arch/mpi-linux-mips64/conv-mach.h:#define CMK_THREADS_USE_PTHREADS                           1
src/arch/mpi-linux-ppc/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/mpi-linux-x86_64/conv-mach-gm2.h:#define CMK_THREADS_USE_PTHREADS               1
src/arch/mpi-linux-x86_64/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/multicore-darwin-x86_64/conv-mach.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/multicore-linux-ppc/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/multicore-linux32/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/multicore-linux64/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/net-darwin-x86_64/conv-mach-smp.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/net-darwin-x86_64/conv-mach.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/net-linux-arm7/conv-mach-ibverbs.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/net-linux-arm7/conv-mach-pthreads.h:#define CMK_THREADS_USE_PTHREADS                           1
src/arch/net-linux-ppc/conv-mach-ibverbs.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/net-linux-ppc/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/net-linux-x86_64/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/netlrts-darwin-x86_64/conv-mach-smp.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/netlrts-darwin-x86_64/conv-mach.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/netlrts-linux-arm7/conv-mach-ibverbs.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/netlrts-linux-arm7/conv-mach-pthreads.h:#define CMK_THREADS_USE_PTHREADS                           1
src/arch/netlrts-linux-ppc/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/netlrts-linux-x86_64/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/pami-bluegeneq/conv-mach.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/pami-linux-ppc64le/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/pamilrts-bluegeneq/conv-mach.h:#define CMK_THREADS_USE_JCONTEXT                           1
src/arch/uth-linux-x86_64/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/verbs-linux-ppc64le/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1
src/arch/verbs-linux-x86_64/conv-mach.h:#define CMK_THREADS_USE_CONTEXT                            1

And all -arm7 builds use QuickThreads (QuickThreads is used if none of the CMK_THREADS_USE_* options are defined to 1 in the conv-mach.h file).

So on the platforms we really care about today, it's mostly a mix of context and uJcontext. But is there any technical reason (beyond this being the easiest) why it's only done for context/uJcontext? Some day someone may add another threads implementation to Converse, and this makes the requirements we place on threads packages inconsistent.
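
The fallback rule can be pictured as a preprocessor chain. The sketch below is a simplification under assumptions: only the three CMK_THREADS_USE_* macros from the listing are checked, the order of the checks is a guess, and thread_backend and selected_thread_backend are made-up names that do not exist in Converse.

```c
/* Hypothetical names; only the CMK_THREADS_USE_* macros come from the listing. */
typedef enum {
    BACKEND_CONTEXT,
    BACKEND_JCONTEXT,
    BACKEND_PTHREADS,
    BACKEND_QUICKTHREADS
} thread_backend;

thread_backend selected_thread_backend(void)
{
    /* An undefined macro evaluates to 0 inside #if, so a conv-mach.h
     * that defines none of these options falls through to QuickThreads. */
#if CMK_THREADS_USE_CONTEXT
    return BACKEND_CONTEXT;
#elif CMK_THREADS_USE_JCONTEXT
    return BACKEND_JCONTEXT;
#elif CMK_THREADS_USE_PTHREADS
    return BACKEND_PTHREADS;
#else
    return BACKEND_QUICKTHREADS;
#endif
}
```

Compiled without any of the macros defined, the function returns BACKEND_QUICKTHREADS; adding for example -DCMK_THREADS_USE_CONTEXT=1 to the compiler flags selects BACKEND_CONTEXT.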

#3 Updated by Seonmyeong Bak 4 months ago

I just started this feature with the most frequently used implementation.
Even though Converse threads share common APIs, each implementation provides its own version of each API. More specifically, each has its own implementation of CthResume.
I can do the same work for Qthreads later if needed.

In addition, when we support Qthreads, we should measure the cost of context switching between Qthreads. I think Qthreads takes much longer to switch contexts than ucontext_t.
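
As a starting point for such a measurement, here is a rough sketch that times ucontext_t switches by bouncing between a main context and one ULT. It is an assumption-laden micro-benchmark, not part of the patch: avg_switch_ns and pingpong are hypothetical names, clock() only gives coarse CPU time, and a comparable loop would have to be written against the other threads package to compare the two.

```c
#include <stdlib.h>
#include <time.h>
#include <ucontext.h>

static ucontext_t main_ctx, ult_ctx;
static volatile long switches_done;  /* number of times the ULT was entered */

/* ULT body: count the entry, then switch straight back to the main context. */
static void pingpong(void)
{
    for (;;) {
        ++switches_done;
        swapcontext(&ult_ctx, &main_ctx);
    }
}

/* Average cost of one switch in nanoseconds, or -1.0 on error.
 * Each round is two switches: main -> ULT and ULT -> main. */
double avg_switch_ns(long rounds)
{
    enum { STACK_BYTES = 64 * 1024 };
    void *stack;
    clock_t start, end;
    long i;

    if (rounds <= 0) return -1.0;
    stack = malloc(STACK_BYTES);
    if (!stack) return -1.0;
    if (getcontext(&ult_ctx) != 0) { free(stack); return -1.0; }
    ult_ctx.uc_stack.ss_sp = stack;
    ult_ctx.uc_stack.ss_size = STACK_BYTES;
    ult_ctx.uc_link = &main_ctx;
    makecontext(&ult_ctx, pingpong, 0);

    switches_done = 0;
    start = clock();
    for (i = 0; i < rounds; i++)
        swapcontext(&main_ctx, &ult_ctx);
    end = clock();

    free(stack);  /* the ULT is never resumed after this point */
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / (2.0 * rounds);
}
```

A large round count (for example one million) keeps the result above the resolution of clock(); the number is only meaningful relative to the same loop run with another implementation on the same machine.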

#4 Updated by Seonmyeong Bak 4 months ago

This implementation adds a CMIQueue for suspended OpenMP user-level threads.

When OpenMP ULTs are suspended and then resumed, they are pushed to the suspended task queue of the PE where they were suspended.
A separate suspended queue is needed, instead of the existing PE local queue, because these OpenMP ULTs should finish before the PE picks a message for another chare.
In other words, if the tasks were pushed to the existing PE local queue, the PE could switch from the entry method containing the OpenMP region to other chares while that entry method waits for its OpenMP ULTs to finish. This breaks the consistency of Charm++. With the separate queue, the ongoing OpenMP ULTs of an entry method are stolen by the same work-stealing algorithm we used before, and we can use these ULTs for other cases. I'll make them usable by other implementations such as AMPI.
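
The ordering argument can be illustrated with two FIFO queues per PE, where the scheduler always drains suspended ULTs before looking at ordinary messages. This is a toy model under assumptions: fifo_t, pe_state_t and pe_pick_next are hypothetical names, and the real patch uses a CMIQueue and the Converse scheduler instead of these hand-rolled ring buffers.

```c
#include <stddef.h>

#define QCAP 16

/* Tiny fixed-capacity FIFO standing in for the real queue type. */
typedef struct {
    void *items[QCAP];
    size_t head, tail;
} fifo_t;

int fifo_push(fifo_t *q, void *item)
{
    if (q->tail - q->head == QCAP) return -1;  /* full */
    q->items[q->tail++ % QCAP] = item;
    return 0;
}

void *fifo_pop(fifo_t *q)
{
    if (q->head == q->tail) return NULL;       /* empty */
    return q->items[q->head++ % QCAP];
}

/* Hypothetical per-PE state: suspended OpenMP ULTs are kept apart
 * from the ordinary local message queue. */
typedef struct {
    fifo_t suspended_ults;
    fifo_t local_msgs;
} pe_state_t;

typedef enum { PICK_NONE, PICK_ULT, PICK_MSG } pick_kind;

/* Suspended ULTs always win over messages, so the entry method that
 * owns the OpenMP region is not interleaved with another chare's message. */
pick_kind pe_pick_next(pe_state_t *pe, void **out)
{
    void *item = fifo_pop(&pe->suspended_ults);
    if (item) { *out = item; return PICK_ULT; }
    item = fifo_pop(&pe->local_msgs);
    if (item) { *out = item; return PICK_MSG; }
    *out = NULL;
    return PICK_NONE;
}
```

With a single shared queue, a message that was enqueued before a resumed ULT would be picked first; keeping the ULTs in their own queue removes that ordering dependency.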

#5 Updated by Phil Miller 4 months ago

Implementation here: (for ease of cross-reference)

#6 Updated by Seonmyeong Bak 4 months ago

On Mac, uJcontext is the default, but setjmp/longjmp cannot guarantee correct context switching of ULTs across kernel-level threads.

So we should find another implementation for Mac. Even ucontext_t doesn't work on Mac the same way it does on Linux, and QT doesn't work either.
