Avoid intermediate context switch to scheduler in case of ULTs
If a ULT yields, we currently always context switch back to the scheduler thread, even if the next task in the scheduler's queue is another ULT that we then have to context switch to.
Argobots already has this optimization: it context switches directly to the next ULT in this case.
#2 Updated by Sam White about 1 month ago
- Assignee changed from Sam White to Seonmyeong Bak
Seonmyeong, can you take a look at this? You have more knowledge of our ULTs and Csd module. The idea here is just to avoid the intermediate context switch to the scheduler in the case where the currently running thing is a ULT and the next thing in the scheduler is also a ULT.
#3 Updated by Seonmyeong Bak about 1 month ago
Maybe the current implementation tries to follow the priority order of the Converse queues.
If we want to process all the ULTs in a specific queue, we may need to consider whether it is worth changing that order.
Currently, the OpenMP integration works in the way you described. It processes all the ULTs in the TaskQueue within each OpenMP parallel region before it comes back to the Converse scheduler.
We may construct a common function to process all the ULTs in a specific queue until the queue is empty.
For OpenMP, that would be the TaskQueue; for AMPI, maybe the scheduler queue?
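To make the "common function" idea concrete, here is a minimal sketch in plain C. It assumes a toy model, not Converse code: `demo_ult`, `demo_ult_queue`, `demo_push`, `drain_ult_queue`, and `bump_counter` are hypothetical names, and a "ULT" is reduced to a callback that runs to completion instead of a real suspendable thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ULT: a callback plus its argument.
 * A real ULT would be resumed by a context switch, not called. */
typedef struct {
    void (*run)(void *arg);
    void *arg;
} demo_ult;

/* Hypothetical fixed-capacity FIFO, standing in for the TaskQueue
 * (OpenMP) or the scheduler queue (AMPI). */
#define DEMO_QUEUE_CAP 16
typedef struct {
    demo_ult items[DEMO_QUEUE_CAP];
    size_t head;
    size_t count;
} demo_ult_queue;

/* Append a ULT; returns false if the queue is full. */
static bool demo_push(demo_ult_queue *q, void (*run)(void *), void *arg) {
    assert(q != NULL && run != NULL);
    if (q->count == DEMO_QUEUE_CAP) return false;
    demo_ult *slot = &q->items[(q->head + q->count) % DEMO_QUEUE_CAP];
    slot->run = run;
    slot->arg = arg;
    q->count++;
    return true;
}

/* Run ULTs from one queue until it is empty, without returning to the
 * main scheduler in between. Returns how many ULTs were run. */
static size_t drain_ult_queue(demo_ult_queue *q) {
    assert(q != NULL);
    size_t ran = 0;
    while (q->count > 0) {
        demo_ult next = q->items[q->head];
        q->head = (q->head + 1) % DEMO_QUEUE_CAP;
        q->count--;
        next.run(next.arg); /* may itself push more ULTs onto q */
        ran++;
    }
    return ran;
}

/* Example ULT body: increments the int that arg points to. */
static void bump_counter(void *arg) {
    ++*(int *)arg;
}
```

Note that a loop like this ignores anything waiting in other queues while it drains, which is exactly the change in priority order raised above.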
#4 Updated by Sam White about 1 month ago
I hadn't thought of that, but it might be good. I meant that if a ULT is running and is about to suspend, then we should check whether the next thing to be scheduled is also a ULT. If so, do the direct context switch. In any other case, context switch back to the scheduler thread. That would require being able to run the main scheduler loop on the ULT, though.
Also, if it matters, AMPI doesn't use message priorities at all. All of its messages are [expedited].
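As a rough illustration of the check described in this comment, here is a sketch in plain C. Everything in it is an assumption for illustration: `demo_task`, `demo_sched_queue`, and `choose_switch_target` are hypothetical names rather than Converse APIs, and the scheduler is modeled as a single FIFO with no priorities (which matches the AMPI case mentioned above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task kinds: an ordinary message or a ready ULT. */
typedef enum { DEMO_TASK_MESSAGE, DEMO_TASK_ULT } demo_task_kind;

typedef struct {
    demo_task_kind kind;
    int id;
} demo_task;

/* Hypothetical single FIFO standing in for the scheduler queue. */
#define DEMO_SCHED_CAP 16
typedef struct {
    demo_task items[DEMO_SCHED_CAP];
    size_t head;
    size_t count;
} demo_sched_queue;

/* Append a task; returns false if the queue is full. */
static bool demo_sched_push(demo_sched_queue *q, demo_task t) {
    assert(q != NULL);
    if (q->count == DEMO_SCHED_CAP) return false;
    q->items[(q->head + q->count) % DEMO_SCHED_CAP] = t;
    q->count++;
    return true;
}

typedef enum { SWITCH_TO_SCHEDULER, SWITCH_TO_ULT } demo_switch_target;

/* Called by a ULT that is about to suspend. If the head of the queue
 * is another ULT, dequeue it into *next and report a direct switch.
 * In every other case (empty queue, or a message at the head), leave
 * the queue untouched and fall back to the scheduler thread. */
static demo_switch_target choose_switch_target(demo_sched_queue *q,
                                               demo_task *next) {
    assert(q != NULL && next != NULL);
    if (q->count > 0 && q->items[q->head].kind == DEMO_TASK_ULT) {
        *next = q->items[q->head];
        q->head = (q->head + 1) % DEMO_SCHED_CAP;
        q->count--;
        return SWITCH_TO_ULT;
    }
    return SWITCH_TO_SCHEDULER;
}
```

The sketch only covers the decision; the actual context switch, and running the scheduler loop from inside a ULT, are left out.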
#5 Updated by Seonmyeong Bak about 1 month ago
I think a function that efficiently checks the Converse queues is enough for your purpose. If all the queues are empty and ULTs come next, we can schedule the ULT without switching to the Converse scheduler.
A lightweight version of CsdNextMessage (one that does not pop a message but only checks whether the queues are empty) can work for our purpose here.
Or we can adopt the approach I suggested. Checking all the queues also causes overhead comparable to that of switching to the main scheduler.
Currently, the context switching cost is reduced a lot by uFcontext. What we suggest here may not be more efficient.
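For reference, a non-popping check of the kind described in this comment could look roughly like the sketch below. It is written in plain C against a made-up model: `demo_sched_view`, `DEMO_NUM_MSG_QUEUES`, and `can_switch_directly` are hypothetical, the number of queues is arbitrary, and each queue is reduced to a pending count. It does not reflect the real CsdNextMessage internals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary number of message queues that take priority over ULTs in
 * this toy model. */
#define DEMO_NUM_MSG_QUEUES 3

/* Hypothetical read-only view of the scheduler state: only the queue
 * lengths matter for the check. */
typedef struct {
    size_t msg_pending[DEMO_NUM_MSG_QUEUES];
    size_t ready_ults;
} demo_sched_view;

/* Returns true only if every message queue is empty and at least one
 * ULT is ready. Nothing is popped; the caller decides what to do. */
static bool can_switch_directly(const demo_sched_view *s) {
    assert(s != NULL);
    for (size_t i = 0; i < DEMO_NUM_MSG_QUEUES; i++) {
        if (s->msg_pending[i] > 0) return false;
    }
    return s->ready_ults > 0;
}
```

Even in this reduced form, the loop touches every queue on each suspend, which is the overhead concern raised above.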