Project

General

Profile

Feature #1420

Lockless queue build option --enable-lockless-queue (off by default)

Added by Bilge Acun over 2 years ago. Updated about 1 year ago.

Status:
Merged
Priority:
Normal
Assignee:
Category:
SMP
Target version:
Start date:
02/14/2017
Due date:
% Done:

0%


Description

Implementation of the new lockless queues: https://charm.cs.illinois.edu/gerrit/#/c/1302
Performance tests from various benchmarks can be found here: https://docs.google.com/spreadsheets/d/1iHXF-_tSW-plqwZOYIcIRBOxHSOMgNx96i-fQi4P-Zo/edit#gid=0

History

#1 Updated by Sam White about 2 years ago

  • Target version set to 6.9.0
  • Category set to SMP

#2 Updated by Eric Bohm about 2 years ago

  • Assignee set to Bilge Acun

#3 Updated by Bilge Acun almost 2 years ago

  • Status changed from New to Implemented

#4 Updated by Eric Bohm over 1 year ago

  • Assignee changed from Bilge Acun to Seonmyeong Bak

#5 Updated by Seonmyeong Bak over 1 year ago

I'll add a build option to make this lockless queue implementation available for those who want to test this.

One thing I found through brief review is that the current lockless queue implementation uses atomic operations a lot.
For example, Push and Pop uses 3 atomic operations with 'memory_order_release' at the most.

I think this lockless queue can be implemented with less number of 'memory_order_release' operations.
On x86, memory_order_acquire may be compiled to normal memory operation. (because of x86's strong memory consistency).
However, memory_order_release creates memory barrier, which costs much more cycles than normal operations and prevent memory optimizations siginificantly.

I think the followings are required to improve this implementation.

1. Understand the common lock-less queue algorithms through papers.
2. Consider to use common c++ lock-less queue implementations(e.g. boost/lockfree/queue.hpp -> MPMC queue). Or compare its implementation in detail.
3. Analyze the overheads and cycles in each API of the lockless queue implementation. V-tune or PAPI may help.

#6 Updated by Ronak Buch over 1 year ago

For whoever is going to evaluate the performance against the Boost Queue, they should also evaluate against moodycamel's queue (http://moodycamel.com/blog/2014/a-fast-general-purpose-lock-free-queue-for-c++).

#7 Updated by Seonmyeong Bak over 1 year ago

  • Subject changed from Lockless Queues to New lockless queue implementation and integration

#8 Updated by Seonmyeong Bak over 1 year ago

All changes are enveloped in "CMK_LOCKLESS_QUEUE" flag.
This compile flag can be enabled by "--enable-lockless-queue".
The flag is turned off by default.

I cannot find appropriate page on the manual to add explanation of this experimental option. (We don't have a separate page for the build options.)

#9 Updated by Sam White over 1 year ago

  • Subject changed from New lockless queue implementation and integration to Lockless queue build option --enable-lockless-queue (off by default)

#10 Updated by Evan Ramos about 1 year ago

I've updated the patch to use C++11 atomics.

#11 Updated by Sam White about 1 year ago

  • Assignee changed from Seonmyeong Bak to Evan Ramos

Evan has already been working on this, and Seonmyeong is leaving PPL next week.

#12 Updated by Sam White about 1 year ago

  • Status changed from Implemented to Merged

Merged the patch that adds the lockless queue but leaves it off by default. To enable it, build with '--enable-lockless-queue'.

Also available in: Atom PDF