Keywords:
Latency
Energy-efficiency
Queueing
Lock-free
Fetch-and-add
Multi-cores
Data centers
Abstract:
Energy-efficient mechanisms for reducing queueing latency among multiple threads running on multi-core chips have been a topic of great interest, because high latency is not only undesirable in itself but also exacerbates overall energy consumption. Lock-free algorithms can lower latency, but they keep threads spinning, which incurs high CPU usage and in turn higher energy consumption. Common blocking synchronization primitives such as mutual exclusion locks or semaphores may be more energy-efficient, but they can perform poorly because they incur high latency. This paper proposes a new approach that combines a lock-free algorithm with the resource efficiency of blocking synchronization primitives. The algorithm, named eLCRQ, is implemented as a queueing scheme that uses the lightweight Linux futex system call to construct a block-when-necessary layer on top of the popular lock-free LCRQ. The algorithm judiciously applies the block-when-necessary principle, which results in close to lock-free performance under contention. Under no-contention conditions, it uses the futex system call for conditional blocking instead of spinning in a retry loop. The advantage of this scheme is that it releases the CPU, allowing it to perform other tasks rather than wasting energy on useless spinning. We analyzed the performance of our scheme on a heterogeneous platform under varying load levels. We also compared the proposed scheme with other well-known IPC mechanisms under various settings. Our experimental results illustrate that eLCRQ-spin achieves lower latency and greater energy reduction.
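The block-when-necessary principle described above can be illustrated with a minimal sketch. This is not the eLCRQ implementation: the names `gate_t`, `gate_acquire`, `gate_release`, and the `SPIN_LIMIT` value are hypothetical, and the lock-free queue itself is replaced by a simple counter of available items. The sketch only shows the general pattern of spinning for a bounded number of attempts and then sleeping on a futex word, assuming Linux and C11 atomics.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical gate placed in front of a lock-free queue (assumption):
 * `items` counts elements ready to dequeue and doubles as the futex word;
 * `waiters` counts consumers that are about to sleep or are sleeping. */
typedef struct {
    _Atomic uint32_t items;
    _Atomic uint32_t waiters;
} gate_t;

enum { SPIN_LIMIT = 100 }; /* assumed tuning value, not from the paper */

/* Thin wrapper around the raw futex system call (glibc has no wrapper). */
static long futex_call(_Atomic uint32_t *addr, int op, uint32_t val) {
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Try to claim one item without blocking. Returns 1 on success, 0 if empty. */
static int gate_try_acquire(gate_t *g) {
    uint32_t n = atomic_load(&g->items);
    while (n > 0) {
        /* On failure, n is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&g->items, &n, n - 1))
            return 1;
    }
    return 0;
}

/* Consumer side: spin for a bounded number of attempts, then block. */
static void gate_acquire(gate_t *g) {
    for (;;) {
        for (int i = 0; i < SPIN_LIMIT; i++)
            if (gate_try_acquire(g))
                return;
        atomic_fetch_add(&g->waiters, 1);
        /* The kernel puts the thread to sleep only if items is still 0;
         * otherwise the call returns immediately and we retry. */
        futex_call(&g->items, FUTEX_WAIT_PRIVATE, 0);
        atomic_fetch_sub(&g->waiters, 1);
    }
}

/* Producer side: publish one item; pay for a system call only if some
 * consumer has announced that it is going to sleep. */
static void gate_release(gate_t *g) {
    atomic_fetch_add(&g->items, 1);
    if (atomic_load(&g->waiters) > 0)
        futex_call(&g->items, FUTEX_WAKE_PRIVATE, 1);
}

/* Single-threaded walk-through; returns the number of items left. */
static uint32_t gate_demo(void) {
    gate_t g = {0, 0};
    assert(gate_try_acquire(&g) == 0); /* empty: a consumer would spin, then sleep */
    gate_release(&g);                  /* producer publishes two items */
    gate_release(&g);
    gate_acquire(&g);                  /* consumer claims one on the fast path */
    return atomic_load(&g.items);
}
```

In this sketch the fast path touches only user-space atomics, so a consumer that finds an item never enters the kernel, while a consumer that exhausts its spin budget gives up the CPU until a producer wakes it. In a real design the counter would be replaced by the dequeue operation of the underlying lock-free queue.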