Keywords:
Dynamic memory allocation
memory pooling
multi-threading
parallel program
scalable heap implementation
shared memory
small buffer optimization
Abstract:
Frequent dynamic memory allocations (DyMAs) can significantly hinder the scalability of parallel multi-threaded programs. As the number of threads grows, DyMAs can even become the main performance bottleneck. We introduce modern tools and methods for evaluating the impact of DyMAs and present techniques for reducing it, which include scalable heap implementations, small buffer optimization, and memory pooling. Additionally, we provide a survey of state-of-the-art implementations of these techniques and study them experimentally using a benchmark program, server simulator software, and a real-world high-performance computing application. As a result, we show that relatively small modifications to a parallel program's source code, or to the way it is executed, may substantially reduce the runtime overhead associated with the use of dynamic data structures.