Fairer and More Scalable Reader-Writer Locks by Optimizing Queue Management
The MCS lock and its variants scale well on many-core architectures by queuing lock requests in a list, which reduces contention on the mutex data. Most recent variants adopt a two-stage design that allows requests to be allocated from stack memory rather than the heap. However, this design still causes contention on mutex accesses and limits fairness when fast paths are used. This paper proposes the \textit{Freezer} mechanism and its optimization methods, which extend the list operations of the MCS lock to reduce mutex accesses without using heap memory and to allow the fairness policy and the use of fast paths to be chosen independently. Additionally, we propose four optimization methods for queue-based reader-writer locks. Our evaluation using three benchmarks confirmed the effectiveness of the proposed fair reader-writer locks: they achieved up to 3.4$\times$ higher throughput and up to 2.9$\times$ better tail latency than the baselines.
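For readers unfamiliar with the baseline, the following is a minimal sketch of the classic MCS queue lock that the abstract builds on. It is not the \textit{Freezer} mechanism or any of the proposed reader-writer locks. The names `McsNode`, `McsLock`, and `run_counter_demo` are hypothetical and chosen only for illustration. The sketch shows the two properties mentioned above: each lock request is a list node that can live on the caller's stack, and each waiter spins on its own node, so the shared mutex word (`tail_`) is touched only once per acquisition and at most once per release.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// One lock request. It can live on the requesting thread's stack
// because it is only referenced between lock() and unlock().
struct McsNode {
    std::atomic<McsNode*> next{nullptr};
    std::atomic<bool> locked{false};
};

// Classic MCS lock: the shared state is a single tail pointer.
class McsLock {
public:
    void lock(McsNode& me) {
        me.next.store(nullptr, std::memory_order_relaxed);
        me.locked.store(true, std::memory_order_relaxed);
        // The only access to the shared mutex word on the acquire path.
        McsNode* prev = tail_.exchange(&me, std::memory_order_acq_rel);
        if (prev != nullptr) {
            // Link behind the predecessor, then spin on our own node.
            prev->next.store(&me, std::memory_order_release);
            while (me.locked.load(std::memory_order_acquire)) {
                std::this_thread::yield();
            }
        }
    }

    void unlock(McsNode& me) {
        McsNode* succ = me.next.load(std::memory_order_acquire);
        if (succ == nullptr) {
            // No visible successor: try to reset the queue to empty.
            McsNode* expected = &me;
            if (tail_.compare_exchange_strong(expected, nullptr,
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire)) {
                return;
            }
            // A successor swapped the tail but has not linked itself yet.
            while ((succ = me.next.load(std::memory_order_acquire)) == nullptr) {
                std::this_thread::yield();
            }
        }
        // Hand the lock to the successor by writing to its node only.
        succ->locked.store(false, std::memory_order_release);
    }

private:
    std::atomic<McsNode*> tail_{nullptr};
};

// Hypothetical demo: several threads increment a plain counter under the lock.
long run_counter_demo(int threads, int iters) {
    McsLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                McsNode node;  // stack-allocated request
                lock.lock(node);
                ++counter;
                lock.unlock(node);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Because requests are served in queue order, this baseline is FIFO-fair. The sketch has no fast path and no reader-writer support, which are the aspects the paper's contributions address.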