Deterministic parallelism is a major building block for distributed and fault-tolerant systems, offering substantial performance benefits, with use cases that extend to debugging and testing. By studying recent Deterministically Parallel Systems (DPS), we identify several architectural limitations that hinder their performance and efficiency. As a result, existing DPS cannot keep up with the high-throughput, μs-scale distributed systems deployed in modern datacenters.

We introduce DORADD, a high-performance deterministically parallel runtime tailored for modern datacenter applications. DORADD takes a proactive approach, enforcing deterministic parallel execution entirely through scheduling, which allows requests to run to completion without synchronization. It dynamically captures inter-request dependencies and schedules requests on the available CPU resources in a work-conserving manner. It also introduces a novel mechanism, core pipelining, to scale single-dispatcher performance. We build an in-memory database with DORADD and compare it with Caracal, the current state-of-the-art deterministic database, using the YCSB and TPC-C benchmarks. Our evaluation shows up to 2.5x better throughput and more than 150x and 300x better tail latency in non-contended and contended cases, respectively. We also compare DORADD with Caladan, the state-of-the-art non-deterministic scheduler, showing that determinism does not induce any throughput overhead under latency SLAs.
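
To make the idea of enforcing determinism through scheduling more concrete, here is a minimal, single-threaded Python sketch. It is not DORADD's implementation, and every name in it (build_dependencies, run, pick_last, EXAMPLE) is hypothetical. It assumes each request declares up front the keys it touches, treats every pair of accesses to the same key as conflicting (no read/write distinction), and models worker timing by letting the caller choose which ready request runs next.

```python
from collections import defaultdict, deque


def build_dependencies(requests):
    """requests: list of (keys, handler) in log order; a request's id is its index.

    Returns (pending, successors), where pending[i] is the number of
    unfinished predecessors of request i.
    """
    last_on_key = {}                # key -> id of the latest request touching it
    pending = []                    # id -> count of unfinished predecessors
    successors = defaultdict(list)  # id -> requests that must wait for it
    for rid, (keys, _handler) in enumerate(requests):
        # A request depends on the most recent earlier request for each of its keys.
        preds = {last_on_key[k] for k in keys if k in last_on_key}
        pending.append(len(preds))
        for p in preds:
            successors[p].append(rid)
        for k in keys:
            last_on_key[k] = rid
    return pending, successors


def run(requests, state, pick_last=False):
    """Run every request once; return the order in which they executed.

    pick_last stands in for nondeterministic worker timing: any ready
    request may run next, and none is held back while it is ready.
    """
    pending, successors = build_dependencies(requests)
    ready = deque(rid for rid, n in enumerate(pending) if n == 0)
    order = []
    while ready:
        rid = ready.pop() if pick_last else ready.popleft()
        requests[rid][1](state)     # runs to completion, no locks needed
        order.append(rid)
        for s in successors[rid]:
            pending[s] -= 1
            if pending[s] == 0:     # all predecessors done: now runnable
                ready.append(s)
    return order


def incr_a(s): s["a"] += 1
def double_b(s): s["b"] *= 2
def add_b_to_a(s): s["a"] += s["b"]
def scale_a(s): s["a"] *= 10


# Hypothetical request log: (declared keys, handler).
EXAMPLE = [
    ({"a"}, incr_a),
    ({"b"}, double_b),
    ({"a", "b"}, add_b_to_a),
    ({"a"}, scale_a),
]
```

Starting from {"a": 1, "b": 3}, run(EXAMPLE, state) executes the requests in the order 0, 1, 2, 3, while run(EXAMPLE, state, pick_last=True) executes them as 1, 0, 2, 3. Both leave the state at {"a": 80, "b": 6}, the same result as running the log serially, because only requests with no shared keys are ever reordered.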

Tue 4 Mar

Displayed time zone: Pacific Time (US & Canada)

11:20 - 12:20
Session 7: Scheduling and Resource Management (Session Chair: Jie Ren) Main Conference at Acacia D
11:20
20m
Talk
SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs
Main Conference
Yongkang Zhang HKUST, Haoxuan Yu HKUST, Chenxia Han CUHK, Cheng Wang Alibaba Group, Baotong Lu Microsoft Research, Yunzhe Li Shanghai Jiao Tong University, Zhifeng Jiang HKUST, Yang Li China University of Geosciences, Xiaowen Chu Data Science and Analytics Thrust, HKUST(GZ), Huaicheng Li Virginia Tech
11:40
20m
Talk
DORADD: Deterministic Parallel Execution in the Era of Microsecond-Scale Computing
Main Conference
Scofield Liu Imperial College London, Musa Unal EPFL, Matthew J. Parkinson Microsoft Azure Research, Marios Kogias Imperial College London; Microsoft Research
12:00
20m
Talk
WaterWise: Co-optimizing Carbon- and Water-Footprint Toward Environmentally Sustainable Cloud Computing
Main Conference
Yankai Jiang Northeastern University, Rohan Basu Roy Northeastern University, Raghavendra Kanakagiri Indian Institute of Technology Tirupati, Devesh Tiwari Northeastern University