PPoPP is the premier forum for leading work on all aspects of parallel programming, including theoretical foundations, techniques, languages, compilers, runtime systems, tools, and practical experience. In the context of the symposium, “parallel programming” encompasses work on concurrent and parallel systems (multicore, multi-threaded, heterogeneous, clustered, and distributed systems; grids; datacenters; clouds; and large scale machines). Given the rise of parallel architectures in the consumer market (desktops, laptops, and mobile devices) and data centers, PPoPP is particularly interested in work that addresses new parallel workloads and issues that arise out of extreme-scale applications or cloud platforms, as well as techniques and tools that improve the productivity of parallel programming or work towards improved synergy with such emerging architectures.

Proceedings will be available in the ACM Digital Library.

This program is tentative and subject to change.

Sat 1 Mar

Displayed time zone: Pacific Time (US & Canada)

07:30 - 08:30 (60m): Breakfast (Catering)

10:00 - 10:30 (30m): Coffee break (Catering)

15:00 - 15:30 (30m): Coffee break (Catering)

Sun 2 Mar

07:30 - 08:30 (60m): Breakfast (Catering)

10:00 - 10:30 (30m): Coffee break (Catering)

15:00 - 15:30 (30m): Coffee break (Catering)

18:00 - 20:00
Poster Session, Main Conference at Acacia C&D
18:00
2h
Poster
POSTER: A General and Scalable GCN Training Framework on CPU Supercomputers
Main Conference
Chen Zhuang Tokyo Institute of Technology, Riken Center for Computational Science, Peng Chen National Institute of Advanced Industrial Science and Technology, Xin Liu National Institute of Advanced Industrial Science & Technology, Rio Yokota Tokyo Institute of Technology, Nikoli Dryden Lawrence Livermore National Laboratory, Toshio Endo Tokyo Institute of Technology, Satoshi Matsuoka RIKEN, Mohamed Wahib RIKEN Center for Computational Science
18:00
2h
Poster
POSTER: Triangle Counting on Tensor Cores
Main Conference
YuAng Chen The Chinese University of Hong Kong, Jeffrey Xu Yu The Chinese University of Hong Kong
18:00
2h
Poster
POSTER: Minimizing speculation overhead in a parallel recognizer for regular texts
Main Conference
Angelo Borsotti Politecnico di Milano, Luca Breveglieri Politecnico di Milano, Stefano Crespi Reghizzi Politecnico di Milano and CNR-EIIT, Angelo Morzenti Politecnico di Milano
18:00
2h
Poster
POSTER: Boost Lock-free Queue and Stack with Batching
Main Conference
Ao Li Wuhan University, Wenhai Li Wuhan University, Yuan Chen Wuhan University, Lingfeng Deng Wuhan University
18:00
2h
Poster
POSTER: Frontier-guided Graph Reordering
Main Conference
Xinmiao Zhang SKLP, Institute of Computing Technology, CAS, Cheng Liu ICT CAS, Shengwen Liang SKLP, Institute of Computing Technology, CAS, Chenwei Xiong SKLP, Institute of Computing Technology, CAS, Yu Zhang School of Computer Science and Technology, Huazhong University of Science and Technology, Lei Zhang ICT CAS, Huawei Li SKLP, Institute of Computing Technology, CAS, Xiaowei Li SKLP, Institute of Computing Technology, CAS
18:00
2h
Poster
POSTER: Big Atomics and Fast Concurrent Hash Tables
Main Conference
Daniel Anderson Carnegie Mellon University, Guy E. Blelloch Carnegie Mellon University, USA, Siddhartha Jayanti Google Research
18:00
2h
Poster
POSTER: FastBWA: Practical and Cost-Efficient Genome Sequence Alignment Pipeline
Main Conference
Zhonghai Zhang Institute of Computing Technology, Chinese Academy of Sciences / University of Chinese Academy of Sciences, Yewen Li Institute of Computing Technology, Chinese Academy of Sciences / University of Chinese Academy of Sciences, Ke Meng Chinese Academy of Sciences, Chunming Zhang Institute of Computing Technology, Chinese Academy of Sciences, Guangming Tan Chinese Academy of Sciences (CAS)
18:00
2h
Poster
POSTER: Transactional Data Structures with Orthogonal Metadata
Main Conference
Yaodong Sheng Lehigh University, Ahmed Hassan Lehigh University, Michael Spear Lehigh University
18:00
2h
Poster
POSTER: High-performance Visual Semantics Compression for AI-Driven Science
Main Conference
Boyuan Zhang Indiana University, Luanzheng Guo Pacific Northwest National Laboratory, Jiannan Tian Indiana University, Jinyang Liu University of California, Riverside, Daoce Wang Indiana University, Fanjiang Ye Indiana University, Chengming Zhang University of Alabama, Jan Strube Pacific Northwest National Laboratory, Nathan R. Tallent Pacific Northwest National Laboratory, Dingwen Tao Institute of Computing Technology, Chinese Academy of Sciences
18:00
2h
Poster
POSTER: Magneto: Accelerating Parallel Structures in DNNs via Co-Optimization of Operators
Main Conference
Zhanyuan Di State Key Lab of Processors, Institute of Computing Technology, CAS, Leping Wang State Key Lab of Processors, Institute of Computing Technology, CAS, Beijing, Ziyi Ren State Key Lab of Processors, Institute of Computing Technology, CAS, En Shao State Key Lab of Processors, Institute of Computing Technology, CAS, Beijing, Jie Zhao Hunan University, Siyuan Feng Shanghai Jiao Tong University, Dingwen Tao Institute of Computing Technology, Chinese Academy of Sciences, Guangming Tan Chinese Academy of Sciences(CAS), Ninghui Sun State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences
18:00
2h
Poster
POSTER: TENSORMD: Molecular Dynamics Simulation with Ab Initio Accuracy of 50 Billion Atoms
Main Conference
Yucheng Ouyang Institute of Computing Technology, Chinese Academy of Sciences, Xin Chen Institute of Applied Physics and Computational Mathematics, Ying Liu Institute of Computing Technology, Chinese Academy of Sciences, Xin Chen, Honghui Shang Institute of Computing Technology, Chinese Academy of Sciences, Zhenchuan Chen Institute of Computing Technology, Chinese Academy of Sciences, Rongfen Lin National Research Center of Parallel Computer Engineering and Technology, Xingyu Gao Institute of Applied Physics and Computational Mathematics, Lifang Wang Institute of Applied Physics and Computational Mathematics, Fang Li National Research Center of Parallel Computer Engineering and Technology, Jiahao Shan Institute of Computing Technology, Chinese Academy of Sciences, Haifeng Song Institute of Applied Physics and Computational Mathematics, Huimin Cui Institute of Computing Technology, Chinese Academy of Sciences, Xiaobing Feng ICT CAS
18:00 - 20:00 (2h): Reception (Catering)

Mon 3 Mar

07:30 - 08:30 (60m): Breakfast (Catering)

08:30 - 09:30: Keynote at Acacia A&B

09:30 - 10:00 (30m): Coffee break (Catering)

10:00 - 11:00
Session 1: Graph Neural Networks (Session Chair: TBA), Main Conference at Acacia D
10:00
20m
Talk
Helios: Efficient Distributed Dynamic Graph Sampling for Online GNN Inference
Main Conference
Jie Sun Zhejiang University, Zuocheng Shi Zhejiang University, Li Su Alibaba Group, Wenting Shen Alibaba Group, Zeke Wang Zhejiang University, Yong Li Alibaba Group, Wenyuan Yu Alibaba Group, Wei Lin Alibaba Group, Fei Wu College of Computer Science and Technology in Zhejiang University, Jingren Zhou Alibaba Group, Bingsheng He National University of Singapore
10:20
20m
Talk
Accelerating GNNs on GPU Sparse Tensor Cores through N:M Sparsity-Oriented Graph Reordering
Main Conference
Jou-An Chen North Carolina State University, Hsin-Hsuan Sung North Carolina State University, Ruifeng Zhang North Carolina State University, Ang Li Pacific Northwest National Laboratory, Xipeng Shen North Carolina State University
10:40
20m
Talk
Adaptive Parallel Training for Graph Neural Networks
Main Conference
Kaihao Ma The Chinese University of Hong Kong, Renjie Liu Southern University of Science and Technology, Xiao Yan Centre for Perceptual and Interactive Intelligence (CPII), Zhenkun Cai Amazon, Xiang Song Amazon Web Services, Minjie Wang Amazon Web Services, Yichao Li The Chinese University of Hong Kong, James Cheng The Chinese University of Hong Kong
11:00 - 11:20 (20m): Coffee break (Catering)

11:20 - 12:20
Session 2: GPU I (Session Chair: TBA), Main Conference at Acacia D
11:20
20m
Talk
RT–BarnesHut: Accelerating Barnes–Hut Using Ray-Tracing Hardware
Main Conference
Vani Nagarajan Purdue University, Rohan Gangaraju Purdue University, Kirshanthan Sundararajah Virginia Tech, Artem Pelenitsyn Purdue University, Milind Kulkarni Purdue University
11:40
20m
Talk
EVeREST: An Effective and Versatile Runtime Energy Saving Tool for GPUs
Main Conference
Anna Yue University of Minnesota at Twin Cities, Pen-Chung Yew University of Minnesota at Twin Cities, Sanyam Mehta HPE
12:00
20m
Talk
TurboFFT: Co-Designed High-Performance and Fault-Tolerant Fast Fourier Transform on GPUs
Main Conference
Shixun Wu University of California, Riverside, Yujia Zhai NVIDIA Corporation, Jinyang Liu University of California, Riverside, Jiajun Huang University of California, Riverside, Zizhe Jian University of California, Riverside, Huangliang Dai University of California, Riverside, Sheng Di Argonne National Laboratory, Franck Cappello Argonne National Laboratory, Zizhong Chen University of California, Riverside
12:20 - 14:00 (1h40m): Lunch (Catering)

14:00 - 15:20
Session 3: Concurrent Data Structures and Synchronization I (Session Chair: TBA), Main Conference at Acacia D
14:00
20m
Talk
Reciprocating Locks
Main Conference
Dave Dice Oracle Labs, Alex Kogan Oracle Labs, USA
14:20
20m
Talk
Aggregating Funnels for Faster Fetch&Add and Queues
Main Conference
Younghun Roh MIT, Yuanhao Wei University of British Columbia, Eric Ruppert York University, Panagiota Fatourou FORTH ICS and University of Crete, Greece, Siddhartha Jayanti Google Research, Julian Shun MIT
14:40
20m
Talk
Fairer and More Scalable Reader-Writer Locks by Optimizing Queue Management
Main Conference
Takashi Hoshino Cybozu Labs, Inc., Kenjiro Taura The University of Tokyo
15:00
20m
Talk
Publish on Ping: A Better Way to Publish Reservations in Memory Reclamation for Concurrent Data Structures
Main Conference
Ajay Singh University of Waterloo, Trevor Brown University of Toronto
15:20 - 15:40 (20m): Coffee break (Catering)

15:40 - 16:40
Session 4: Memory (Session Chair: TBA), Main Conference at Acacia D
15:40
20m
Talk
AC-Cache: A Memory-Efficient Caching System for Small Objects via Exploiting Access Correlations
Main Conference
Fulin Nan Xiamen University, Zhirong Shen Xiamen University
16:00
20m
Talk
Effectively Virtual Page Prefetching via Spatial-Temporal Patterns for Memory-intensive Cloud Applications
Main Conference
Yun Wang Shanghai Jiao Tong University, Liang Chen, Tianmai Deng Shanghai Jiao Tong University, Ben Luo Alibaba Group, Yibin Shen Alibaba Cloud, Zhixiang Wei Shanghai Jiao Tong University, Yixiao Xu Shanghai Jiao Tong University, Minglang Huang Shanghai Jiao Tong University, Zhengwei Qi Shanghai Jiao Tong University
16:20
20m
Talk
Harnessing Inter-GPU Shared Memory for Seamless MoE Communication-Computation Fusion
Main Conference
Hulin Wang, Yaqi Xia Wuhan University, Donglin Yang Nvidia Corporation, Xiaobo Zhou University of Macau, Dazhao Cheng Wuhan University
16:40 - 17:00 (20m): Coffee break (Catering)

17:00 - 18:00
Session 5: Deep Neural Networks (Session Chair: TBA), Main Conference at Acacia D
17:00
20m
Talk
FlashTensor: Optimizing Tensor Programs by Leveraging Fine-grained Tensor Property
Main Conference
Runxin Zhong Tsinghua University, Yuyang Jin Tsinghua University, Chen Zhang Tsinghua University, Kinman Lei Tsinghua University, Shuangyu Li Tsinghua University, Jidong Zhai Tsinghua University
17:20
20m
Talk
Mario: Near Zero-cost Activation Checkpointing in Pipeline Parallelism
Main Conference
Weijia Liu Institute of Computing Technology, Chinese Academy of Sciences, Mingzhen Li Institute of Computing Technology, Chinese Academy of Sciences, Guangming Tan Chinese Academy of Sciences (CAS), Weile Jia Institute of Computing Technology, Chinese Academy of Sciences
17:40
20m
Talk
COMPSO: Optimizing Gradient Compression for Distributed Training with Second-Order Optimizers
Main Conference
Baixi Sun Indiana University Bloomington, Weijin Liu Stevens Institute of Technology, J. Gregory Pauloski University of Chicago, Jiannan Tian Indiana University, Jinda Jia Indiana University, Daoce Wang Indiana University, Boyuan Zhang Indiana University, Mingkai Zheng Department of Electrical and Computer Engineering at Rutgers University, Sheng Di Argonne National Laboratory, Sian Jin Temple University, Zhao Zhang Peking University, Xiaodong Yu Stevens Institute of Technology, Kamil A. Iskra Argonne National Laboratory, Pete Beckman Northwestern University and Argonne National Laboratory, Guangming Tan Chinese Academy of Sciences(CAS), Dingwen Tao Institute of Computing Technology, Chinese Academy of Sciences

Tue 4 Mar

Displayed time zone: Pacific Time (US & Canada) change

08:30 - 09:30: Keynote at Acacia A&B

09:30 - 10:00 (30m): Coffee break (Catering)

10:00 - 11:00
Session 6: Large Language Models (Session Chair: TBA), Main Conference at Acacia D
10:00
20m
Talk
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models
Main Conference
Elias Frantar ISTA, Roberto López Castro Universidade da Coruña, Jiale Chen ISTA, Torsten Hoefler ETH Zurich, Dan Alistarh IST Austria
10:20
20m
Talk
WeiPipe: Weight Pipeline Parallelism for Communication-Effective Long-Context Large Model Training
Main Conference
Junfeng Lin Tsinghua University, Ziming Liu National University of Singapore, Yang You National University of Singapore, Jun Wang CETHIK Group Co. Ltd., Weihao Zhang Lynxi Technologies Co. Ltd, Rong Zhao Tsinghua University
10:40
20m
Talk
ATTNChecker: Highly-Optimized Fault Tolerant Attention for Large Language Model Training
Main Conference
Yuhang Liang University of Oregon, Xinyi Li Pacific Northwest National Laboratory(PNNL), Jie Ren William & Mary, Ang Li Pacific Northwest National Laboratory, Bo Fang Pacific Northwest National Laboratory(PNNL), Jieyang Chen University of Oregon
11:00 - 11:20 (20m): Coffee break (Catering)

11:20 - 12:20
Session 7: Scheduling and Resource Management (Session Chair: TBA), Main Conference at Acacia D
11:20
20m
Talk
SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs
Main Conference
Yongkang Zhang HKUST, Haoxuan Yu HKUST, Chenxia Han CUHK, Cheng Wang Alibaba Group, Baotong Lu Microsoft Research, Yunzhe Li Shanghai Jiao Tong University, Zhifeng Jiang HKUST, Yang Li China University of Geosciences, Xiaowen Chu Data Science and Analytics Thrust, HKUST(GZ), Huaicheng Li Virginia Tech
11:40
20m
Talk
DORADD: Deterministic Parallel Execution in the Era of Microsecond-Scale Computing
Main Conference
Scofield Liu Imperial College London, Musa Unal EPFL, Matthew J. Parkinson Microsoft Azure Research, Marios Kogias Imperial College London; Microsoft Research
12:00
20m
Talk
WaterWise: Co-optimizing Carbon- and Water-Footprint Toward Environmentally Sustainable Cloud Computing
Main Conference
Yankai Jiang Northeastern University, Rohan Basu Roy Northeastern University, Raghavendra Kanakagiri Indian Institute of Technology Tirupati, Devesh Tiwari Northeastern University
12:20 - 14:00 (1h40m): Lunch (Catering)

14:00 - 15:20
Session 8: Tensor Cores (Session Chair: TBA), Main Conference at Acacia D
14:00
20m
Talk
FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores
Main Conference
Jinliang Shi Beijing University of Posts and Telecommunications, Shigang Li Beijing University of Posts and Telecommunications, Youxuan Xu Beijing University of Posts and Telecommunications, Rongtian Fu Beijing University of Posts and Telecommunications, Xueying Wang Beijing University of Posts and Telecommunications, Tong Wu Beijing University of Posts and Telecommunications
14:20
20m
Talk
Acc-SpMM: Accelerating General-purpose Sparse Matrix-Matrix Multiplication with GPU Tensor Cores
Main Conference
Haisha Zhao Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Li San Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Jiaheng Wang Renmin University of China, Chunbao Zhou Computer Network Information Center, Chinese Academy of Sciences, Jue Wang Computer Network Information Center, Chinese Academy of Sciences, Zhikuang Xin Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Shunde Li Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Zhiqiang Liang Computer Network Information Center, Chinese Academy of Sciences, Zhijie Pan Hangzhou Dianzi University, Fang Liu Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Yan Zeng Hangzhou Dianzi University, Yangang Wang Computer Network Information Center, Chinese Academy of Sciences, Xuebin Chi Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences
14:40
20m
Talk
BerryBees: Breadth First Search by Bit-Tensor-Cores
Main Conference
Yuyao Niu Barcelona Supercomputing Center (BSC) - Universitat Politècnica de Catalunya (UPC), Marc Casas Barcelona Supercomputing Center
15:00
20m
Talk
FlashFFTStencil: Bridging Fast Fourier Transforms to Memory-Efficient Stencil Computations on Tensor Core Units
Main Conference
Haozhi Han Microsoft Research; Peking University, Kun Li Microsoft Research, Wei Cui Microsoft Research, Donglin Bai Microsoft Research, Yiwei Zhang UCAS; Microsoft Research, Liang Yuan Chinese Academy of Sciences, Yifeng Cheng Peking University, Yunquan Zhang, Ting Cao Microsoft Research, Mao Yang Microsoft Research
15:20 - 15:40 (20m): Coffee break (Catering)

15:40 - 17:00
Session 9: Concurrent Data Structures and Synchronization II (Session Chair: TBA), Main Conference at Acacia D
15:40
20m
Talk
PANNS: Enhancing Graph-based Approximate Nearest Neighbor Search through Recency-aware Construction and Parameterized Search
Main Conference
Xizhe Yin University of California, Riverside, Chao Gao University of California Riverside, Zhijia Zhao University of California at Riverside, Rajiv Gupta University of California at Riverside (UCR)
16:00
20m
Talk
Balanced Allocations over Efficient Queues: A Fast Relaxed FIFO Queue
Main Conference
Kåre von Geijer Chalmers University of Technology, Philippas Tsigas Chalmers University of Technology, Elias Johansson Chalmers University of Technology, Sebastian Hermansson Chalmers University of Technology
16:20
20m
Talk
LibRTS: A Spatial Indexing Library by Ray Tracing
Main Conference
Liang Geng The Ohio State University, USA, Rubao Lee, Xiaodong Zhang The Ohio State University
16:40
20m
Talk
Crystality: A Programming Model for Smart Contracts on Parallel EVMs
Main Conference
Hao Wang International Digital Economy Academy (IDEA), Shenzhen, China; and Fullnodes Labs, Minghao Pan International Digital Economy Academy (IDEA), Shenzhen, China; and Fullnodes Labs, Jiaping Wang International Digital Economy Academy (IDEA), Shenzhen, China; and Fullnodes Labs

Wed 5 Mar

Displayed time zone: Pacific Time (US & Canada) change

07:30 - 08:30 (60m): Breakfast (Catering)

08:30 - 09:30: Keynote at Acacia A&B

09:30 - 10:00 (30m): Coffee break (Catering)

10:00 - 11:20
Session 10: GPU II (Session Chair: TBA), Main Conference at Acacia D
10:00
20m
Talk
Popcorn: Accelerating Kernel K-means on GPUs through Sparse Linear Algebra
Main Conference
Julian Bellavita Cornell University, Thomas Pasquali University of Trento, Laura Del Rio University of Trento, Flavio Vella Free University of Bozen, Giulia Guidi Cornell University
10:20
20m
Talk
Swift Unfolding of Communities: GPU-Accelerated Louvain Algorithm
Main Conference
Zhibin Wang Nanjing University, Xi Lin Nanjing University, Xue Li Alibaba Group, Pinhuan Wang Rutgers, The State University of New Jersey, Ziheng Meng Nanjing University, Hang Liu Rutgers, The State University of New Jersey, Chen Tian Nanjing University, Sheng Zhong Nanjing University
10:40
20m
Talk
GLUMIN: Fast Connectivity Check Based on LUTs For Efficient Graph Pattern Mining
Main Conference
Weichen Cao Institute of Computing Technology, Chinese Academy of Sciences, Ke Meng Chinese Academy of Sciences, Zhiheng Lin Institute of Computing Technology, Chinese Academy of Sciences, Guangming Tan Chinese Academy of Sciences (CAS)
11:00
20m
Talk
Improving Tridiagonalization Performance on GPU Architectures
Main Conference
Hansheng Wang University of Electronic Science and Technology of China, Zhekai Duan University of Edinburgh, Zitian Zhao University of Electronic Science and Technology of China, Siqi Wu University of Electronic Science and Technology of China, Saiqi Zheng Xi'an Jiaotong-Liverpool University, Qiao Li University of Electronic Science and Technology of China, Xu Jiang University of Electronic Science and Technology of China, Shaoshuai Zhang
11:20 - 11:40 (20m): Coffee break (Catering)

11:40 - 13:00
Session 11: Parallel Algorithms and Applications (Session Chair: TBA), Main Conference at Acacia D
11:40
20m
Talk
Jigsaw: Toward Conflict-free Vectorized Stencil Computation by Tessellating Swizzled Registers
Main Conference
Yiwei Zhang UCAS; Microsoft Research, Kun Li Microsoft Research, Liang Yuan Chinese Academy of Sciences, Haozhi Han Microsoft Research; Peking University, Yunquan Zhang, Ting Cao Microsoft Research, Mao Yang Microsoft Research
12:00
20m
Talk
Semi-StructMG: A Fast and Scalable Semi-Structured Algebraic Multigrid
Main Conference
Yi Zong Tsinghua University, Chensong Zhang Academy of Mathematics and Systems Science, Longjiang Mu Laoshan Laboratory, Jianchun Wang China Ship Scientific Research Center, Jian Sun CMA Earth System Modeling and Prediction Center, Xiaowen Xu Institute of Applied Physics and Computational Mathematics, Xinliang Wang Huawei Technologies Co., Ltd, Peinan Yu Tsinghua University, Wei Xue Tsinghua University
12:20
20m
Talk
SBMGT: Scaling Bayesian Multinomial Group Testing
Main Conference
Weicong Chen University of California, Merced, Hao Qi University of California, Merced, Curtis Tatsuoka University of Pittsburgh, Xiaoyi Lu UC Merced
12:40
20m
Talk
An AI-Enhanced 1km-Resolution Seamless Global Weather and Climate Model to Achieve Year-Scale Simulation Speed using 34 Million Cores
Main Conference
Xiaohui Duan Shandong University, Yi Zhang PIESAT Information Technology Co., Ltd., Kai Xu Laoshan Laboratory, Haohuan Fu Tsinghua University, Bin Yang Tianjin University, Yiming Wang PIESAT Information Technology Co., Ltd., Yilun Han Tsinghua University, Siyuan Chen PIESAT Information Technology Co., Ltd., Zhuangzhuang Zhou National Supercomputing Center in Wuxi, Chenyu Wang National Supercomputing Center in Wuxi, Dongqiang Huang National Supercomputing Center in Wuxi, Huihai An Shandong University, Xiting Ju Tsinghua University, Haopeng Huang Tsinghua University, Zhuang Liu Tsinghua University, Wei Xue Tsinghua University, Weiguo Liu Shandong University, Bowen Yan Tsinghua University, Jianye Hou The Chinese University of Hong Kong, Maoxue Yu Laoshan Laboratory, Wenguang Chen Tsinghua University; Pengcheng Laboratory, Jian Li Chinese Academy of Meteorological Sciences, Zhao Jing Laoshan Laboratory, Hailong Liu Laoshan Laboratory, Lixin Wu Laoshan Laboratory

Accepted Papers

All accepted papers appear in the Main Conference track:

AC-Cache: A Memory-Efficient Caching System for Small Objects via Exploiting Access Correlations
Accelerating GNNs on GPU Sparse Tensor Cores through N:M Sparsity-Oriented Graph Reordering
Acc-SpMM: Accelerating General-purpose Sparse Matrix-Matrix Multiplication with GPU Tensor Cores
Adaptive Parallel Training for Graph Neural Networks
Aggregating Funnels for Faster Fetch&Add and Queues
An AI-Enhanced 1km-Resolution Seamless Global Weather and Climate Model to Achieve Year-Scale Simulation Speed using 34 Million Cores
ATTNChecker: Highly-Optimized Fault Tolerant Attention for Large Language Model Training
Balanced Allocations over Efficient Queues: A Fast Relaxed FIFO Queue
BerryBees: Breadth First Search by Bit-Tensor-Cores
COMPSO: Optimizing Gradient Compression for Distributed Training with Second-Order Optimizers
Crystality: A Programming Model for Smart Contracts on Parallel EVMs
DORADD: Deterministic Parallel Execution in the Era of Microsecond-Scale Computing
Effectively Virtual Page Prefetching via Spatial-Temporal Patterns for Memory-intensive Cloud Applications
EVeREST: An Effective and Versatile Runtime Energy Saving Tool for GPUs
Fairer and More Scalable Reader-Writer Locks by Optimizing Queue Management
FlashFFTStencil: Bridging Fast Fourier Transforms to Memory-Efficient Stencil Computations on Tensor Core Units
FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores
FlashTensor: Optimizing Tensor Programs by Leveraging Fine-grained Tensor Property
GLUMIN: Fast Connectivity Check Based on LUTs For Efficient Graph Pattern Mining
Harnessing Inter-GPU Shared Memory for Seamless MoE Communication-Computation Fusion
Helios: Efficient Distributed Dynamic Graph Sampling for Online GNN Inference
Improving Tridiagonalization Performance on GPU Architectures
Jigsaw: Toward Conflict-free Vectorized Stencil Computation by Tessellating Swizzled Registers
LibRTS: A Spatial Indexing Library by Ray Tracing
Mario: Near Zero-cost Activation Checkpointing in Pipeline Parallelism
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models
PANNS: Enhancing Graph-based Approximate Nearest Neighbor Search through Recency-aware Construction and Parameterized Search
Popcorn: Accelerating Kernel K-means on GPUs through Sparse Linear Algebra
POSTER: A General and Scalable GCN Training Framework on CPU Supercomputers
POSTER: Big Atomics and Fast Concurrent Hash Tables
POSTER: Boost Lock-free Queue and Stack with Batching
POSTER: FastBWA: Practical and Cost-Efficient Genome Sequence Alignment Pipeline
POSTER: Frontier-guided Graph Reordering
POSTER: High-performance Visual Semantics Compression for AI-Driven Science
POSTER: Magneto: Accelerating Parallel Structures in DNNs via Co-Optimization of Operators
POSTER: Minimizing speculation overhead in a parallel recognizer for regular texts
POSTER: TENSORMD: Molecular Dynamics Simulation with Ab Initio Accuracy of 50 Billion Atoms
POSTER: Transactional Data Structures with Orthogonal Metadata
POSTER: Triangle Counting on Tensor Cores
Publish on Ping: A Better Way to Publish Reservations in Memory Reclamation for Concurrent Data Structures
Reciprocating Locks
RT–BarnesHut: Accelerating Barnes–Hut Using Ray-Tracing Hardware
SBMGT: Scaling Bayesian Multinomial Group Testing
Semi-StructMG: A Fast and Scalable Semi-Structured Algebraic Multigrid
SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs
Swift Unfolding of Communities: GPU-Accelerated Louvain Algorithm
TurboFFT: Co-Designed High-Performance and Fault-Tolerant Fast Fourier Transform on GPUs
WaterWise: Co-optimizing Carbon- and Water-Footprint Toward Environmentally Sustainable Cloud Computing
WeiPipe: Weight Pipeline Parallelism for Communication-Effective Long-Context Large Model Training

Call for Papers

PPoPP 2025: 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming

Location: Las Vegas, Nevada, USA (collocated with CC 2025, CGO 2025, and HPCA 2025). Dates: 01 March – 05 March, 2025.

Submission URL: https://ppopp25.hotcrp.com

Important dates:

  • Full paper submission: Friday, August 16, 2024
  • Author response period: Wednesday, October 23 – Friday, October 25, 2024
  • Author notification: Monday, November 11, 2024
  • Artifact submission to AE committee: Monday, November 18, 2024
  • Artifact notification by AE committee: Monday, January 6, 2025
  • Final paper due: Friday, January 10, 2025

Scope:

PPoPP is the premier forum for leading work on all aspects of parallel programming, including theoretical foundations, techniques, languages, compilers, runtime systems, tools, and practical experience. In the context of the symposium, “parallel programming” encompasses work on concurrent and parallel systems (multicore, multi-threaded, heterogeneous, clustered, and distributed systems, grids, accelerators such as ASICs, GPUs, FPGAs, data centers, clouds, large scale machines, and quantum computers). PPoPP is interested in all aspects related to improving the productivity of parallel programming on modern architectures. PPoPP is also interested in work that addresses new parallel workloads and issues that arise out of large-scale scientific or enterprise workloads.

Specific topics of interest include (but are not limited to):

  • Languages, compilers, and runtime systems for parallel programs
  • Concurrent data structures
  • Development, analysis, or management tools
  • Fault tolerance for parallel systems
  • Formal analysis and verification
  • High-performance libraries
  • Middleware for parallel systems
  • Machine learning for parallel systems
  • Parallel algorithms
  • Parallel applications including scientific computing (e.g., simulation and modeling) and enterprise workloads (e.g., web, search, analytics, cloud, and machine learning)
  • Parallel frameworks
  • Parallel programming for deep memory hierarchies including nonvolatile memory
  • Parallel programming theory and models
  • Performance analysis, debugging and optimization
  • Productivity tools for parallel systems
  • Software engineering for parallel programs
  • Synchronization and concurrency control

Papers should report on original research relevant to parallel programming and should contain enough background materials to make them accessible to the entire parallel programming research community. Papers describing experience should indicate how they illustrate general principles or lead to new insights; papers about parallel programming foundations should indicate how they relate to practice. PPoPP submissions will be evaluated based on their technical merit and accessibility. Submissions should clearly motivate the importance of the problem being addressed, compare to the existing body of work on the topic, and explicitly and precisely state the paper’s key contributions and results towards addressing the problem. Submissions should strive to be accessible both to a broad audience and to experts in the area.

Paper Submission:

Conference submission site: https://ppopp25.hotcrp.com

All submissions must be made electronically through the conference website and must include an abstract (100–400 words), author contact information, and the full list of authors and their affiliations. Full paper submissions must be in PDF format printable on both A4 and US letter-size paper.

All papers must be prepared in ACM Conference Format using the 2-column acmart format: use the SIGPLAN proceedings template acmart-sigplanproc-template.tex for LaTeX, and interim-layout.docx for Word. You may also want to consult the official ACM information on the Master Article Template and related tools. Important note: the Word template (interim-layout.docx) on the ACM website uses 9pt font; you need to increase it to 10pt.
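For LaTeX users, a submission skeleton consistent with these requirements might look like the following. This is only a sketch: the exact document-class options of the SIGPLAN template are set inside acmart-sigplanproc-template.tex itself, and the `review` and `anonymous` options shown here are assumptions based on the double-blind policy in this call, so authors should defer to the official template.

```latex
% Minimal sketch of a PPoPP-style submission using the acmart class.
% Assumed options: 'sigplan' selects the 2-column SIGPLAN layout,
% '10pt' satisfies the typeface requirement, 'review' adds line
% numbers for reviewers, and 'anonymous' suppresses author identity
% for double-blind reviewing.
\documentclass[sigplan,10pt,review,anonymous]{acmart}

\begin{document}

\title{Paper Title}
% Author metadata may still be entered; the 'anonymous' option
% hides it in the rendered PDF.
\author{Author Name}
\affiliation{\institution{Institution}\country{Country}}

\begin{abstract}
An abstract of 100--400 words.
\end{abstract}

\maketitle

\section{Introduction}
% Body text: at most 10 pages, not including references.

\bibliographystyle{ACM-Reference-Format}
% \bibliography{refs}  % hypothetical bibliography file name

\end{document}
```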

Papers should contain a maximum of 10 pages of text and figures (in a typeface no smaller than 10pt), NOT INCLUDING references. There is no page limit for references, and each reference must include the names of all authors (not "et al."). Appendices are not allowed, but authors may submit supplementary material, such as proofs or source code; all supplementary material must be in PDF or ZIP format. Reviewers may consult supplementary material at their discretion.

Submission is double-blind and authors will need to identify any potential conflicts of interest with PC and Extended Review Committee members, as defined here: http://www.sigplan.org/Resources/Policies/Review/ (ACM SIGPLAN policy).

PPoPP 2025 will employ a double-blind reviewing process. To facilitate this process, submissions should not reveal the identity of the authors in any way. Authors should leave out author names and affiliations from the body of their submission. They should also ensure that any references to their own related work are in the third person (e.g., not "We build on our previous work …" but rather "We build on the work of …"). The purpose of this process is to help the PC and external reviewers come to an initial judgment about the paper without bias, not to make it impossible for them to discover the authors if they were to try. Nothing should be done in the name of anonymity that weakens the submission or makes the job of reviewing the paper more difficult. In particular, important background references should not be omitted or anonymized. In addition, authors should feel free to disseminate their ideas or draft versions of their papers as they normally would. For instance, authors may post drafts of their papers on the web or give talks on their research ideas. Authors with further questions on double-blind reviewing are encouraged to contact the Program Chairs by email.

To facilitate fair and unbiased reviews for all submissions, PPoPP 2025 may use the Toronto Paper Matching System (TPMS) to assign papers to reviewers. For authors, this means that submissions may be uploaded to TPMS.

Submissions should be in PDF and printable on both US Letter and A4 paper. Papers may be resubmitted to the submission site multiple times up until the deadline; the last version submitted before the deadline will be the version reviewed. Papers that exceed the length requirement, deviate from the expected format, or are submitted late will be rejected.

All submissions that are not accepted for regular presentations will be automatically considered for posters. Two-page summaries of accepted posters will be included in the conference proceedings.

To allow reproducibility, we encourage authors of accepted papers to submit their papers for Artifact Evaluation (AE). The AE process begins after the acceptance notification and is run by a separate committee whose task is to assess how the artifacts support the work described in the papers. Artifact evaluation is voluntary and will not affect paper acceptance but will be taken into consideration when selecting papers for awards. Papers that go through the AE process successfully will receive at least one ACM reproducibility badge, printed on the papers themselves. More information will be posted on the AE website.

Deadlines expire at midnight anywhere on earth.

Publication Date:

The titles of all accepted papers are typically announced shortly after the author notification date (late November 2024). Note, however, that this is not the official publication date. The official publication date is the date the proceedings are made available in the ACM Digital Library. ACM will make the proceedings available via the Digital Library up to 2 weeks prior to the first day of the conference. The official publication date affects the deadline for any patent filings related to published work.

ACM Publications Policies:

By submitting your article to an ACM Publication, you acknowledge that you and your co-authors are subject to all ACM Publications Policies, including ACM's Publications Policy on Research Involving Human Participants and Subjects (https://www.acm.org/publications/policies/research-involving-human-participants-and-subjects). Alleged violations of this policy or any ACM Publications Policy will be investigated by ACM and may result in a full retraction of your paper, in addition to other potential penalties, as per ACM Publications Policy.

Please ensure that you and your co-authors obtain an ORCID ID so you can complete the publishing process for your accepted paper. We are committed to improving author discoverability, ensuring proper attribution, and contributing to ongoing community efforts around name normalization; your ORCID ID will help in these efforts. Please follow the link https://dl.acm.org/journal/pacmcgit/author-guidelines to see ACM's ORCID requirements for authors.