Several strategies have been proposed to parallelize graph neural network (GNN) training over multiple GPUs. These strategies are logically equivalent (i.e., they produce the same model) but partition data and assign computation to the GPUs in different ways. We observe that there is no consistent winner (i.e., no single strategy always achieves the shortest running time), and the optimal strategy depends on the graph dataset, GNN model, training algorithm, and hardware configuration. As such, we design the APT system to automatically select efficient parallelization strategies for GNN training tasks. In particular, we analyze the trade-offs of the strategies and design simple yet effective cost models to compare their execution times and facilitate strategy selection. Moreover, we propose a general abstraction of the strategies, which allows us to implement a unified execution engine that can be configured to run different strategies. Our experiments show that APT selects the optimal strategy across various task configurations, and the training time can be reduced by over 2x compared with always using a single strategy. APT is anonymously open-sourced at https://anonymous.4open.science/r/APT-1CAB.
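The abstract's core idea, estimating each strategy's execution time with a cost model and running the cheapest one, can be illustrated with a minimal sketch. This is a hypothetical toy model, not APT's actual cost model or API: the strategy names, `TaskConfig` fields, and cost formulas are invented for illustration, but they show how the best strategy flips depending on the task configuration.

```python
# Hypothetical sketch of cost-model-driven strategy selection:
# estimate each candidate strategy's per-iteration time, pick the minimum.
# All names and formulas here are illustrative, not from the APT paper.

from dataclasses import dataclass

@dataclass
class TaskConfig:
    num_gpus: int      # number of GPUs used for training
    feature_dim: int   # node feature dimension
    avg_fanout: int    # average neighbors sampled per node

def estimate_cost(strategy: str, cfg: TaskConfig) -> float:
    """Toy cost = compute time + cross-GPU communication time."""
    # Compute is split evenly across GPUs in both strategies.
    compute = cfg.avg_fanout * cfg.feature_dim / cfg.num_gpus
    if strategy == "data_parallel":
        # Graph replicated on each GPU; communication is gradient sync,
        # which scales with model/feature size only.
        comm = cfg.feature_dim
    else:  # "graph_partitioned"
        # Graph partitioned across GPUs; communication fetches a fraction
        # of remote neighbor features (assume 10% are remote).
        comm = cfg.avg_fanout * cfg.feature_dim * 0.1
    return compute + comm

def select_strategy(cfg: TaskConfig) -> str:
    strategies = ["data_parallel", "graph_partitioned"]
    return min(strategies, key=lambda s: estimate_cost(s, cfg))
```

With a high sampling fanout, remote-feature traffic dominates and data parallelism wins; with a low fanout, gradient synchronization dominates and graph partitioning wins, mirroring the paper's observation that no strategy is a consistent winner.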
Session: Mon 3 Mar, 10:00 - 11:00 (Pacific Time, US & Canada)
10:00 (20m) Talk: Helios: Efficient Distributed Dynamic Graph Sampling for Online GNN Inference (Main Conference). Jie Sun (Zhejiang University), Zuocheng Shi (Zhejiang University), Li Su (Alibaba Group), Wenting Shen (Alibaba Group), Zeke Wang (Zhejiang University), Yong Li (Alibaba Group), Wenyuan Yu (Alibaba Group), Wei Lin (Alibaba Group), Fei Wu (College of Computer Science and Technology, Zhejiang University), Jingren Zhou (Alibaba Group), Bingsheng He (National University of Singapore)

10:20 (20m) Talk: Accelerating GNNs on GPU Sparse Tensor Cores through N:M Sparsity-Oriented Graph Reordering (Main Conference). Jou-An Chen (North Carolina State University), Hsin-Hsuan Sung (North Carolina State University), Ruifeng Zhang (North Carolina State University), Ang Li (Pacific Northwest National Laboratory), Xipeng Shen (North Carolina State University)

10:40 (20m) Talk: Adaptive Parallel Training for Graph Neural Networks (Main Conference). Kaihao Ma (The Chinese University of Hong Kong), Renjie Liu (Southern University of Science and Technology), Xiao Yan (Centre for Perceptual and Interactive Intelligence (CPII)), Zhenkun Cai (Amazon), Xiang Song (Amazon Web Services), Minjie Wang (Amazon Web Services), Yichao Li (The Chinese University of Hong Kong), James Cheng (The Chinese University of Hong Kong)