Several strategies have been proposed to parallelize graph neural network (GNN) training over multiple GPUs. These strategies are logically equivalent (i.e., they produce the same model) but partition data and assign computation to the GPUs in different ways. We observe that there is no consistent winner (i.e., no strategy that always achieves the shortest running time): the optimal strategy depends on the graph dataset, GNN model, training algorithm, and hardware configuration. As such, we design the APT system to automatically select efficient parallelization strategies for GNN training tasks. In particular, we analyze the trade-offs among the strategies and design simple yet effective cost models that compare their execution times and facilitate strategy selection. Moreover, we propose a general abstraction of the strategies, which allows us to implement a unified execution engine that can be configured to run any of them. Our experiments show that APT runs the optimal strategy across various task configurations, and that training time can be reduced by over 2x compared with always using a single strategy. APT is open-sourced anonymously at https://anonymous.4open.science/r/APT-1CAB.
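
The abstract does not spell out the cost models or the selection mechanism; the paper itself contains those details. Purely as a minimal illustration of the idea of cost-model-driven strategy selection, the Python sketch below estimates each candidate strategy's per-epoch time from a task description and picks the cheapest. Every name here (TaskConfig, the two toy cost formulas, select_strategy) is hypothetical and not APT's actual API.

```python
# Hypothetical sketch of cost-model-driven strategy selection.
# None of these names or formulas come from APT; they only illustrate
# estimating each strategy's epoch time and running the cheapest one.
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TaskConfig:
    """Summary statistics of a training task (assumed cost-model inputs)."""
    num_nodes: int
    num_edges: int
    feat_dim: int
    num_gpus: int
    pcie_gbps: float  # host-to-GPU bandwidth in GB/s


# A cost model maps a task description to an estimated epoch time (seconds).
CostModel = Callable[[TaskConfig], float]


def cost_data_parallel(t: TaskConfig) -> float:
    # Toy model: dominated by copying node features over PCIe,
    # split across the available GPUs.
    feature_bytes = t.num_nodes * t.feat_dim * 4  # float32 features
    return feature_bytes / (t.pcie_gbps * 1e9) / t.num_gpus


def cost_graph_partitioned(t: TaskConfig) -> float:
    # Toy model: dominated by exchanging boundary-node features
    # between graph partitions (assume 10% of edges cross partitions).
    boundary_bytes = 0.1 * t.num_edges * t.feat_dim * 4
    return boundary_bytes / (t.pcie_gbps * 1e9)


STRATEGIES: Dict[str, CostModel] = {
    "data_parallel": cost_data_parallel,
    "graph_partitioned": cost_graph_partitioned,
}


def select_strategy(task: TaskConfig) -> str:
    """Return the strategy with the smallest estimated epoch time."""
    return min(STRATEGIES, key=lambda name: STRATEGIES[name](task))


if __name__ == "__main__":
    task = TaskConfig(num_nodes=2_000_000, num_edges=60_000_000,
                      feat_dim=128, num_gpus=4, pcie_gbps=16.0)
    print(select_strategy(task))  # prints the cheaper strategy for this task
```

Because no single strategy wins everywhere, the decision has to be re-evaluated per task; the sketch mirrors that by making the choice a pure function of the task description rather than a fixed configuration.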

Mon 3 Mar

Displayed time zone: Pacific Time (US & Canada)

10:00 - 11:00
Session 1: Graph Neural Networks (Session Chair: Miao Yin)
Main Conference at Acacia D

10:00 (20m, Talk) Helios: Efficient Distributed Dynamic Graph Sampling for Online GNN Inference (Main Conference)
Jie Sun (Zhejiang University), Zuocheng Shi (Zhejiang University), Li Su (Alibaba Group), Wenting Shen (Alibaba Group), Zeke Wang (Zhejiang University), Yong Li (Alibaba Group), Wenyuan Yu (Alibaba Group), Wei Lin (Alibaba Group), Fei Wu (College of Computer Science and Technology, Zhejiang University), Jingren Zhou (Alibaba Group), Bingsheng He (National University of Singapore)

10:20 (20m, Talk) Accelerating GNNs on GPU Sparse Tensor Cores through N:M Sparsity-Oriented Graph Reordering (Main Conference)
Jou-An Chen (North Carolina State University), Hsin-Hsuan Sung (North Carolina State University), Ruifeng Zhang (North Carolina State University), Ang Li (Pacific Northwest National Laboratory), Xipeng Shen (North Carolina State University)

10:40 (20m, Talk) Adaptive Parallel Training for Graph Neural Networks (Main Conference)
Kaihao Ma (The Chinese University of Hong Kong), Renjie Liu (Southern University of Science and Technology), Xiao Yan (Centre for Perceptual and Interactive Intelligence (CPII)), Zhenkun Cai (Amazon), Xiang Song (Amazon Web Services), Minjie Wang (Amazon Web Services), Yichao Li (The Chinese University of Hong Kong), James Cheng (The Chinese University of Hong Kong)