Swift Unfolding of Communities: GPU-Accelerated Louvain Algorithm
The Louvain algorithm is one of the most popular algorithms for community detection. Observing that existing implementations suffer from inaccurate pruning and inefficient intermediate state management, we introduce GALA, \underline{G}PU-\underline{A}ccelerated \underline{L}ouvain \underline{A}lgorithm, which incorporates two key innovations. The first innovation is a novel modularity gain-based pruning strategy, supported by rigorous theoretical guarantees of optimality and able to reduce up to 76% of vertices as well as their corresponding computations. To take advantage of the memory hierarchy and parallelism of GPUs, the second innovation is workload-aware kernels, featuring a shuffle-based kernel founded on the warp-level primitives for exchange states and a hash-based kernel that prioritizes shared memory in hashtable design. GALA further scales to multiple GPUs by minimizing the synchronization overhead between GPUs through a dense-sparse synchronization strategy. We evaluate the performance of GALA through theoretical analysis and practical experiments on various real-world graphs. The experimental results confirm that GALA significantly improves the performance of the parallel Louvain algorithm on GPUs, surpassing state-of-the-art solutions by $6\times$ on average.
Wed 5 MarDisplayed time zone: Pacific Time (US & Canada) change
10:00 - 11:20 | |||
10:00 20mTalk | Popcorn: Accelerating Kernel K-means on GPUs through Sparse Linear Algebra Main Conference Julian Bellavita Cornell University, Thomas Pasquali University of Trento, Laura Del Rio University of Trento, Flavio Vella Free University of Bozen, Giulia Guidi Cornell University | ||
10:20 20mTalk | Swift Unfolding of Communities: GPU-Accelerated Louvain Algorithm Main Conference Zhibin Wang Nanjing University, Xi Lin Nanjing University, Xue Li Alibaba Group, Pinhuan Wang Rutgers, The State University of New Jersey, Ziheng Meng Nanjing University, Hang Liu Rutgers, The State University of New Jersey, Chen Tian Nanjing University, Sheng Zhong Nanjing University | ||
10:40 20mTalk | GLUMIN: Fast Connectivity Check Based on LUTs For Efficient Graph Pattern Mining Main Conference Weichen Cao Institute of Computing Technology, Chinese Academy of Sciences, Ke Meng Chinese Academy of Sciences, linzhiheng Institute of Computing Technology, Chinese Academy of Sciences, Guangming Tan Chinese Academy of Sciences(CAS) | ||
11:00 20mTalk | Improving Tridiagonalization Performance on GPU Architectures Main Conference WangHansheng University of Electronic Science and Technology of China, Zhekai Duan University of Edinburgh, Zitian Zhao University of Electronic Science and Technology of China, Siqi Wu University of Electronic Science and Technology of China, Saiqi Zheng Xi'an Jiaotong-Liverpool University, Qiao Li University of Electronic Science and Technology of China, Xu Jiang University of Electronic Science and Technology of China, Shaoshuai Zhang |