Tridiagonalization, which is a key step in symmetric eigenvalue decomposition (EVD), aims to convert a symmetric matrix to a tridiagonal form. In Nvidia’s cuSOLVER library, the tridiagonalization process only reach 2.1 TFLOPs out of 67 TFLOPs on H100 GPU, and it consumes a significant portion of the elapsed time in the entire EVD process, accounting for over 97%. Thus, improving the tridiagonalization performance is crucial on accelerating EVD. In this paper, we analyze the reasons behind the suboptimal performance of tridiagonalization on GPU architectures, and we propose a new double blocking band reduction algorithm along with an implementation of GPU-based bulge chasing to improve the tridiagonalization performance. Through experimental evaluation, the proposed tridiagonalization method yields up to 19.6 TFLOPs which is 9.3x and 5.2x faster compared cuSOVLER and MAGMA, respectively.

Wed 5 Mar

Displayed time zone: Pacific Time (US & Canada) change

10:00 - 11:20
Session 10: GPU II (Session Chair: Zhijia Zhao)Main Conference at Acacia D
10:00
20m
Talk
Popcorn: Accelerating Kernel K-means on GPUs through Sparse Linear Algebra
Main Conference
Julian Bellavita Cornell University, Thomas Pasquali University of Trento, Laura Del Rio University of Trento, Flavio Vella Free University of Bozen, Giulia Guidi Cornell University
10:20
20m
Talk
Swift Unfolding of Communities: GPU-Accelerated Louvain Algorithm
Main Conference
Zhibin Wang Nanjing University, Xi Lin Nanjing University, Xue Li Alibaba Group, Pinhuan Wang Rutgers, The State University of New Jersey, Ziheng Meng Nanjing University, Hang Liu Rutgers, The State University of New Jersey, Chen Tian Nanjing University, Sheng Zhong Nanjing University
10:40
20m
Talk
GLUMIN: Fast Connectivity Check Based on LUTs For Efficient Graph Pattern Mining
Main Conference
Weichen Cao Institute of Computing Technology, Chinese Academy of Sciences, Ke Meng Chinese Academy of Sciences, linzhiheng Institute of Computing Technology, Chinese Academy of Sciences, Guangming Tan Chinese Academy of Sciences(CAS)
11:00
20m
Talk
Improving Tridiagonalization Performance on GPU Architectures
Main Conference
WangHansheng University of Electronic Science and Technology of China, Zhekai Duan University of Edinburgh, Zitian Zhao University of Electronic Science and Technology of China, Siqi Wu University of Electronic Science and Technology of China, Saiqi Zheng Xi'an Jiaotong-Liverpool University, Qiao Li University of Electronic Science and Technology of China, Xu Jiang University of Electronic Science and Technology of China, Shaoshuai Zhang