Improving Tridiagonalization Performance on GPU Architectures
Tridiagonalization, a key step in symmetric eigenvalue decomposition (EVD), converts a symmetric matrix to tridiagonal form. In NVIDIA's cuSOLVER library, tridiagonalization reaches only 2.1 TFLOPs out of 67 TFLOPs on an H100 GPU, and it accounts for over 97% of the elapsed time of the entire EVD process. Improving tridiagonalization performance is therefore crucial to accelerating EVD. In this paper, we analyze the reasons behind the suboptimal performance of tridiagonalization on GPU architectures, and we propose a new double blocking band reduction algorithm along with a GPU-based implementation of bulge chasing to improve it. In experimental evaluation, the proposed tridiagonalization method reaches up to 19.6 TFLOPs, which is 9.3x and 5.2x faster than cuSOLVER and MAGMA, respectively.
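For readers unfamiliar with the operation itself, the following is a minimal sketch of what tridiagonalization computes. It uses the classical unblocked Householder reduction in plain Python, which is not the double blocking band reduction or the GPU bulge chasing proposed in the paper; the function name `householder_tridiagonalize` and the sample matrix are illustrative assumptions.

```python
import math


def householder_tridiagonalize(a):
    """Reduce a symmetric matrix (list of lists) to tridiagonal form.

    Classical unblocked Householder reduction, shown only to illustrate
    the operation; it is not the algorithm proposed in the paper.
    """
    n = len(a)
    t = [[float(c) for c in row] for row in a]  # work on a copy
    for k in range(n - 2):
        # Entries of column k below the diagonal.
        x = [t[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(c * c for c in x))
        if norm_x == 0.0:
            continue  # nothing to eliminate in this column
        # Pick the sign of alpha opposite to x[0] to avoid cancellation.
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        norm_v = math.sqrt(sum(c * c for c in v))
        v = [c / norm_v for c in v]
        m = len(v)
        # Apply H = I - 2 v v^T from the left to rows k+1..n-1.
        for j in range(n):
            dot = sum(v[p] * t[k + 1 + p][j] for p in range(m))
            for p in range(m):
                t[k + 1 + p][j] -= 2.0 * v[p] * dot
        # Apply H from the right to columns k+1..n-1.
        for i in range(n):
            dot = sum(t[i][k + 1 + p] * v[p] for p in range(m))
            for p in range(m):
                t[i][k + 1 + p] -= 2.0 * dot * v[p]
    return t


if __name__ == "__main__":
    sample = [[4, 1, -2, 2],
              [1, 2, 0, 1],
              [-2, 0, 3, -2],
              [2, 1, -2, -1]]
    for row in householder_tridiagonalize(sample):
        print(["%.4f" % c for c in row])
```

Each step is an orthogonal similarity transform, so the result has the same eigenvalues as the input while all entries outside the three central diagonals become zero up to rounding.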
Wed 5 Mar. Displayed time zone: Pacific Time (US & Canada)
10:00 - 11:20
10:00 | 20m | Talk | Popcorn: Accelerating Kernel K-means on GPUs through Sparse Linear Algebra | Main Conference | Julian Bellavita, Cornell University; Thomas Pasquali, University of Trento; Laura Del Rio, University of Trento; Flavio Vella, Free University of Bozen; Giulia Guidi, Cornell University
10:20 | 20m | Talk | Swift Unfolding of Communities: GPU-Accelerated Louvain Algorithm | Main Conference | Zhibin Wang, Nanjing University; Xi Lin, Nanjing University; Xue Li, Alibaba Group; Pinhuan Wang, Rutgers, The State University of New Jersey; Ziheng Meng, Nanjing University; Hang Liu, Rutgers, The State University of New Jersey; Chen Tian, Nanjing University; Sheng Zhong, Nanjing University
10:40 | 20m | Talk | GLUMIN: Fast Connectivity Check Based on LUTs For Efficient Graph Pattern Mining | Main Conference | Weichen Cao, Institute of Computing Technology, Chinese Academy of Sciences; Ke Meng, Chinese Academy of Sciences; linzhiheng, Institute of Computing Technology, Chinese Academy of Sciences; Guangming Tan, Chinese Academy of Sciences (CAS)
11:00 | 20m | Talk | Improving Tridiagonalization Performance on GPU Architectures | Main Conference | WangHansheng, University of Electronic Science and Technology of China; Zhekai Duan, University of Edinburgh; Zitian Zhao, University of Electronic Science and Technology of China; Siqi Wu, University of Electronic Science and Technology of China; Saiqi Zheng, Xi'an Jiaotong-Liverpool University; Qiao Li, University of Electronic Science and Technology of China; Xu Jiang, University of Electronic Science and Technology of China; Shaoshuai Zhang