Amid conflicting demands for ever-improving performance and maximal energy savings, it is important to have a tool that automatically identifies opportunities to save power and energy at runtime without compromising performance. GPUs in particular present challenges due to (1) the reduced savings available from memory-bound applications, and (2) the limited availability of low-overhead performance counters. A successful tool must therefore address these issues while still tackling the challenges of dynamic application characterization, versatility across processors from different vendors, and effectiveness at making the right power-performance tradeoffs for the desired energy savings.

We propose Everest, a tool that automatically finds energy-saving opportunities across GPUs at runtime. Specifically, Everest finds two unique avenues for saving energy using DVFS in GPUs, in addition to the traditional method of lowering the core clock for memory-bound phases. Everest has very low overhead and works across different GPUs because it relies on the minimum possible set of performance events for the needed characterization. Everest works at the finer granularity of individual application phases and uses built-in performance estimation to provide the desired performance guarantees. The result is an effective solution that outperforms existing solutions on the latest NVIDIA and AMD GPUs.

Mon 3 Mar

Displayed time zone: Pacific Time (US & Canada)

11:20 - 12:20
Session 2: GPU I (Session Chair: Xipeng Shen), Main Conference at Acacia D
11:20
20m
Talk
RT–BarnesHut: Accelerating Barnes–Hut Using Ray-Tracing Hardware
Main Conference
Vani Nagarajan Purdue University, Rohan Gangaraju Purdue University, Kirshanthan Sundararajah Virginia Tech, Artem Pelenitsyn Purdue University, Milind Kulkarni Purdue University
11:40
20m
Talk
EVeREST: An Effective and Versatile Runtime Energy Saving Tool for GPUs (Distinguished Paper Award)
Main Conference
Anna Yue University of Minnesota at Twin Cities, Pen-Chung Yew University of Minnesota at Twin Cities, Sanyam Mehta HPE
12:00
20m
Talk
TurboFFT: Co-Designed High-Performance and Fault-Tolerant Fast Fourier Transform on GPUs
Main Conference
Shixun Wu, Yujia Zhai NVIDIA Corporation, Jinyang Liu University of California, Riverside, Jiajun Huang University of California, Riverside, Zizhe Jian University of California, Riverside, Huangliang Dai University of California, Riverside, Sheng Di Argonne National Laboratory, Franck Cappello Argonne National Laboratory, Zizhong Chen University of California, Riverside