Modeling Emerging Memory-Divergent GPU Applications
2019; Institute of Electrical and Electronics Engineers; Volume: 18; Issue: 2; Language: English
DOI: 10.1109/LCA.2019.2923618
ISSN: 2473-2575
Authors: Lu Wang, Magnus Jahre, Almutaz Adileh, Zhiying Wang, Lieven Eeckhout
Topic(s): Cloud Computing and Resource Management
Abstract: Analytical performance models yield valuable architectural insight without incurring the excessive runtime overheads of simulation. In this work, we study contemporary GPU applications and find that the key performance-related behavior of such applications is distinct from that of traditional GPU applications. The key issue is that these GPU applications are memory-intensive and have poor spatial locality, which implies that the loads of different threads commonly access different cache blocks. Such memory-divergent applications quickly exhaust the number of misses the L1 cache can process concurrently, and thereby cripple the GPU's ability to use Memory-Level Parallelism (MLP) and Thread-Level Parallelism (TLP) to hide memory latencies. Our Memory Divergence Model (MDM) is able to accurately represent this behavior and thereby reduces average performance prediction error by 14× compared to the state-of-the-art GPUMech approach across our memory-divergent applications.
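To make the notion of memory divergence concrete, the following sketch (not from the paper; line size and stride values are illustrative assumptions) counts how many distinct cache blocks a single warp-wide load touches. A coalesced access pattern maps all 32 threads onto one cache line, while a large-stride pattern forces one line, and hence one potential miss, per thread:

```python
# Illustrative sketch of memory divergence; parameters are assumptions,
# not taken from the paper.
CACHE_LINE = 128  # bytes; a common GPU L1 line size
WARP_SIZE = 32    # threads that issue a load together

def distinct_cache_blocks(addresses):
    """Count the distinct cache lines touched by one warp-wide load."""
    return len({addr // CACHE_LINE for addr in addresses})

# Coalesced: thread tid loads a consecutive 4-byte element.
coalesced = [tid * 4 for tid in range(WARP_SIZE)]

# Memory-divergent: a large stride puts every thread on its own line.
divergent = [tid * 512 for tid in range(WARP_SIZE)]

print(distinct_cache_blocks(coalesced))  # -> 1
print(distinct_cache_blocks(divergent))  # -> 32
```

In the divergent case, one warp instruction generates 32 outstanding misses instead of one, which is how such applications quickly saturate the L1 cache's concurrent-miss capacity as the abstract describes.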