Article · Open access · Peer-reviewed

Modeling Emerging Memory-Divergent GPU Applications

2019; Institute of Electrical and Electronics Engineers; Volume: 18; Issue: 2; Language: English

10.1109/lca.2019.2923618

ISSN

2473-2575

Authors

Lu Wang, Magnus Jahre, Almutaz Adileh, Zhiying Wang, Lieven Eeckhout

Topic(s)

Cloud Computing and Resource Management

Abstract

Analytical performance models yield valuable architectural insight without incurring the excessive runtime overheads of simulation. In this work, we study contemporary GPU applications and find that the key performance-related behavior of such applications is distinct from traditional GPU applications. The key issue is that these GPU applications are memory-intensive and have poor spatial locality, which implies that the loads of different threads commonly access different cache blocks. Such memory-divergent applications quickly exhaust the number of misses the L1 cache can process concurrently, and thereby cripple the GPU's ability to use Memory-Level Parallelism (MLP) and Thread-Level Parallelism (TLP) to hide memory latencies. Our Memory Divergence Model (MDM) is able to accurately represent this behavior and thereby reduces average performance prediction error by 14× compared to the state-of-the-art GPUMech approach across our memory-divergent applications.
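The memory-divergence behavior the abstract describes can be made concrete with a small sketch. It is not from the paper; it simply counts how many distinct cache lines a single warp-wide load touches, assuming 32 threads per warp and 128-byte L1 cache lines (typical NVIDIA parameters, stated here as assumptions). A coalesced load maps the whole warp onto one line, while a divergent load turns one instruction into 32 concurrent L1 misses, which is what exhausts the cache's miss-handling resources.

```python
# Illustrative sketch (not the paper's model): count the distinct cache
# blocks one warp-wide load touches. WARP_SIZE and CACHE_LINE_BYTES are
# assumed typical NVIDIA values, not taken from the paper.

CACHE_LINE_BYTES = 128
WARP_SIZE = 32

def blocks_touched(addresses):
    """Number of distinct cache lines accessed by one warp-wide load."""
    return len({addr // CACHE_LINE_BYTES for addr in addresses})

# Coalesced load: thread i reads a contiguous 4-byte word.
coalesced = [i * 4 for i in range(WARP_SIZE)]

# Memory-divergent load: thread i reads with a large stride, so every
# thread's word falls in a different cache line.
divergent = [i * 512 for i in range(WARP_SIZE)]

print(blocks_touched(coalesced))  # 1 line for the whole warp
print(blocks_touched(divergent))  # 32 lines -> 32 concurrent L1 misses
```

Under these assumptions, a divergent access pattern generates a full warp's worth of misses per load, so only a handful of warps are needed to saturate the L1's miss-handling capacity, limiting the MLP and TLP available for latency hiding.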
