Peer-reviewed Article

Evaluation of Performance Unfairness in NUMA System Architecture

2016; Institute of Electrical and Electronics Engineers; Volume: 16; Issue: 1; Language: English

DOI

10.1109/lca.2016.2602876

ISSN

2473-2575

Authors

Wonjun Song, Hyungjoon Jung, Jung Ho Ahn, Jae W. Lee, John Kim

Topic(s)

Advanced Memory and Neural Computing

Abstract

NUMA (non-uniform memory access) system architectures are commonly used in high-performance computing and datacenters. In these architectures, a processor-interconnect is used for communication between the different sockets; examples of such interconnects include Intel QPI and AMD HyperTransport. In this work, we explore the impact of the processor-interconnect on overall performance, in particular the impact of processor-interconnect arbitration on performance fairness. It is well known that locally-fair arbitration does not guarantee globally-fair bandwidth sharing, as closer nodes receive more bandwidth in a multi-hop network. However, this paper is the first to demonstrate that the opposite can occur in commodity NUMA servers, where remote nodes receive higher bandwidth (and perform better). This problem occurs because router micro-architectures for processor-interconnects commonly employ external concentration. While remote memory accesses can occur in any NUMA system, performance unfairness (or performance variation) is more critical in cloud computing and virtual machines with shared resources. We demonstrate how this unfairness creates significant performance variation when executing workloads on the Xen virtualization platform. We then provide analysis using synthetic workloads to better understand the source of unfairness.
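The abstract refers to the well-known result that locally-fair arbitration is not globally fair in a multi-hop network. The following minimal sketch (not taken from the paper, and unrelated to the external-concentration effect the paper actually studies) illustrates that baseline result: in a chain of routers where each router splits its output link 50/50 between locally injected traffic and forwarded upstream traffic, a node's share of the destination link is halved at every additional hop. All function and variable names here are hypothetical.

```python
# Sketch: why per-router (locally fair) arbitration is not globally fair.
# Nodes 1..N-1 all inject saturating traffic toward node 0 over a chain.
# Each router gives half of its output to local injection and half to
# traffic forwarded from farther upstream, so the farther a node is,
# the smaller its share of the final link into node 0.

def chain_bandwidth_shares(num_nodes):
    """Return each node's fraction of the destination link bandwidth
    under 50/50 locally-fair arbitration at every router."""
    shares = {}
    remaining = 1.0  # fraction of the destination link not yet allocated
    for node in range(1, num_nodes):
        if node == num_nodes - 1:
            # Farthest node has no upstream competitor; it takes what is left.
            shares[node] = remaining
        else:
            # Local injection gets half of the traffic passing this router.
            shares[node] = remaining / 2
            remaining -= shares[node]
    return shares

if __name__ == "__main__":
    # For a 5-node chain: node 1 gets 1/2, node 2 gets 1/4,
    # nodes 3 and 4 get 1/8 each.
    for node, share in sorted(chain_bandwidth_shares(5).items()):
        print(f"node {node}: {share:.3f} of destination bandwidth")
```

The paper's contribution is that the opposite pattern (remote nodes receiving more bandwidth) can arise on real NUMA servers due to external concentration in the processor-interconnect routers; the sketch above only establishes the conventional baseline against which that observation is surprising.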
