Book chapter, open access, peer reviewed

Scalable Detection of MPI-2 Remote Memory Access Inefficiency Patterns

2009; Springer Science+Business Media; Language: English

10.1007/978-3-642-03770-2_10

ISSN

1611-3349

Authors

Marc-André Hermanns, Markus Geimer, Bernd Mohr, Felix Wolf

Topic(s)

Distributed systems and fault tolerance

Abstract

Wait states in parallel applications can be identified by scanning event traces for characteristic patterns. In earlier work, we defined such patterns for MPI-2 one-sided communication, but their detection still relied on a trace-analysis scheme with limited scalability. Taking advantage of a new scalable trace-analysis approach based on parallel replay, originally developed for MPI-1 point-to-point and collective communication, we show how wait states in one-sided communication can be detected in a more scalable fashion. We demonstrate the scalability of our method and its usefulness for the optimization cycle with applications running on up to 8,192 cores.
