Article Open access Peer-reviewed

Design and Evaluation in a Real Use-case of Closed-loop Scheduling Algorithms for the gLite Workload Management System

2011; IOP Publishing; Volume: 331; Issue: 6; Language: English

10.1088/1742-6596/331/6/062029

ISSN

1742-6596

Authors

Paolo Andreetto, M. Bauce, Sara Bertocco, Fabio Capannini, Marco Cecchi, G. Compostella, A. Dorigo, E. Frizziero, F. Giacomini, A. Gianelle, D. Lucchesi, M. Mezzadri, Salvatore Monforte, F. Prelz, Elisa Molinari, D. Rebatto, M. Sgaravatto, L. Zangrando

Topic(s)

Advanced Data Storage Technologies

Abstract

The High Throughput Computing paradigm typically involves a scenario in which a given, estimated processing power is made available and sustained by the computing environment over a medium-to-long period of time. As a consequence, performance goals are generally aimed at maximizing resource utilization to obtain the expected throughput, rather than at minimizing the run time of individual jobs. This does not mean that optimal resource selection through adequate workload management is undesirable or ineffective; rather, relatively small and pre-assessed percentages of suboptimal choices or unexpected events can be tolerated. However, there are use cases within the HEP community that the described model does not immediately fit. This paper deals with workload needs primarily driven by the Collider Detector at Fermilab (CDF) experimental collaboration. In particular, the CDF Analysis Facility (CAF) typically operates by splitting its computations into so-called sections, which can be seen as sets of uniform and independent jobs. A section cannot be considered complete until all its jobs have been successfully executed, which calls for a Minimum Completion Time (MCT) dynamic scheduling policy under which not even a single job should linger in non-terminal Grid states. A significant part of the CDF analysis is processed on the European Grid infrastructure through the gLite Workload Management System (WMS) [2]. This paper describes the design enhancements and ranking algorithms with which the WMS has been provided to implement an adaptive scheduling policy that minimizes the MCT. The case study, the outlined approach and first results are presented.
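
To make the notion of a section's completion time concrete, the following is a minimal sketch of the textbook greedy minimum-completion-time heuristic, not the ranking algorithms implemented in the WMS, which the abstract does not detail. The `Resource` class, its `ready_time` and `speed` fields, and the function `schedule_section` are hypothetical names introduced only for illustration; the sketch also assumes that job sizes and resource speeds are known in advance, which is generally not the case on a real Grid.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical computing resource; all fields are illustrative."""
    name: str
    ready_time: float  # time at which the resource can start a new job
    speed: float       # work units processed per unit of time


def schedule_section(job_sizes, resources):
    """Greedy MCT assignment (textbook heuristic, assumed here).

    Each job goes to the resource on which it is expected to finish
    earliest, given the work already queued on that resource. The
    resources' ready_time fields are updated in place.
    """
    assignment = []
    for size in job_sizes:
        # Expected completion time of this job on resource r.
        best = min(resources, key=lambda r: r.ready_time + size / r.speed)
        best.ready_time += size / best.speed
        assignment.append((size, best.name, best.ready_time))
    # A section is complete only when its last job has finished.
    section_completion = max(finish for _, _, finish in assignment)
    return assignment, section_completion


if __name__ == "__main__":
    pool = [Resource("ce-a", 0.0, 1.0), Resource("ce-b", 0.0, 2.0)]
    plan, done_at = schedule_section([4.0, 4.0, 4.0], pool)
    for size, name, finish in plan:
        print(f"job of size {size} -> {name}, expected finish {finish}")
    print("section completes at", done_at)
```

Because the section finishes only with its slowest job, the quantity of interest is the maximum of the per-job finish times rather than their average, which is why a single job stuck in a non-terminal state delays the whole section.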

Reference(s)