Article; Open access; Peer reviewed

Recent developments in high‐performance computing and simulation: distributed systems, architectures, algorithms, and applications

2015; Wiley; Volume: 27; Issue: 9; Language: English

10.1002/cpe.3523

ISSN

1532-0634

Authors

Waleed W. Smari, Sandro Fiore, Carsten Trinitis

Topic(s)

Parallel Computing and Optimization Techniques

Abstract

High-performance computing (HPC) helps address real-world problems by providing robust environments and support for running data- and compute-intensive algorithms, complex numerical simulations, and parallel scientific codes. However, the growing need to tackle high-resolution exascale problems, complex systems, and multi-scale data-intensive sciences requires new computing platforms with millions of cores and more complex software and hardware solutions. In such a landscape, key requirements on concurrency, energy, storage, I/O, resiliency, networking, and so on need to be re-examined and investigated at different levels.

This special issue contains 10 papers representing recent advances in the area of high-performance computing and simulation. These extended papers were carefully selected from the proceedings of the 2011 International Conference on High Performance Computing and Simulation (HPCS 2011), which was held in Istanbul, Turkey, July 4–8, 2011. The invited papers are fully refereed, augmented versions of works originally presented at the conference, and they cover some of the main contemporary topics and challenges in various areas of HPC and simulation.

The International High Performance Computing and Simulation (HPCS) Conference Series is meant to address, explore, and exchange information on the state of the art in high-performance and large-scale computing systems, their use in modeling and simulation, their design, performance and utilization, and their applications and impact. Typically, the conference includes invited presentations by experts from academia, industry, and government laboratories and institutions, as well as contributed paper presentations describing refereed original work on the current state of research in the areas listed above and in related ones such as services computing, cloud and grid computing, and mobile computing.
Annually, participation is extended to researchers, designers, educators, and interested parties in all HPCS disciplines and specialties, who take part in various functions and contribute to activities such as tutorials, demos, exhibits, posters, panels, and doctoral dissertation colloquia. Over the years, the conference has invited some of the top experts and innovators in the field as keynote speakers or for plenary talks and tutorial sessions. Along with the main track, several symposia, workshops, and special sessions are organized every year in conjunction with this meeting. Since 2009, the conference proceedings have been published by IEEE, included in the IEEE Xplore Digital Library [1-3], and indexed accordingly in several major indexing services [4]. Prior to that, proceedings were published and indexed by the European Council for Modelling and Simulation (ECMS) (e.g., [5, 6]). The conference continues to experience healthy growth and contributions of improving quality.

The International HPCS Conference Series started in 2003 as the High Performance & Large Scale Computing (HP&LSC) Track in conjunction with the European Council for Modelling and Simulation 2003 Conference and was initially held in Nottingham, UK. With the first event being a great success, follow-up meetings were held in Magdeburg, Germany (2004); Riga, Latvia (2005); Bonn, Germany (2006); Prague, Czech Republic (2007); Nicosia, Cyprus (2008); Leipzig, Germany (2009); and Caen, France (2010). The Ninth HPCS Conference, on which this special issue is based, was held in Istanbul, Turkey, in the summer of 2011. The conference has been organized in technical cooperation with major professional organizations such as the Association for Computing Machinery (ACM), the Institute of Electrical and Electronics Engineers (IEEE), and the International Federation for Information Processing (IFIP), as well as academic institutions, research centers, and many regional organizations.
The first conference-wide special issue, based on the HPCS 2009 meeting, was organized in Wiley's journal Concurrency and Computation: Practice and Experience. The final set of published papers is available from Wiley and online [7]. The second special issue was organized the following year, based on the HPCS 2010 meeting in Caen, France; its final set of published papers is likewise available from Wiley and online [8]. The current conference-wide special issue is therefore the third one. It is based on selected extended papers from the Ninth HPCS Conference (HPCS 2011), which was held in Istanbul, Turkey, July 4–8, 2011. In addition, tracks, workshops, and special sessions have organized special issues in their respective areas, some of which have already been published [9-12].

This third conference-wide special issue consists of selected papers compiled from the conference's main track as well as its adjoining workshops and special sessions. The chosen papers embrace several contemporary state-of-the-art research issues, such as many-core computing, heterogeneous architectures, cloud computing, self-aware systems, and real-time requirements, along with an array of applications with HPC provisions.

Authors of 31 papers from the conference proceedings were invited to submit an extended and updated version of their original paper to this special issue. The selection was made by the conference's international program committee and the organizers of the workshops and special sessions. In the initial round, we received 16 affirmative responses/proposals, which were reviewed for approval. An international reviewing committee of about 50 experts in various HPCS subjects was formed, with members from 15 different countries. Each paper was assigned to at least five reviewers.
In the first round, 14 manuscripts were submitted for review; 11 passed with minor or major revisions required, and three were rejected. The 11 manuscripts were revised and resubmitted for the second round of review. After the reviewers' remarks and recommendations were carefully taken into account, 10 papers were accepted or passed with minor revisions, and one was rejected. In round three, the 10 papers were reviewed again, and some minor improvements were requested for most of them. Only two manuscripts required a fourth round of reviews. At the end of this elaborate and thorough review process, we have the 10 manuscripts that you find in this special issue. With the help of our reviewers and a moderate turnaround time in each round, we managed to receive the reviews on time and forward them to the authors. The authors met the strict deadlines we set for each cycle and submitted their revised manuscripts in a timely manner as well.

This special issue of HPCS papers comprises 10 contributions from 31 authors in seven different countries, namely Germany, Italy, Portugal, Romania, Spain, the UK, and the USA, covering research ranging from graph-based approaches, to tuning applications, to state-of-the-art processor architectures, to cloud computing.
The papers in this special issue can be categorized into five main subjects: system- and hardware-oriented papers (three papers), grid/cloud computing papers (two papers), papers dealing with real-time requirements (one paper), papers on autonomous-reflexive systems (two papers), and applications-oriented papers (two papers). We summarize these next.

‘Finding Near-Perfect Parameters for Hardware and Code Optimizations by Automatic Multi-Objective Design Space Explorations’ by Ralf Jahr, Horia Calborean, Lucian Vintan, and Theo Ungerer [13] introduces FADSE, a design space exploration tool that automatically finds nearly optimal processor design configurations for a given code. Because performance is no longer the only optimization objective (others include power consumption, area, etc.), FADSE tries to find an optimum architecture across multiple objective functions. The Grid ALU Processor (GAP) and its post-link optimizer GAPtimize are used as an example to demonstrate the feasibility of the approach.

‘Cache-oblivious Matrix Algorithms in the Age of Multi- and Many-Cores’ by Alexander Heinecke and Carsten Trinitis [14] highlights the issue of increasing vector unit width that goes along with increasing core counts on x86 processor architectures. To demonstrate this, a cache-oblivious numerical code has been ported to and optimized on four contemporary x86 architectures representing vector unit widths from 128 to 512 bits.
The article discusses the obtained performance results and compares them with the vendors' architecture-specific, optimized libraries Math Kernel Library (MKL) and AMD Core Math Library (ACML). Special emphasis is placed on providing insights into architectural properties of state-of-the-art processor and accelerator architectures.

‘New System Software for Parallel Programming Models on the Intel SCC Many-core Processor’ by Carsten Clauss, Stefan Lankes, Pablo Reble, and Thomas Bemmerl [15] gives a detailed report on the authors' experiences with implementing parallel programming libraries and tools for the 48-core Intel Single-chip Cloud Computer (SCC) processor. The SCC is a prototype many-core processor comprising noncoherent memory-coupled cores, a so-called cluster-on-chip architecture. The software developed by the authors comprises an SCC-customized Message Passing Interface (MPI) library called SCC-MPICH for distributed-memory parallel programming and a shared virtual memory system called MetalSVM for thread programming. Both approaches are evaluated on the SCC, and it is shown how they can be optimized for such a novel cluster-on-chip architecture.

‘Cost Optimization of Virtual Infrastructures in Dynamic Multi-Cloud Scenarios’ by Jose Luis Lucas Simarro, Rafael Moreno-Vozmediano, Ruben S. Montero, and Ignacio M. Llorente [16] presents a so-called cloud broker architecture, which is responsible for deploying virtual resources (virtualized servers) across compute clouds. Several use cases are investigated that take migration overhead costs in a dynamic cloud scenario into account, demonstrating that brokering mechanisms in dynamic deployments show clear advantages over static deployments in cloud environments. From the users' point of view, multiple factors such as pricing schemes, types of instance, or value-added features need to be taken into account, which is why cloud brokering comes into play.
‘Interoperating Grid Infrastructures with the GridWay Metascheduler’ by Ismael Marin Carrion, Eduardo Huedo, and Ignacio M. Llorente [17] describes GridWay, a metascheduler developed by the authors for sharing compute resources within common grid middleware. The latest features comprise enhancements with regard to interoperability and interoperation, achieved by introducing a modular architecture design. Two new execution drivers and a new remote interface have been added to GridWay and are described in detail in the paper.

‘Improved Real-Time Scheduling for Periodic Tasks on Multiprocessors’ by Prapaporn Rattanatamrong and José A. B. Fortes [18] presents a novel algorithm for scheduling applications with real-time requirements on supercomputers. The paper provides methods to ensure that all resources can be optimally utilized. This is demonstrated with a human brain-machine interface application: a simulation of a prosthetic limb's movement according to the activity of input signals.

‘Towards Self-Caring IT Systems: A Study of Performance Penalties under Faults’ by Selvi Kadirvel and José A. B. Fortes [19] comes from the area of fault tolerance and utilizes virtualization techniques. Taking MapReduce frameworks as an example, it shows that the performance penalty imposed by fault tolerance mechanisms cannot be neglected. For the open source MapReduce framework Hadoop, this execution time penalty is evaluated using a simulator. Further investigations are carried out in a virtual environment with varying characteristics regarding hardware, application, data set, and types of fault. The resulting parameter studies show that penalties can be significantly reduced through dynamic resource scaling.
‘AOI-Cast in Distributed Virtual Environments: An Approach based on Delay Tolerant Reverse Compass Routing’ by Laura Ricci, Luca Genovali, Emanuele Carlini, and Massimo Coppola [20] presents a novel area of interest (AOI)-cast algorithm for distributed environments such as massively multiplayer online games (MMOGs). By exploiting the mathematical properties of Delaunay triangulations, a spanning tree supporting event notification within the area of interest can be built. This tree is computed by reverse compass routing. The efficiency of the approach is demonstrated through a set of simulations with both artificial and real data from an MMOG.

‘Implementation and Performance Analysis of Efficient Index Structures for DNA Search Algorithms in Parallel Platforms’ by Gustavo Encarnação, Nuno Sebastião, and Nuno Roma [21] is a paper from the area of bioinformatics. In DNA sequence alignment, choosing an appropriate local alignment algorithm is crucial for achieving reasonable performance. The authors present an analysis of three highly optimized implementations of index-based search algorithms, namely suffix trees, suffix arrays, and hash tables of q-mers. For all three, a performance comparison is carried out on CPU-based and graphics processing unit (GPU)-based architectures. On both architectures, suffix trees and suffix arrays are shown to perform significantly better than hash tables of q-mers.

‘Parallel Multigrid on Hierarchical Hybrid Grids: A Performance Study on Current HPC Clusters’ by Björn Gmeiner, Harald Köstler, Markus Stürmer, and Ulrich Rüde [22] investigates the performance of a geometric multigrid solver on up-to-date high-performance computing cluster installations, namely a BlueGene/P cluster run by the Jülich Supercomputing Centre and an Intel Xeon 5650 cluster run by the Erlangen Regional Computing Center (RRZE).
The geometric multigrid solver executes inside a software package called hierarchical hybrid grids (HHG), which is based on unstructured tetrahedral finite elements. The obtained performance is evaluated and compared with that of a standard multigrid solver, so that an estimate can be given as to whether it is worth using numerical packages like HHG.

It is our hope that the collection of manuscripts in this special issue will make a significant contribution to the fields of HPC systems and modeling and simulation and to their future developments.

The guest editors of this special issue on High-performance Computing Systems would like to thank all authors, the special issue reviewing committee, the CPE EIC, Prof. G. C. Fox, and the editorial staff of Wiley for their contributions, efforts, and support. This special issue would not have been possible without their support and guidance.

The special issue Reviewing Committee members are Giovanni Aloisio (Italy), Andres Avila (Chile), Bruno Bachelet (France), Liz Bacon (UK), Mostafa Bamha (France), Francoise Baude (France), Milan Bradonjic (USA), Ivona Brandic (Austria), Mathieu Chapelle (France), Camille Coti (France), Alfredo Cuzzocrea (Italy), Laurent d'Orazio (France), Luciano Antonio Digiampietri (Brazil), Daniel Etiemble (France), Joel Falcou (France), Bernhard Fechner (Germany), Cecile Germain (France), Alain Giulieri (France), David Gregg (Ireland), Mark Hedges (UK), Alexander Heinecke (Germany), Gonzalo Hernandez (Chile), David Hill (France), Neil Chue Hong (UK), Udo Hönig (Germany), Zhihi Huang (New Zealand), Eric Innocenti (France), Hai Jin (China), David Kaeli (USA), Al Kellie (USA), Harald Köstler (Germany), Dieter Kranzlmüller (Germany), Sébastien Limet (France), Frederic Loulergue (France), Olivier Marin (France), Emmanuel Melin (France), Lizandro Muzy (France), Mariusz Nowostawski (New Zealand), Domenico Potena (Italy), T. K. Prasad (USA), Desh Ranjan (India), Mukesh Singhal (USA), Anna Squicciarini (USA), Domenico Talia (Italy), Lorenzo Verdoscia (Italy), Timothy J. Williams (USA), Chao Tung Yang (Taiwan), Vesna Zeljkovic (China), and Ji Zhang (Australia).

Reference(s)