Artigo Revisado por pares

Minimizing synchronization in IDR ( s )

2011; Wiley; Volume: 18; Issue: 5 Linguagem: Inglês

10.1002/nla.764

ISSN

1099-1506

Autores

T.P. Collignon, Martin B. van Gijzen,

Tópico(s)

Electromagnetic Scattering and Analysis

Resumo

Numerical Linear Algebra with ApplicationsVolume 18, Issue 5 p. 805-825 Research Article Minimizing synchronization in IDR (s) Tijmen P. Collignon, Tijmen P. Collignon Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The NetherlandsSearch for more papers by this authorMartin B. van Gijzen, Corresponding Author Martin B. van Gijzen [email protected] Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The Netherlands Associate Professor.Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The Netherlands===Search for more papers by this author Tijmen P. Collignon, Tijmen P. Collignon Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The NetherlandsSearch for more papers by this authorMartin B. van Gijzen, Corresponding Author Martin B. van Gijzen [email protected] Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The Netherlands Associate Professor.Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The Netherlands===Search for more papers by this author First published: 14 January 2011 https://doi.org/10.1002/nla.764Citations: 8 Read the full textAboutPDF ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onEmailFacebookTwitterLinkedInRedditWechat Abstract IDR (s) is a family of fast algorithms for iteratively solving large nonsymmetric linear systems. With cluster computing and in particular with Grid computing, the inner product is a bottleneck operation. In this paper, three techniques are investigated for alleviating this bottleneck. First, a recently proposed IDR (s) algorithm that is highly efficient and stable is reformulated in such a way that it has a single global synchronization point per iteration step. Second, the so-called test matrix is chosen so that the work, communication, and storage involving this matrix is minimized in multi-cluster environments. Finally, a methodology is presented for a-priori estimation of the optimal value of s using only problem and machine-based parameters. Numerical experiments applied to a 3D convection–diffusion problem are performed on the DAS-3 Grid computer, demonstrating the effectiveness of our approach. Copyright © 2011 John Wiley & Sons, Ltd. REFERENCES 1 Sonneveld P, van Gijzen MB. IDR(s): a family of simple and fast algorithms for solving large nonsymmetric linear systems. SIAM Journal on Scientific Computing 2008; 31(2): 1035–1062. 2 Simoncini V, Szyld DB. Interpreting IDR as a Petrov–Galerkin method. SIAM Journal on Scientific Computing 2010; 32(4): 1898–1912. DOI: 10.1137/090774756. 3 Gutknecht MH. IDR Explained. Electronic Transactions on Numerical Analysis 2010; 36: 126–148. 4 Sleijpen GLG, Sonneveld P, van Gijzen MB. Bi-CGSTAB as an induced dimension reduction method. Applied Numerical Mathematics 2009; In Press, Corrected Proof, DOI: 10.1016/j.apnum.2009.07.001. 5 Sleijpen GLG, van Gijzen MB. Exploiting BiCGstab(ℓ) strategies to induce dimension reduction. SIAM Journal on Scientific Computing 2010; 32(5): 2687–2709. DOI: 10.1137/090752341. 6 Onoue Y, Fujino S, Nakashima N. Improved IDR(s) method for gaining very accurate solutions. World Academy of Science, Engineering and Technology 2009; 55: 520–525. 7 Onoue Y, Fujino S, Nakashima N. An overview of a family of new iterative methods based on IDR theorem and its estimation. Proceedings of the International MultiConference of Engineers and Computer Scientists, IMECS 2009, 18–20 March 2009, Hong Kong, vol. II, 2009; 2129–2134. 8 Sonneveld P. On the convergence behaviour of IDR(s). Technical Report, Delft University of Technology, Delft, The Netherlands, 2010, DUT report 10-08. 9 Sonneveld P. On the statistical properties of solutions of completely random linear systems. Technical Report, Delft University of Technology, Delft, The Netherlands, 2010. DUT report 10-09. 10 Gutknecht MH, Zemke JPM. Eigenvalue computations based on IDR. Technical Report, Seminar für Angewandte Mathematik, ETH Zürich, SAM, ETH Zürich, Switzerland 2010. Research Report No. 2010–13. 11 van der Vorst HA. Bi–CGSTAB: a fast and smoothly converging variant of Bi–CG for the solution of nonsymmetric linear systems. SIAM Journal on Scientific and Statistical Computing 1992; 13(2): 631–644. 12 van Gijzen MB, Sonneveld P. An elegant IDR(s) variant that efficiently exploits bi-orthogonality properties. Technical Report, Delft University of Technology, Delft, The Netherlands 2010. DUT report 10–16 (revised version of DUT report 08–21). 13 Seinstra FJ, Verstoep K. DAS-3: the distributed ASCI supercomputer 3 2007. Available from:http://www.cs.vu.nl/das3/. 14 Chronopoulos AT, Gear CW. S-step iterative methods for symmetric linear systems. Journal of Computational and Applied Mathematics 1989; 25(2): 153–168. DOI: http://dx.doi.org/10.1016/0377-0427(89)90045-9. 15 de Sturler E. A performance model for Krylov subspace methods on mesh-based parallel computers. Parallel Computing 1996; 22(1): 57–74. DOI: http://dx.doi.org/10.1016/0167-8191(95)00057-7. 16 Gu TX, Zuo XY, Liu XP, Li PL. An improved parallel hybrid bi–conjugate gradient method suitable for distributed parallel computing. Journal of Computational and Applied Mathematics 2009; 226(1): 55–65. DOI: http://dx.doi.org/10.1016/j.cam.2008.05.017. 17 Gu TX, Zuo XY, Zhang LT, Zhang WQ, Sheng Z. An improved bi-conjugate residual algorithm suitable for distributed parallel computing. Applied Mathematics and Computation 2007; 186(2): 1243–1253. 18 Yang TR. The improved CGS method for large and sparse linear systems on bulk synchronous parallel architecture. ICA3PP '02: Proceedings of the Fifth International Conference on Algorithms and Architectures for Parallel Processing. IEEE Computer Society: Washington, DC, U.S.A., 2002; 232–237. 19 Yang L, Brent R. The improved BiCGStab method for large and sparse unsymmetric linear systems on parallel distributed memory architectures. Fifth International Conference on Algorithms and Architectures for Parallel Processing. IEEE Computer Society: Los Alamitos, CA, U.S.A., 2002; 324–328. DOI: http://doi.ieeecomputersociety.org/10.1109/ICAPP.2002.1173595. 20 Yang T, Lin HX. The improved quasi-minimal residual method on massively distributed memory computers. HPCN Europe '97: Proceedings of the International Conference and Exhibition on High-Performance Computing and Networking. Springer: London, U.K., 1997; 389–399. 21 Krasnopolsky B. The reordered BiCGStab method for distributed memory computer systems. Procedia Computer Science 2010; 1(1): 213–218. DOI: 10.1016/j.procs.2010.04.024.ICCS 2010. 22 Valiant LG. A bridging model for multi-core computing. ESA '08: Proceedings of the 16th annual European Symposium on Algorithms. Springer: Berlin, Heidelberg, 2008; 13–28. DOI: http://dx.doi.org/10.1007/978-3-540-87744-8_2. 23 Bal H. DAS-3 Opening Symposium 2007. Available from: http://www.cs.vu.nl/das3/symposium07/das3-bal.pdf. Retrieved 05/02/2009. 24 Petitet A, Whaley RC, Dongarra J, Cleary A. HPL—a portable implementation of the high–performance Linpack benchmark for distributed-memory computers 2008. Available at http://www.netlib.org/benchmark/hpl/. 25 Verstoep K, Maassen J, Bal HE, Romein JW. Experiences with fine-grained distributed supercomputing on a 10G testbed. CCGRID '08: Proceedings of the 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID). IEEE Computer Society: Washington, DC, U.S.A., 2008; 376–383. DOI: http://dx.doi.org/10.1109/CCGRID.2008.71. 26 Sleijpen G, Fokkema D. BiCGstab(ℓ) for linear equations involving unsymmetric matrices with complex spectrum. Electronic Transactions on Numerical Analysis 1993; 1: 11–32. Citing Literature Volume18, Issue5October 2011Pages 805-825 ReferencesRelatedInformation

Referência(s)