This paper presents a parallel LU factorization algorithm designed to take advantage of physical broadcast communication facilities and of the overlapping of communication with computation. Physical broadcast is directly available in Ethernet network hardware, one of the most widely used interconnection networks in current clusters installed for parallel computing. Overlapped communication is a well-known strategy for hiding communication latency, which is one of the most common sources of parallel performance penalties. A performance analysis of the proposed parallel LU factorization algorithm is presented, together with experimental results. The performance of the proposed algorithm is also compared with that of the algorithm used in ScaLAPACK (Scalable LAPACK), which is commonly accepted as having optimized performance.
Notes
Lecture Notes in Computer Science book series (LNTCS, vol. 3648)
General information
Presentation date: 2005
Publication date: 2005
Document language: Spanish
Event: 11th International Euro-Par Conference (Lisbon, Portugal, August 30-September 2, 2005)
Institution of origin: Instituto de Investigación en Informática
Except where explicitly stated otherwise, this item is published under the following license: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)