UMINF 09.06

A novel parallel QR algorithm for hybrid distributed memory HPC systems

A novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on hybrid distributed high performance computing (HPC) systems is presented. For this purpose, we introduce the concept of multi-window bulge chain chasing and parallelize aggressive early deflation. The multi-window approach ensures that most computations when chasing chains of bulges are performed in level 3 BLAS operations, while the aim of aggressive early deflation is to speed up the convergence of the QR algorithm. Mixed MPI-OpenMP coding techniques are utilized for porting the codes to distributed memory platforms with multithreaded nodes, such as multicore processors. Numerous numerical experiments confirm the superior performance of our parallel QR algorithm in comparison with the existing ScaLAPACK code, leading to an implementation that is one to two orders of magnitude faster for sufficiently large problems, including a number of examples from applications.
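For readers unfamiliar with the terms used above, the following serial sketch illustrates the basic ingredients that the parallel algorithm builds on: a shifted QR step on an upper Hessenberg matrix and classical deflation of a converged eigenvalue. It is not the method of the report. It uses a single real shift instead of multishift bulge chains, tests only the last subdiagonal entry instead of applying aggressive early deflation, and assumes that the input is already in Hessenberg form with real eigenvalues. The function names `qr_step_hessenberg` and `hessenberg_eigenvalues` are hypothetical and chosen for illustration only.

```python
import math


def qr_step_hessenberg(H, n, mu):
    """One shifted QR step on the leading n-by-n block of H, in place.

    Factors H - mu*I = Q R with Givens rotations, then forms R Q + mu*I.
    """
    for i in range(n):
        H[i][i] -= mu
    rotations = []
    # Left rotations: rotation k zeros the subdiagonal entry H[k+1][k].
    for k in range(n - 1):
        a, b = H[k][k], H[k + 1][k]
        r = math.hypot(a, b)
        c, s = (1.0, 0.0) if r == 0.0 else (a / r, b / r)
        rotations.append((c, s))
        for j in range(k, n):
            x, y = H[k][j], H[k + 1][j]
            H[k][j] = c * x + s * y
            H[k + 1][j] = -s * x + c * y
    # Right rotations: multiply R by the transposed rotations to get R Q.
    for k, (c, s) in enumerate(rotations):
        for i in range(k + 2):
            x, y = H[i][k], H[i][k + 1]
            H[i][k] = c * x + s * y
            H[i][k + 1] = -s * x + c * y
    for i in range(n):
        H[i][i] += mu


def hessenberg_eigenvalues(H, tol=1e-12, max_steps=500):
    """Eigenvalues of a real upper Hessenberg matrix (real spectrum assumed)."""
    H = [[float(v) for v in row] for row in H]
    n = len(H)
    eigenvalues = []
    steps = 0
    while n > 1:
        # Classical deflation: the last subdiagonal entry is negligible.
        sub = abs(H[n - 1][n - 2])
        if sub <= tol * (abs(H[n - 1][n - 1]) + abs(H[n - 2][n - 2])):
            eigenvalues.append(H[n - 1][n - 1])
            n -= 1
            continue
        if steps == max_steps:
            raise RuntimeError("QR iteration did not converge")
        # Rayleigh shift: the trailing diagonal entry of the active block.
        qr_step_hessenberg(H, n, H[n - 1][n - 1])
        steps += 1
    eigenvalues.append(H[0][0])
    return sorted(eigenvalues)


if __name__ == "__main__":
    demo = [[4, 1, 0], [2, 3, 1], [0, 0.1, 1]]  # illustrative test matrix
    print(hessenberg_eigenvalues(demo))
```

Each step here touches one eigenvalue at a time using only rotations on pairs of rows and columns, which amounts to level 1 BLAS-style work. The multi-window bulge chain chasing described in the abstract is aimed precisely at replacing such fine-grained updates with level 3 BLAS operations.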

Keywords

Eigenvalue problem, nonsymmetric QR algorithm, multishift, bulge chasing, parallel computations, level 3 performance, aggressive early deflation, parallel algorithms, hybrid distributed memory systems

Authors

Robert Granat, Bo Kågström and Daniel Kressner

