We develop a task-based parallel algorithm for reordering eigenvalues of matrices in real Schur form. We describe how we implemented the algorithm using StarPU runtime system and report on experiments performed on a shared memory machine. Compared with ScaLAPACK we achieve average speedup of 3. We have strong and weak scaling efficiencies which are well above 50%. We are able to achieve more than 50% of the peak flop rate for all but the smallest matrices. The idle time and the overhead is negligible except for the smallest matrices. The next step is to reconfigure and further develop the code so that it can be applied to matrix pairs in generalized Schur forms and run efficiently on distributed memory machines.
Page Responsible: Frank Drewes 2024-11-21