Triangular linear systems are fundamental in numerical linear algebra. A triangular linear system has a straight-forward and efficient solution strategy, namely forward substitution for lower triangular systems and backward substitution for upper triangular systems. Triangular systems, or, more generally, systems of triangular type occur frequently in algorithms for more complex problems. This thesis addresses three systems that involve linear systems of triangular type.
The first system concerns quasi-triangular matrices. Quasi-triangular matrices are block triangular with 1-by-1 and 2-by-2 blocks on the diagonal. Quasi-triangular systems arise in the computation of eigenvectors from the real Schur form for the non-symmetric eigenvalue problem. This thesis contributes two algorithms for the eigenvector computation, which solve shifted quasi-triangular linear systems in an efficient and scalable way.
The second system addresses scaled triangular linear systems. During the solution of a triangular linear system, the entries of the solution can grow. This growth can exceed the representable range of floating-point numbers. Such an overflow can be avoided by solving a scaled triangular system. The solution is scaled prior to every operation that would otherwise result in an overflow. After scaling, the operations can be executed safely. This thesis analyzes the scalability of a recently developed tiled, robust solver for scaled triangular systems, which ensures that at no point in the computation the overflow threshold is exceeded.
The third system tackles the scaled continuous-time triangular Sylvester equation, which couples two quasi-triangular matrices. The solution process is prone to overflow. This thesis contributes a robust, tiled solver and demonstrates its practicability.
These three systems can be addressed with a variation of forward or backward substitution. Compared to the highly optimized and scalable implementations of standard forward and backward substitution available in HPC libraries, the existing implementations of these three systems run at a smaller fraction of the peak performance. This thesis presents techniques to improve on the performance and robustness of the implementations of the three systems.