UMINF 10.12

Efficient Reduction from Block Hessenberg Form to Hessenberg Form using Shared Memory

A new cache-efficient algorithm for reduction from block Hessenberg form to Hessenberg form is presented and evaluated. The algorithm targets parallel computers with shared memory. One level of look-ahead in combination with a dynamic load-balancing scheme significantly reduces the idle time and allows the use of coarse-grained tasks. The coarse tasks lead to high-performance computations on each processor/core. Speedups close to 13 over the sequential unblocked algorithm have been observed on a dual quad-core machine using one thread per core.
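To make the two matrix forms in the abstract concrete, the following minimal sketch checks them by lower bandwidth. It assumes the common convention that a matrix in block Hessenberg form with b nonzero subdiagonals has a[i][j] = 0 whenever i > j + b, so that Hessenberg form is the case b = 1. The function names and the parameter b are hypothetical, chosen for illustration only; this is not the report's algorithm, which performs the reduction itself.

```python
def lower_bandwidth(a):
    """Return the largest i - j over nonzero entries below the diagonal (0 if none)."""
    n = len(a)
    band = 0
    for i in range(n):
        for j in range(i):  # only entries strictly below the diagonal
            if a[i][j] != 0:
                band = max(band, i - j)
    return band


def is_block_hessenberg(a, b):
    """Assumed convention: at most b nonzero subdiagonals."""
    return lower_bandwidth(a) <= b


def is_hessenberg(a):
    """Hessenberg form is the special case of a single nonzero subdiagonal."""
    return is_block_hessenberg(a, 1)


# A 4x4 matrix with two nonzero subdiagonals: block Hessenberg for b = 2,
# but not Hessenberg, because a[2][0] and a[3][1] are nonzero.
example = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 1, 2, 3],
    [0, 4, 5, 6],
]
```

Under this convention, a reduction of the kind the abstract describes would take a matrix such as `example` and annihilate every entry below the first subdiagonal.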



Entry responsible: Lars Karlsson

Page Responsible: Frank Drewes