This paper by Ádám Moravánszky, from the upcoming book ShaderX 2 Programming, describes in detail how to implement dense matrix operations on programmable GPUs. Matrix multiplication is applied to solving systems of linear equations and the linear complementarity problem, which can in turn be used to simulate soft-body and rigid-body physics. The performance of the GPU implementation is compared with that of the SSE2-optimized ATLAS library running on the CPU. DirectX 9 pixel and vertex shader programs are provided. (Dense Matrix Algebra on the GPU. Ádám Moravánszky. To appear in ShaderX 2 Programming, Wordware, 2003.)