Date of Award

7-2004

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Engineering and Sciences

First Advisor

Charles Fulton

Second Advisor

R. G. Deshmukh

Third Advisor

Eraldo Ribeiro

Fourth Advisor

William Shoaff

Abstract

Sparse-matrix/dense-vector multiplication algorithms are not as highly developed as algorithms for dense matrices. Dense matrix multiplication algorithms have been made efficient by exploiting data locality, parallelism, pipelining, and other types of optimization. Sparse matrix algorithms, on the other hand, encounter low or no data locality, indirect addressing, and no easy way to exploit parallelism. In an effort to achieve savings in storage and computational time, the topic of sparse matrix representation is often revisited. The first contribution of this thesis is a new representation for sparse matrices, called here the Reduced Index Sparse (RIS) representation because an ordered pair (j, v) with a single index j is used for every nonzero matrix element. This considerably reduces the required disk storage compared with other representations such as the coordinate (COO) format, which uses an ordered triple (a_ij, i, j) with two indices, i and j, for every nonzero matrix element. The second contribution is a modified block cyclic data distribution for sparse matrices with arbitrary nonzero structure. The third contribution is the implementation of this distribution using RIS to perform a parallel sparse-matrix/dense-vector multiplication on a distributed-memory computer using MPI. The implementation achieved good load balancing for sparse matrices with different sparsity patterns. Software was written to generate large random sparse matrices with well-preserved sparsity patterns; the desired number of nonzero elements, the matrix density, and the sparsity pattern are all controlled by user input. Performance of the new RIS storage scheme was measured against SPARSKIT routines. Efficiency and scalability of the parallel sparse-matrix/dense-vector multiplication were generally good, and timing analysis shows that the new RIS scheme is competitive.
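
The storage contrast described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how RIS encodes row information, so the single-index layout below is an assumption: it groups (j, v) pairs by row and records row boundaries in a separate array, which is essentially the standard compressed sparse row (CSR) idea, not the thesis's actual RIS format. All function and variable names are hypothetical.

```python
from typing import List, Tuple

# COO stores one (value, row, col) triple per nonzero: two indices each.
CooTriple = Tuple[float, int, int]
# The single-index layout stores one (col, value) pair per nonzero.
ColValPair = Tuple[int, float]


def coo_matvec(n_rows: int, triples: List[CooTriple], x: List[float]) -> List[float]:
    """Compute y = A x with A given as COO triples (v, i, j)."""
    y = [0.0] * n_rows
    for v, i, j in triples:
        y[i] += v * x[j]  # indirect addressing into both y and x
    return y


def coo_to_row_grouped(
    n_rows: int, triples: List[CooTriple]
) -> Tuple[List[int], List[ColValPair]]:
    """Convert COO triples to (j, v) pairs grouped by row.

    ASSUMPTION: row boundaries are kept in row_start (length n_rows + 1),
    as in CSR. This is an illustrative stand-in, not the RIS layout.
    """
    ordered = sorted(triples, key=lambda t: (t[1], t[2]))
    row_start = [0] * (n_rows + 1)
    for _, i, _ in ordered:
        row_start[i + 1] += 1  # count the nonzeros in each row
    for i in range(n_rows):
        row_start[i + 1] += row_start[i]  # prefix sum gives row offsets
    pairs = [(j, v) for v, _, j in ordered]
    return row_start, pairs


def row_grouped_matvec(
    row_start: List[int], pairs: List[ColValPair], x: List[float]
) -> List[float]:
    """Compute y = A x using only one stored index per nonzero."""
    n_rows = len(row_start) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_start[i], row_start[i + 1]):
            j, v = pairs[k]
            y[i] += v * x[j]
    return y


if __name__ == "__main__":
    # 3 x 3 example matrix: [[4, 0, 1], [0, 0, 2], [3, 5, 0]]
    triples = [(4.0, 0, 0), (1.0, 0, 2), (2.0, 1, 2), (3.0, 2, 0), (5.0, 2, 1)]
    x = [1.0, 2.0, 3.0]
    row_start, pairs = coo_to_row_grouped(3, triples)
    print(coo_matvec(3, triples, x))
    print(row_grouped_matvec(row_start, pairs, x))
```

In this sketch COO stores 2 × nnz indices, while the row-grouped layout stores nnz column indices plus n_rows + 1 row offsets, which is where the storage saving of a single-index scheme comes from when the number of nonzeros is much larger than the number of rows.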

Comments

Copyright held by author
