Parallel SpGEMM
From Cs240aproject
This is the Spring 2007 project of Aydin Buluc and Fenglin Liao.
Project Description
The project is to write a 2D parallel implementation of the sparse matrix times sparse matrix routine. It is projected to be implemented in C++ using MPI. The project will have multiple iterations:
1) The simplest decomposition is the regular block checkerboard partitioning [1]. Here we assume assumes the existence of a fast sequential algorithm for sparse matrix multiplication that is used as a block box.
2) The second way to decompose to data is to try assigning equal number of edges to each processor. The initial idea is to map each non-zero of the matrix to a point in the rectangular (m x n) region and use a quadtree[2] data structure. Here one may or may not use a sequential matrix multiplication as block box [Yet to be explored].
3) The third and the hardest way is to assign edges to processors such that the following metric is minimized.
Failed to parse (Can't write to or create math temp directory): min(max_i(communication(i) + computation(i)))
In other words, we try to load balance across the processors. The load balancing is generally done using graph partitioning but it is unclear how to do it for Parallel SpGEMM. Whether this load balancing is going to be done statically or dynamically is yet to be determined.