Compressed Sparse Row
Yeshwanth Nagaraj
Democratizing Math and Core AI // Levelling playfield for the future
The Compressed Sparse Row (CSR) format is a common way to store sparse matrices in a space-efficient manner. This format is particularly useful for storing matrices where most of the elements are zero, such as adjacency matrices for graphs with few edges.
An adjacency matrix for a graph G(V,E) with V vertices and E edges is a V×V matrix A where A[i][j]=1 if there is an edge from vertex i to vertex j and A[i][j]=0 otherwise. If the graph is undirected, the matrix is symmetric.
Compressed Sparse Row (CSR) Format
The CSR format uses three one-dimensional arrays to represent a sparse matrix:
For example, let's consider an adjacency matrix for a graph with 4 vertices:
CSR Format
The Row Pointer Array starts with 0 (indicating the start position in the Value array for row 0). The ith element of the Row Pointer Array points to the index in the Value Array where the ith row starts. The last element in the Row Pointer Array is the total number of non-zero elements in the matrix.
These arrays provide a more space-efficient way to store and manipulate sparse matrices, making operations like matrix-vector multiplication more efficient compared to using a dense representation.
Here's a simple Python code snippet that converts a given dense matrix (2D list) into its Compressed Sparse Row (CSR) format. The CSR format consists of three arrays:
def dense_to_csr(matrix):
values = []
col_indices = []
row_ptr = [0]
for row in matrix:
count = 0
for col, val in enumerate(row):
if val != 0:
values.append(val)
col_indices.append(col)
count += 1
row_ptr.append(row_ptr[-1] + count)
return values, col_indices, row_ptr
# Example usage
dense_matrix = [
[0, 1, 0, 0],
[1, 0, 1, 1],
[0, 1, 0, 1],
[0, 1, 1, 0]
]
values, col_indices, row_ptr = dense_to_csr(dense_matrix)
print("Values:", values)
print("Column Indices:", col_indices)
print("Row Pointer:", row_ptr)