Compressed Sparse Row
Personalized latent space expansion for compressed sparse row

Compressed Sparse Row

The Compressed Sparse Row (CSR) format is a common way to store sparse matrices in a space-efficient manner. This format is particularly useful for storing matrices where most of the elements are zero, such as adjacency matrices for graphs with few edges.

An adjacency matrix for a graph G(V,E) with V vertices and E edges is a V×V matrix A where A[i][j]=1 if there is an edge from vertex i to vertex j and A[i][j]=0 otherwise. If the graph is undirected, the matrix is symmetric.

Compressed Sparse Row (CSR) Format

The CSR format uses three one-dimensional arrays to represent a sparse matrix:

  1. Value Array: Stores all non-zero values in the matrix.
  2. Column Index Array: Stores the column indices corresponding to the values in the value array.
  3. Row Pointer Array: Stores the starting index for each row in the value array.

For example, let's consider an adjacency matrix for a graph with 4 vertices:


CSR Format

  1. Value Array: [1, 1, 1, 1, 1, 1, 1, 1] (Stores all non-zero elements in a single array)
  2. Column Index Array: [1, 0, 2, 3, 1, 3, 1, 2] (The column indices for each non-zero element)
  3. Row Pointer Array: [0, 1, 4, 6, 8] (The index in the value array where each row starts; the last element should be the number of non-zero elements)

The Row Pointer Array starts with 0 (indicating the start position in the Value array for row 0). The ith element of the Row Pointer Array points to the index in the Value Array where the ith row starts. The last element in the Row Pointer Array is the total number of non-zero elements in the matrix.

These arrays provide a more space-efficient way to store and manipulate sparse matrices, making operations like matrix-vector multiplication more efficient compared to using a dense representation.

Here's a simple Python code snippet that converts a given dense matrix (2D list) into its Compressed Sparse Row (CSR) format. The CSR format consists of three arrays:

  • values: Contains all the non-zero values.
  • col_indices: Contains the column indices for each non-zero value.
  • row_ptr: Contains the starting index for each row in the values array.

def dense_to_csr(matrix):
    values = []
    col_indices = []
    row_ptr = [0]

    for row in matrix:
        count = 0
        for col, val in enumerate(row):
            if val != 0:
                values.append(val)
                col_indices.append(col)
                count += 1
        row_ptr.append(row_ptr[-1] + count)

    return values, col_indices, row_ptr

# Example usage
dense_matrix = [
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0]
]

values, col_indices, row_ptr = dense_to_csr(dense_matrix)
print("Values:", values)
print("Column Indices:", col_indices)
print("Row Pointer:", row_ptr)
        

要查看或添加评论,请登录

Yeshwanth Nagaraj的更多文章

社区洞察

其他会员也浏览了