scipy.sparse: Excerpts, Error Corrections, and Summary

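1. COO_Matrix

coo_matrix accepts repeated (row, column) index pairs. When the matrix is converted to a dense array or to CSR/CSC format, the data values that share the same row and column indices are summed.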

>>> import numpy as np
>>> from scipy.sparse import coo_matrix
>>> row  = np.array([0, 0, 1, 3, 1, 0, 0])
>>> col  = np.array([0, 2, 1, 3, 1, 0, 0])
>>> data = np.array([1, 1, 1, 1, 1, 1, 1])
>>> coo_matrix((data, (row, col)), shape=(4, 4)).toarray()
array([[3, 0, 1, 0],
       [0, 2, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 1]])
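The summation happens at conversion time, not at construction. As a minimal sketch reusing the arrays above (the variable names `stored_before`, `stored_after`, and `merged` are only illustrative), the stored entry count can be compared before and after the duplicates are merged:

```python
import numpy as np
from scipy.sparse import coo_matrix

row = np.array([0, 0, 1, 3, 1, 0, 0])
col = np.array([0, 2, 1, 3, 1, 0, 0])
data = np.array([1, 1, 1, 1, 1, 1, 1])

coo = coo_matrix((data, (row, col)), shape=(4, 4))
stored_before = coo.nnz      # duplicates are still stored as separate entries

csr = coo.tocsr()            # converting to CSR sums the duplicate entries
stored_after = csr.nnz

coo.sum_duplicates()         # merges duplicates in place on the COO matrix
merged = coo.nnz

print(stored_before, stored_after, merged)
```

If the duplicates should be merged without leaving COO format, `sum_duplicates()` is the call to use; otherwise any conversion such as `tocsr()` or `toarray()` merges them implicitly.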

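2. CSC_Matrix and CSR_Matrix

csr_matrix compresses the matrix by rows, and csc_matrix compresses it by columns. A CSR matrix is defined by three arrays: row_offsets, column_indices, and data (named indptr, indices, and data in scipy). column_indices and data have the same meaning as the column indices and values in COO format, with the entries ordered by row, and row_offsets records the position at which each row starts in those two arrays.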

>>> from scipy.sparse import csr_matrix
>>> indptr = np.array([0, 2, 3, 6])
>>> indices = np.array([0, 2, 2, 0, 1, 2])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> csr_matrix((data, indices, indptr), shape=(3, 3)).toarray()
array([[1, 0, 2],
       [0, 0, 3],
       [4, 5, 6]])

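In csr_matrix, indices holds the column index of each stored value, and data holds the value itself. indptr delimits the rows: for each i in range(len(indptr) - 1), data[indptr[i]:indptr[i+1]] belongs to row i. Here data[0:2] (the first two values) belongs to row 0, data[2:3] (the third value) belongs to row 1, and data[3:6] (the last three values) belongs to row 2. csc_matrix works the same way with rows and columns swapped: indices holds row indices, and indptr delimits the columns.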
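The indptr slicing rule can be checked by hand. The following is a rough sketch using the same three arrays as the example above; `rows`, `csc_same`, and `csc` are illustrative names, not part of scipy. It walks the rows of the CSR matrix, then shows that feeding the same arrays to csc_matrix produces the transpose, since the slices are then read as columns.

```python
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csr = csr_matrix((data, indices, indptr), shape=(3, 3))

# Recover the (column, value) pairs of each row from the indptr slices.
rows = []
for i in range(len(indptr) - 1):
    start, end = indptr[i], indptr[i + 1]
    rows.append(list(zip(indices[start:end].tolist(), data[start:end].tolist())))

# The same arrays interpreted as CSC: each slice is now a column,
# and indices holds row indices, so the result is the transpose.
csc_same = csc_matrix((data, indices, indptr), shape=(3, 3))

# Converting CSR to CSC keeps the matrix and rebuilds the three arrays.
csc = csr.tocsc()

print(rows)
print(csc_same.toarray())
print(csc.indptr, csc.indices, csc.data)
```

Converting with `tocsc()` (or `tocsr()` in the other direction) is the safer route when the matrix itself must stay the same; reusing the raw arrays under the other format only makes sense when a transpose is intended.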