1. sparse模块的官方document地址:http://docs.scipy.org/doc/scipy/reference/sparse.html
2. sparse matrix的存储形式有很多种,见此帖子http://blog.csdn.net/anshan1984/article/details/8580952
不同的存储形式在sparse模块中对应如下:
bsr_matrix(arg1[, shape, dtype, copy, blocksize]) Block Sparse Row matrix
coo_matrix(arg1[, shape, dtype, copy]) A sparse matrix in COOrdinate format.
csc_matrix(arg1[, shape, dtype, copy]) Compressed Sparse Column matrix
csr_matrix(arg1[, shape, dtype, copy]) Compressed Sparse Row matrix
dia_matrix(arg1[, shape, dtype, copy]) Sparse matrix with DIAgonal storage
dok_matrix(arg1[, shape, dtype, copy]) Dictionary Of Keys based sparse matrix.
lil_matrix(arg1[, shape, dtype, copy]) Row-based linked list sparse matrix
3. 要将普通的非稀疏矩阵变为相应存储形式的稀疏矩阵只要如下:(以coo_matrix为例)
A = coo_matrix([[1,2],[3,4]])
或者按照相应存储形式的要求,喂给参数,构建矩阵,以coo为例:
>>> row = np.array([0,0,1,3,1,0,0])
>>> col = np.array([0,2,1,3,1,0,0])
>>> data = np.array([1,1,1,1,1,1,1])
>>> coo_matrix((data, (row,col)), shape=(4,4)).todense()
matrix([[3, 0, 1, 0],
[0, 2, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 1]])
4. hstack和vstack函数可以将稀疏矩阵横向或者纵向合并,比如:
>>> from scipy.sparse import coo_matrix, vstack
>>> A = coo_matrix([[1,2],[3,4]])
>>> B = coo_matrix([[5,6]])
>>> vstack( [A,B] ).todense()
matrix([[1, 2],
[3, 4],
[5, 6]])
但是经过测试,如果A和B的数据形式不一样,不能合并。比如A存储的是字符串,B是数字,那么不能合并。也就是说一个矩阵中的数据格式必须是相同的。