I'm trying to create a sparse vector from a series of arrays where there are some overlapping indexes. For a matrix there's a very convenient object in scipy that does exactly this:
coo_matrix((data, (i, j)), [shape=(M, N)])
So if data happens to have repeated elements (because their i,j indexes are the same), those are summed up in the final sparse matrix. I was wondering if it would be possible to do something similar for sparse vectors, or do I just have to use this object and pretend it's a 1-column matrix?
While you might be able to reproduce a 1d equivalent, it would save a lot of work to just work with a 1 row (or 1 col) sparse matrix. I am not aware of any sparse vector package for numpy.
The coo format stores the input arrays exactly as you give them, without the summing. The summing is done when it is displayed as a dense array or (otherwise) converted to a csc or csr format. And since the csr constructor is compiled, it will do that summation faster than anything you could code in Python.
Construct a '1d' sparse coo matrix
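A minimal sketch of that construction. The values and column indexes here are illustrative assumptions (not taken from the question); column 5 is deliberately repeated so there is something to sum:

```python
import numpy as np
from scipy import sparse

# Illustrative inputs (assumed): column index 5 appears twice
data = np.array([1, 2, 3, 4])
cols = np.array([1, 5, 5, 8])
rows = np.zeros_like(cols)  # every entry goes in row 0

# A 1-row sparse matrix standing in for a length-10 sparse vector
M = sparse.coo_matrix((data, (rows, cols)), shape=(1, 10))
print(M.shape)
```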
Look at its data representation (no summation)
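Something along these lines should show it, again with made-up values. The data, row and col attributes of the coo matrix hold the inputs as they were passed in, duplicate index included:

```python
import numpy as np
from scipy import sparse

# Same illustrative inputs as before (assumed values)
data = np.array([1, 2, 3, 4])
cols = np.array([1, 5, 5, 8])
M = sparse.coo_matrix((data, (np.zeros_like(cols), cols)), shape=(1, 10))

# The stored arrays are unchanged: four entries, column 5 listed twice
print(M.data)
print(M.row)
print(M.col)
```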
Look at the array representation (shape (1,10))
and the csr equivalent.
nonzero shows the same pattern.
M.toarray().flatten() will give you the (10,) 1d array.