R: initialise an empty dgCMatrix given by matrix multiplication

Posted 2019-06-09 04:30

Question:

I have a for loop like this, trying to implement the solution here, with dummy variables such that

aaa <- DFM %*% t(DFM)   #DFM is Quanteda dfm-sparse-matrix
for(i in 1:nrow(aaa)) aaa[i,] <- aaa[i,][order(aaa[i,], decreasing = TRUE)]

but now

for(i in 1:nrow(mmm)) mmm[i,] <- aaa[i,][order(aaa[i,], decreasing = TRUE)]

where mmm does not exist yet. The goal is to do the same thing as mmm <- t(apply(aaa, 1, sort, decreasing = TRUE)). Before the for loop I need to initialise mmm, otherwise I get Error: object 'mmm' not found. Both aaa and mmm are of type dgCMatrix, as given by the matrix multiplication of two Quanteda DFM matrices.
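As a small illustration of the equivalence described above, the following sketch uses a made-up dense matrix (the names a, expected and looped and all values are purely illustrative, not taken from the real data) and compares the row-wise loop with the apply() call. It works on a copy that already exists, which is exactly what the sparse case is missing.

```r
# Hypothetical 3 x 3 dense matrix standing in for the real data
a <- matrix(c(1, 5, 3,
              4, 2, 6,
              0, 9, 7), nrow = 3, byrow = TRUE)

# Reference result: sort every row in decreasing order
expected <- t(apply(a, 1, sort, decreasing = TRUE))

# Loop version: the target must exist before rows can be assigned into it
looped <- a
for (i in seq_len(nrow(looped))) {
  looped[i, ] <- a[i, ][order(a[i, ], decreasing = TRUE)]
}

# Compare the two results
all.equal(looped, expected)
```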

Structure

aaaFunc is given by the matrix multiplication DFM %*% t(DFM), where DFM is the Quanteda sparse dfm matrix. Its structure is

> str(aaaFunc)
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:39052309] 0 2 1 0 2 2616 2880 3 4 5 ...
  ..@ p       : int [1:38162] 0 2 3 7 8 10 13 15 16 96 ...
  ..@ Dim     : int [1:2] 38161 38161
  ..@ Dimnames:List of 2
  .. ..$ : chr [1:38161] "90120000" "90120000" "90120000" "86140000" ...
  .. ..$ : chr [1:38161] "90120000" "90120000" "90120000" "86140000" ...
  ..@ x       : num [1:39052309] 1 1 1 1 2 1 1 1 2 1 ...
  ..@ factors : list()
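
For readers unfamiliar with the slots in the str() output: a dgCMatrix stores its non-zero entries column by column, with @i holding 0-based row indices, @p holding the cumulative number of stored entries up to each column, and @x holding the values. A minimal sketch on a made-up 3 x 3 matrix (the object M is illustrative only):

```r
library(Matrix)

# Hypothetical matrix: entries at (1,1)=10, (3,1)=20, (2,3)=30; column 2 is empty
M <- sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 3), x = c(10, 20, 30),
                  dims = c(3, 3))

M@i  # 0-based row indices of the stored entries, column by column
M@p  # column pointers: length ncol + 1, starting at 0
M@x  # stored values, in the same order as @i
```

An all-zero matrix of the same shape therefore has empty @i and @x and an all-zero @p of length ncol + 1.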

Errors on the DFM with the methods mentioned here, in a general question on replicating an R object's structure without its content:

A. error with aaaFunc.mt[]<- NA

> aaaFunc.mt <- aaaFunc[0,]; aaaFunc.mt[] <- NA; aaaFunc.mt[1,]
Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : index larger than maximal 0

B. error with mySparseMatrix.mt[nrow(mySparseMatrix),]<-

> aaaFunc.mt <- aaaFunc[0,]; aaaFunc.mt[nrow(aaaFunc),] <- NA
Error in intI(i, n = di[margin], dn = dn[[margin]], give.dn = FALSE) : 
  index larger than maximal 0
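
Errors A and B appear to share one cause: subsetting with row index 0 yields a matrix with zero rows, so any later reference to row 1 (or to row nrow(aaaFunc)) is out of range. A minimal sketch on a made-up matrix (A and A0 are illustrative names):

```r
library(Matrix)

# Hypothetical 2 x 2 sparse matrix
A <- sparseMatrix(i = c(1, 2), j = c(1, 2), x = c(1, 2), dims = c(2, 2))

# Row index 0 selects no rows: the columns survive, the rows do not
A0 <- A[0, , drop = FALSE]
dim(A0)

# Asking for row 1 of a zero-row matrix fails; capture the condition instead of stopping
res <- tryCatch(A0[1, ], error = function(e) e)
inherits(res, "error")
```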

C. error with replace(...,NA)

Browse[2]> mmmFunc <- replace(aaaFunc,NA);
Error in replace(aaaFunc, NA) : 
  argument "values" is missing, with no default
Browse[2]> mmmFunc <- replace(aaaFunc,,NA);
Error in `[<-`(`*tmp*`, list, value = NA) : 
  argument "list" is missing, with no default
Browse[2]> mmmFunc <- replace(aaaFunc,c(),NA);
Error in .local(x, i, j, ..., value) : 
  not-yet-implemented 'Matrix[<-' method
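
For comparison, base R's replace(x, list, values) evaluates x[list] <- values and returns x, so both an index and a value must be supplied. The sketch below shows the call on an ordinary dense matrix (m and m_na are illustrative names). It is not a recommendation for the sparse case: filling a sparse matrix with NA would force every entry to be stored.

```r
# Hypothetical dense 2 x 3 matrix
m <- matrix(1:6, nrow = 2)

# Replace every element with NA; dimensions are preserved
m_na <- replace(m, seq_along(m), NA)

dim(m_na)
all(is.na(m_na))
```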

How do you initialise an empty dgCMatrix like the one given by the matrix multiplication of two Quanteda DFM matrices?

Answer 1:

The following will either initialize an empty sparse matrix or reset an existing sparse matrix, while preserving both the dimensions and the dimnames:

library(Matrix)

i <- c(1,3:8)
j <- c(2,9,6:10)
x <- 7 * (1:7)
A <- sparseMatrix(i, j, x = x)
rownames(A) <- letters[seq_len(nrow(A))]

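# Option 1: build a new all-zero sparse matrix with the same dimensions and dimnames.
# No x is supplied, so the result is a pattern matrix (ngCMatrix), not a dgCMatrix.
A2 <- sparseMatrix(i = integer(0), j = integer(0), dims = A@Dim, dimnames = A@Dimnames)

# Option 2: reset A in place by clearing its row indices, column pointers and values
A@i <- integer(0)
A@p[] <- 0L
A@x <- numeric(0)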

setequal(A, A2)
[1] TRUE
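
Applied to the question's setting, the same idea might look like the sketch below. It is a sketch under assumptions: D is a made-up stand-in for the Quanteda dfm, and x = numeric(0) is passed so that the empty result is numeric (dgCMatrix) rather than a pattern matrix. Assigning rows into a sparse matrix one at a time can be slow on large inputs, so this shows the initialisation rather than a tuned solution.

```r
library(Matrix)

# Hypothetical stand-in for the dfm: 3 documents x 3 terms
D <- sparseMatrix(i = c(1, 1, 2, 3), j = c(1, 2, 2, 3), x = c(1, 2, 1, 3),
                  dims = c(3, 3),
                  dimnames = list(c("d1", "d2", "d3"), c("t1", "t2", "t3")))
aaa <- D %*% t(D)

# Empty numeric sparse matrix with the dimensions and dimnames of aaa
mmm <- sparseMatrix(i = integer(0), j = integer(0), x = numeric(0),
                    dims = dim(aaa), dimnames = dimnames(aaa))

# mmm now exists, so the loop from the question can fill it row by row
for (i in seq_len(nrow(aaa))) {
  mmm[i, ] <- sort(aaa[i, ], decreasing = TRUE)
}
mmm
```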