可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I have several large rasters that I want to process in a PCA (to produce summary rasters).
I have seen several examples whereby people seem to be simply calling prcomp or princomp. However, when I do this, I get the following error message:
Error in as.vector(data): no method for coercing this S4 class to a vector
Example code:
files<-list.files() # a set of rasters
layers<-stack(files) # using the raster package
pca<-prcomp(layers)
I have tried using a raster brick instead of stack but that doesn't seem to the issue. What method do I need to provide the command so that it can convert the raster data to vector format? I understand that there are ways to sample the raster and run the PCA from that, but I would really like to understand why the above method is not working.
Thanks!
回答1:
Answer to my own question: I ended up doing something slightly different: rather than using every raster cell as input (very large dataset), I took a sample of points, ran the PCA and then saved the output model so that I could make predictions for each grid cell…maybe not the best solution but it works:
rasters <- stack(myRasters)
sr <- sampleRandom(rasters, 5000) # sample 5000 random grid cells
# run PCA on random sample with correlation matrix
# retx=FALSE means don't save PCA scores
pca <- prcomp(sr, scale=TRUE, retx=FALSE)
# write PCA model to file
dput(pca, file=paste("./climate/", name, "/", name, "_pca.csv", sep=""))
x <- predict(rasters, pca, index=1:6) # create new rasters based on PCA predictions
回答2:
There is rasterPCA
function in RStoolbox
package http://bleutner.github.io/RStoolbox/rstbx-docu/rasterPCA.html
For example:
library('raster')
library('RStoolbox')
rasters <- stack(myRasters)
pca1 <- rasterPCA(rasters)
pca2 <- rasterPCA(rasters, nSamples = 5000) # sample 5000 random grid cells
pca3 <- rasterPCA(rasters, norm = FALSE) # without normalization
回答3:
The above method is not working simply because prcomp does not know how to deal with a raster object. It only knows how to deal with vectors, and coercing to vector does not work, hence the error.
What you need to do is read each of your files into a vector, and put each of the rasters in a column of a matrix. Each row will then be a time series of values at a single spatial location, and each column will be all the pixels at a certain time step. Note that the exact spatial coordinates are not needed in this approach. This matrix serves as the input of prcomp
.
Reading the files can be done using readGDAL
, and using as.data.frame
to cast the spatial data to data.frame.
回答4:
here is a working solution:
library(raster)
filename <- system.file("external/rlogo.grd", package="raster")
r1 <- stack(filename)
pca<-princomp(r1[], cor=T)
res<-predict(pca,r1[])
Display result:
r2 <- raster(filename)
r2[]<-res[,1]
plot(r2)
回答5:
Yet another option would be to extract the vales from the raster-stack, i.e.:
rasters <- stack(my_rasters)
values <- getValues(rasters)
pca <- prcomp(values, scale = TRUE)
回答6:
Here is another approach that expands on the getValues approach proposed by @Daniel. The result is a raster stack. The index (idx) references non-NA positions so that NA values are accounted for.
library(raster)
r <- stack(system.file("external/rlogo.grd", package="raster"))
r.val <- getValues(r)
idx <- which(!is.na(r.val))
pca <- princomp(r.val, cor=T)
ncomp <- 2 # first two principle components
r.pca <- r[[1:ncomp]]
for(i in 1:ncomp) { r.pca[[i]][idx] <- pca$scores[,i] }
plot(r.pca)