How to concatenate monthly TRMM netCDF files into

Posted 2019-04-11 07:50

Question:

I have downloaded the TRMM monthly precipitation rate in netCDF format for 1998-2016, which comes to more than 200 files. The files are named 3B43.19980101.7.HDF.nc, 3B43.19980201.7.HDF.nc, 3B43.19980301.7.HDF.nc, and so on. I would like to concatenate all of them into a single netCDF file. I tried the NCO operator "ncrcat", which should be able to concatenate a very long series of files along the record dimension (time, in this case), but so far no luck. I started simple, with only two files:

ncrcat -O -h 3B43.19980101.7.HDF.nc 3B43.19980201.7.HDF.nc out.nc

got

ERROR: no variable fit criteria for processing

so then I tried

ncks --mk_rec_dmn time 3B43.19980101.7.HDF.nc TD.3B43.19980101.7.HDF.nc
ncks --mk_rec_dmn time 3B43.19980201.7.HDF.nc TD.3B43.19980201.7.HDF.nc

I tried again with

ncrcat -O -h TD.3B43.19980101.7.HDF.nc TD.3B43.19980201.7.HDF.nc out.nc

but still got the same error:

ERROR: no variable fit criteria for processing

Is there an easier way to do this with 200+ files? A script that I can follow? I am new to all this, so please be gentle.

Any help would be greatly appreciated. I am using Windows 7 x86.

Answer 1:

It is completely possible to do this with NCO. I looked at your input files and they simply lack a time dimension, so ncrcat fails. Add a time dimension with

ncecat -u time in.nc out.nc

Then use ncrcat as you describe above. P.S. I have changed the ncrcat and ncra error messages to be more explicit about how to do this. Previously the HINTs only applied to cases where the file already had the dimension but it was a fixed dimension rather than a record dimension. Your files did not have a time dimension at all, so the ncks command you issued had no effect.
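To see for yourself whether a file has a time dimension before and after running ncecat, you can inspect its header. The following is only a sketch: it assumes the ncdump utility from the netCDF tools is installed (it is not part of NCO) and that you are working in a POSIX-style shell (on Windows, something like Cygwin). The helper name has_dim is made up for illustration.

```shell
# has_dim is a hypothetical helper, not an NCO or netCDF command.
# Usage: has_dim FILE DIMNAME
# Succeeds (exit status 0) if the header printed by "ncdump -h" contains a
# dimension line such as "time = UNLIMITED ;" or "nlon = 1440 ;".
has_dim() {
    # Dimension lines are indented and look like "<name> = <size> ;".
    ncdump -h "$1" | grep -Eq "^[[:space:]]*$2 = "
}
```

For example: has_dim 3B43.19980101.7.HDF.nc time || echo "no time dimension". The pattern is a simple text match on the header, so treat it as a quick check rather than a robust parser.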

Edit to show loops:

To do this or anything like it in a loop use a construct like

for fl in trmm*.nc; do # Glob directly instead of parsing the output of ls
    ncecat -u time "${fl}" "${fl/trmm/trmm_new}" # Base output name on input name
    ... # more processing
done

The NCO manual has many examples of using file loops.
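Putting the two steps together for the files in the question, a minimal sketch could look like the following. It assumes NCO (ncecat, ncrcat) is on your PATH and a POSIX-style shell. The function name concat_with_time and the TD. prefix for the intermediate files (borrowed from the question) are illustrative choices, not anything NCO requires.

```shell
# concat_with_time is a hypothetical helper, not part of NCO.
# Usage: concat_with_time OUT.nc IN1.nc IN2.nc ...
concat_with_time() {
    out=$1; shift
    if [ "$#" -lt 1 ]; then
        echo "usage: concat_with_time OUT.nc IN.nc..." >&2
        return 2
    fi
    n=$#                               # number of input files
    for fl in "$@"; do
        tmp="TD.$(basename "$fl")"     # intermediate file, written to the current directory
        # Step 1: wrap the file in a new record dimension named "time".
        ncecat -O -u time "$fl" "$tmp" || return 1
        set -- "$@" "$tmp"             # append the new name to the argument list
    done
    shift "$n"                         # drop the original names, keep only the TD.* names
    # Step 2: concatenate along the record dimension, in the order given.
    ncrcat -O -h "$@" "$out"
}
```

It could be called as: concat_with_time out.nc 3B43.*.7.HDF.nc. The shell expands the glob in sorted order, and because the file names contain the date as YYYYMMDD, that order is chronological.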



Answer 2:

In R, you can do this by reading in all the data and combining it into one large 3-D array (lon x lat x time). For example, array[,,1] would be the lon x lat grid for Jan 1998. The array can then be saved in .rds format for further use in R, or saved as a netCDF file, which I won't cover here, but there are tutorials online for saving R arrays as .nc files.

First, make a .csv file containing a single column with the names of all the files you downloaded. One easy way is to copy the output of 'ls' from a terminal and paste it into an Excel sheet. The code below reads those files in one by one, adding each to the array.

library(ncdf4)
library(abind)
filenames=read.csv('TRMM.filenames.csv',header=FALSE) #read in filenames; the file has no header row
filenames=as.character(filenames[,1]) #convert to 'character' format

n.lon=192 #input the correct #'s here, must be the same for all files
n.lat=94

NA.matrix=matrix(rep(NA,n.lon*n.lat),nrow=n.lon) #used to initialize
prcp=array(NA.matrix,c(n.lon,n.lat,1)) #n.lon x n.lat x 1 array of NA's to initialize
for (i in seq_along(filenames)){
  ncdata=nc_open(filenames[i]) #read in file i, assuming files are in same location as filenames.csv/your current working directory
  #ncdata=nc_open(paste(data.dir,filenames[i],sep="")) #if your data is in another directory than the filenames.csv file, you could read it in with this line instead
  nc=ncvar_get(ncdata,"precip") #check the .nc files to see what the variable name actually is; this reads in the variable "precip"
  nc_close(ncdata) #close the file once its data are in memory
  prcp=abind(prcp,nc,along=3) #stack month i along the third (time) dimension
}
prcp=prcp[,,-1,drop=FALSE] #remove the NA.matrix used to initialize; drop=FALSE keeps the array 3-D even with a single file

dim(prcp) #check that the lonxlatxtime dimensions make sense
saveRDS(prcp,'TRMM.all.rds') #save as .rds file, or proceed to save it as .nc file, which takes a bit more work
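
If you would rather not go through Excel to build TRMM.filenames.csv, the list can also be written straight from the shell. This is a sketch under the assumption that you have a POSIX-style shell available; write_filename_list is a made-up helper name.

```shell
# write_filename_list is a hypothetical helper.
# Usage: write_filename_list OUT.csv FILE1 FILE2 ...
# Writes one file name per line, i.e. a single-column CSV without a header.
write_filename_list() {
    out=$1; shift
    [ "$#" -gt 0 ] || return 1     # refuse to write an empty list
    printf '%s\n' "$@" > "$out"
}
```

For example: write_filename_list TRMM.filenames.csv 3B43.*.7.HDF.nc. The names are written in the order the shell expands the glob, which is chronological for these date-stamped files.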