I am trying to create a NetCDF file from a .csv file. I have read several tutorials here and elsewhere, but I still have some doubts.
I have a table like this:
lat,long,time,rh,temp
41,-109,6,1,1
40,-107,18,2,2
39,-105,6,3,3
41,-103,18,4,4
40,-109,6,5,2
39,-107,18,6,4
I create the NetCDF using the ncdf4 package in R.
xvals <- data$lon
yvals <- data$lat
nx <- length(xvals)
ny <- length(yvals)
lon1 <- ncdim_def("longitude", "degrees_east", xvals)
lat2 <- ncdim_def("latitude", "degrees_north", yvals)
time <- data$time
mv <- -999 #missing value to use
var_temp <- ncvar_def("temperatura", "celsius", list(lon1, lat2, time), longname="Temp. da superfície", mv)
var_rh <- ncvar_def("humidade", "%", list(lon1, lat2, time), longname = "humidade relativa", mv )
ncnew <- nc_create(filename, list(var_temp, var_rh))
ncvar_put(ncnew, var_temp, dadostemp, start=c(1,1,1), count=c(nx,ny,nt))
When I follow this procedure, I am told that the NetCDF file expects three times as many values as I have. I understand why: one matrix for each dimension, since I stated that the variables depend on longitude, latitude and time.
So, how would I import this kind of data, where I already have one Lon, Lat, Time and other variables for each data acquisition?
Could someone shed some light?
PS: The data used here is not my real data, just some example I was using for the tutorials.
I think there is more than one problem in your code. Step by step:
Create dimensions
In an nc file, dimensions don't work as key-value pairs; they are just vectors of values defining what each position in a variable array means. This means each dimension should list every coordinate value exactly once, rather than holding one entry per row of your table.
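A minimal sketch of such dimension definitions with ncdf4, using the table from the question. Each dimension gets every distinct coordinate value once, in increasing order. The time unit ("hours") and the name time_dim are assumptions, since the question does not state them:

```r
library(ncdf4)

# The sample table from the question.
data <- data.frame(
  lat  = c(41, 40, 39, 41, 40, 39),
  long = c(-109, -107, -105, -103, -109, -107),
  time = c(6, 18, 6, 18, 6, 18),
  rh   = c(1, 2, 3, 4, 5, 6),
  temp = c(1, 2, 3, 4, 2, 4)
)

# Distinct coordinate values, sorted in increasing order.
xvals <- sort(unique(data$long))  # -109 -107 -105 -103
yvals <- sort(unique(data$lat))   # 39 40 41
tvals <- sort(unique(data$time))  # 6 18

lon1     <- ncdim_def("longitude", "degrees_east",  xvals)
lat2     <- ncdim_def("latitude",  "degrees_north", yvals)
time_dim <- ncdim_def("time", "hours", tvals)  # unit is an assumption
```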
Where I work, we use unlimited dimensions as mere indexes, while a 1-D variable with the same name as the dimension holds the values. I'm not sure how unlimited dimensions work in R. Since you didn't ask about them, I'll leave this out :-)
Define variables
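A sketch of the variable definitions, assuming dimension objects built from the distinct coordinate values of the question's table. The name time_dim and the "hours" unit are assumptions. The key point is that every entry of the dimension list is an object returned by ncdim_def, not a raw data column:

```r
library(ncdf4)

# Dimensions from the distinct coordinate values of the question's table.
lon1     <- ncdim_def("longitude", "degrees_east",  c(-109, -107, -105, -103))
lat2     <- ncdim_def("latitude",  "degrees_north", c(39, 40, 41))
time_dim <- ncdim_def("time", "hours", c(6, 18))  # unit is an assumption

mv <- -999  # missing value to use

# Both variables share the same three dimensions: lon x lat x time.
var_temp <- ncvar_def("temperatura", "celsius", list(lon1, lat2, time_dim),
                      missval = mv, longname = "Temp. da superfície")
var_rh   <- ncvar_def("humidade", "%", list(lon1, lat2, time_dim),
                      missval = mv, longname = "humidade relativa")
```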
Add data
Create an nc file:
ncnew <- nc_create(f, list(var_temp, var_rh))
When adding values, the object holding the data is flattened to a 1-D array and a sequential write starts at the position specified by start. How far the write extends along each dimension is controlled by the values in count. If you have data like this:
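As a hypothetical illustration (these values are assumptions, not the data from the question), consider four rows on a 2 x 2 grid at a single time step, sorted so that longitude varies fastest. The sketch uses a plain R array to show where each value lands when the column is read sequentially, with the first dimension varying fastest:

```r
# Hypothetical 2 x 2 grid at a single time step (values are made up).
# Rows are sorted so that lon varies fastest, then lat.
data <- data.frame(
  lon = c(10, 20, 10, 20),
  lat = c(1, 1, 2, 2),
  t   = c(11, 12, 21, 22)
)

# Filling a 2 x 2 x 1 array from the column walks along the first
# dimension (lon) first, then the second (lat), then the third (time).
arr <- array(data$t, dim = c(2, 2, 1))

arr[2, 1, 1]  # value at lon = 20, lat = 1
arr[1, 2, 1]  # value at lon = 10, lat = 2
```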
The command
ncvar_put(ncnew, var_temp, data$t, count=c(2,2,1))
would give you what you (probably) expect.
For your data, the first step is to create the indexes for the dimensions:
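One possible way to build these indexes, assuming the dimension vectors are the sorted distinct coordinate values, is match(), which returns the position of each row's coordinate within the dimension vector. The column names i, j and k are chosen here for illustration:

```r
# The sample table from the question.
data <- data.frame(
  lat  = c(41, 40, 39, 41, 40, 39),
  long = c(-109, -107, -105, -103, -109, -107),
  time = c(6, 18, 6, 18, 6, 18),
  rh   = c(1, 2, 3, 4, 5, 6),
  temp = c(1, 2, 3, 4, 2, 4)
)

xvals <- sort(unique(data$long))
yvals <- sort(unique(data$lat))
tvals <- sort(unique(data$time))

# Position of each row along the longitude, latitude and time dimensions.
data$i <- match(data$long, xvals)  # 1 2 3 4 1 2
data$j <- match(data$lat,  yvals)  # 3 2 1 3 2 1
data$k <- match(data$time, tvals)  # 1 2 1 2 1 2
```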
Then create an array with the dimensions appropriate for your data and fill it with your values:
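A sketch of creating and filling the array for the temperature column, assuming the dimension vectors are the sorted distinct coordinate values. The array starts out as NA everywhere, so grid cells without an observation stay missing:

```r
# The sample table from the question.
data <- data.frame(
  lat  = c(41, 40, 39, 41, 40, 39),
  long = c(-109, -107, -105, -103, -109, -107),
  time = c(6, 18, 6, 18, 6, 18),
  rh   = c(1, 2, 3, 4, 5, 6),
  temp = c(1, 2, 3, 4, 2, 4)
)

xvals <- sort(unique(data$long))
yvals <- sort(unique(data$lat))
tvals <- sort(unique(data$time))

# One cell per (lon, lat, time) combination, initially all missing.
temp_arr <- array(NA_real_,
                  dim = c(length(xvals), length(yvals), length(tvals)))

# Place each observation in the cell given by its coordinates.
for (r in seq_len(nrow(data))) {
  i <- match(data$long[r], xvals)
  j <- match(data$lat[r],  yvals)
  k <- match(data$time[r], tvals)
  temp_arr[i, j, k] <- data$temp[r]
}
```

The same loop can be repeated for rh with a second array of the same shape.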
If speed is a concern, you could compute the linear index in a vectorised way and use it for the value assignment.
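For an array with dimensions nx x ny x nt, R stores cell (i, j, k) at linear position i + (j - 1) * nx + (k - 1) * nx * ny. A sketch of a vectorised assignment based on that formula, again assuming sorted distinct coordinate values as dimension vectors:

```r
# The sample table from the question.
data <- data.frame(
  lat  = c(41, 40, 39, 41, 40, 39),
  long = c(-109, -107, -105, -103, -109, -107),
  time = c(6, 18, 6, 18, 6, 18),
  rh   = c(1, 2, 3, 4, 5, 6),
  temp = c(1, 2, 3, 4, 2, 4)
)

xvals <- sort(unique(data$long))
yvals <- sort(unique(data$lat))
tvals <- sort(unique(data$time))
nx <- length(xvals)
ny <- length(yvals)
nt <- length(tvals)

# Per-row positions along each dimension.
i <- match(data$long, xvals)
j <- match(data$lat,  yvals)
k <- match(data$time, tvals)

# Linear index of every observation, computed for all rows at once.
lin <- i + (j - 1) * nx + (k - 1) * nx * ny

temp_arr <- array(NA_real_, dim = c(nx, ny, nt))
temp_arr[lin] <- data$temp
```

Indexing with a three-column matrix, temp_arr[cbind(i, j, k)] <- data$temp, gives the same result without computing the linear index by hand.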
Write the data
Note that you don't need start and count.
Finally, close the nc file to write the data to disk:
nc_close(ncnew)
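Putting the steps above together, an end-to-end sketch might look like this. The output path (a temporary file), the "hours" time unit and the name time_dim are assumptions; everything else follows the question's table and names:

```r
library(ncdf4)

# The sample table from the question.
data <- data.frame(
  lat  = c(41, 40, 39, 41, 40, 39),
  long = c(-109, -107, -105, -103, -109, -107),
  time = c(6, 18, 6, 18, 6, 18),
  rh   = c(1, 2, 3, 4, 5, 6),
  temp = c(1, 2, 3, 4, 2, 4)
)

# Dimensions: distinct coordinate values, sorted.
xvals <- sort(unique(data$long))
yvals <- sort(unique(data$lat))
tvals <- sort(unique(data$time))
lon1     <- ncdim_def("longitude", "degrees_east",  xvals)
lat2     <- ncdim_def("latitude",  "degrees_north", yvals)
time_dim <- ncdim_def("time", "hours", tvals)  # unit is an assumption

# Variables on the lon x lat x time grid.
mv <- -999
var_temp <- ncvar_def("temperatura", "celsius", list(lon1, lat2, time_dim),
                      missval = mv, longname = "Temp. da superfície")
var_rh   <- ncvar_def("humidade", "%", list(lon1, lat2, time_dim),
                      missval = mv, longname = "humidade relativa")

# One array per variable; cells without an observation stay NA.
idx <- cbind(match(data$long, xvals),
             match(data$lat,  yvals),
             match(data$time, tvals))
temp_arr <- array(NA_real_, dim = c(length(xvals), length(yvals), length(tvals)))
rh_arr   <- temp_arr
temp_arr[idx] <- data$temp
rh_arr[idx]   <- data$rh

f <- tempfile(fileext = ".nc")  # hypothetical output path
ncnew <- nc_create(f, list(var_temp, var_rh))

# The arrays cover each variable's full shape, so start and count are omitted.
ncvar_put(ncnew, var_temp, temp_arr)
ncvar_put(ncnew, var_rh,   rh_arr)

# Closing the file flushes the data to disk.
nc_close(ncnew)
```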
Optionally, I would recommend the ncdump console command to check your file.
Edit
Regarding your question whether to write a complete array or use start and count, I believe both methods work reliably. Which one to prefer depends on your data and your personal preferences.
I think the method of building an array, adding the values and then writing it as a whole is easier to understand. However, which one is more efficient depends on the data. If your data is big and has many NA values, I believe using multiple writes with start and count could be faster. If NAs are rare, creating one matrix and doing a single write would be faster. If your data is so big that creating an extra array would exceed your available memory, you have to combine both methods.
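For comparison, a sketch of the multiple-writes variant: one ncvar_put call per observation, with start pointing at the cell and count covering a single cell. The temporary output path, the "hours" unit and the name time_dim are assumptions, and only the temperature variable is written to keep the example short:

```r
library(ncdf4)

# The sample table from the question.
data <- data.frame(
  lat  = c(41, 40, 39, 41, 40, 39),
  long = c(-109, -107, -105, -103, -109, -107),
  time = c(6, 18, 6, 18, 6, 18),
  rh   = c(1, 2, 3, 4, 5, 6),
  temp = c(1, 2, 3, 4, 2, 4)
)

xvals <- sort(unique(data$long))
yvals <- sort(unique(data$lat))
tvals <- sort(unique(data$time))
lon1     <- ncdim_def("longitude", "degrees_east",  xvals)
lat2     <- ncdim_def("latitude",  "degrees_north", yvals)
time_dim <- ncdim_def("time", "hours", tvals)  # unit is an assumption

var_temp <- ncvar_def("temperatura", "celsius", list(lon1, lat2, time_dim),
                      missval = -999, longname = "Temp. da superfície")

f <- tempfile(fileext = ".nc")  # hypothetical output path
ncnew <- nc_create(f, list(var_temp))

# One write per observation: start is the cell position, count is one cell.
for (r in seq_len(nrow(data))) {
  pos <- c(match(data$long[r], xvals),
           match(data$lat[r],  yvals),
           match(data$time[r], tvals))
  ncvar_put(ncnew, var_temp, data$temp[r], start = pos, count = c(1, 1, 1))
}

nc_close(ncnew)
```

Cells that are never written should keep the variable's missing value and read back as NA, so no intermediate array is needed with this approach.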