I would like to write a netCDF file using R with 'unlimited' dimensions that I can later extend.
This is what I have tried:
Create a netcdf file
library(ncdf4)
## define lat, lon time dimensions
lat <- ncdim_def("latitude", "degrees_east", vals = 44.0, unlim = TRUE)
lon <- ncdim_def("longitude", "degrees_north", vals = -88.5, unlim = TRUE)
time <- ncdim_def("time", "days since 0000-01-01", 1:1000)
## define data with these dimensions
x <- ncvar_def("myvar", units = "m2", dim = list(lat, lon, time))
## create, write to, close nc file
nc <- nc_create(filename = "tmp.nc", vars = list(x))
ncvar_put(nc = nc, varid = x, vals = 1:1000)
nc_close(nc = nc)
I want to add a new vector at a different lat and lon
## reopen existing file
nc <- nc_open("tmp.nc", write = TRUE)
## define new lat, lon dimensions (keep time dim from above)
lat2 <- ncdim_def("latitude", "degrees_east", vals = 44.5, unlim = TRUE)
lon2 <- ncdim_def("longitude", "degrees_north", vals = -89.0, unlim = TRUE)
## define, write new dataset at new lat lon coordinates
x2 <- ncvar_def("myvar", units = "m2", dim = list(lat2, lon2, time))
ncvar_put(nc = nc, varid = x2, vals = 11:1011)
I would expect to find the two different latitudes and longitudes
ncvar_get(nc, 'latitude')
ncvar_get(nc, 'longitude')
ncvar_get(nc, 'myvar')
These show that the file was written using the first set of lat/lon and 'myvar' values, but was not appended with the new set of values.
What am I doing wrong?
I know that the ability to have multiple unlimited dimensions, and to add to them, is a feature of netCDF-4. How can I do this in R?
I recognize that I must be confusing the 'dimension definition' with some other concept. But I am a bit lost.
Yes, I think you are confusing the 'dimension definition' and the actual data within the dimension variable.
If you run your first snippet of code and then dump the NetCDF file using ncdump
, you'll see:
netcdf tmp {
dimensions:
latitude = UNLIMITED ; // (1 currently)
longitude = UNLIMITED ; // (1 currently)
time = 1000 ;
variables:
double latitude(latitude) ;
latitude:units = "degrees_east" ;
latitude:long_name = "latitude" ;
double longitude(longitude) ;
longitude:units = "degrees_north" ;
longitude:long_name = "longitude" ;
int time(time) ;
time:units = "days since 0000-01-01" ;
time:long_name = "time" ;
float myvar(time, longitude, latitude) ;
myvar:units = "m2" ;
data:
latitude = 44 ;
longitude = -88.5 ;
time = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
...
990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000 ;
myvar =
{{1}},
{{2}},
{{3}},
...
{{1000}} ;
}
The dimensions are saying latitude
and longitude
are unlimited while the time
dimension is fixed at 1000 points/days since 0000-01-01. This is exactly what you specified, which is good.
So to add another latitude and longitude. I would open the file again, read in the current data, append to it and then write it back.
library(ncdf4)
nc <- nc_open("tmp.nc", write = TRUE)
lat <- ncvar_get(nc, varid='latitude')
lat <- append(lat, 44.5)
ncvar_put(nc, varid='latitude', vals=lat, start=c(1), count=2)
nc_close(nc)
Now ncdump
will show you two latitudes:
data:
latitude = 44, 44.5 ;
longitude = -88.5 ;
Of course for large datasets you do not need or want to read in all the data and the append, you can just tell NetCDF where you want it written.
library(ncdf4)
nc <- nc_open("tmp.nc", write = TRUE)
lon = -89.0
ncvar_put(nc, varid='longitude', vals=lon, start=c(2), count=1)
nc_close(nc)
Now ncdump
will show you two latitudes and two longitudes:
data:
latitude = 44, 44.5 ;
longitude = -88.5, -89 ;
The data represents of myvar
is a 3D array, so I would have done the initial write different. I would have specified it's dimensions when creating the data and when writing it to the file, like this:
data <- array(1:1000, c(1,1,1000))
ncvar_put(nc = nc, varid='myvar', vals=data, start=c(1,1,1), count=c(1,1,1000))
Then to append to the second latitude and longitude:
data <- array(11:1011, c(1,1,1000))
ncvar_put(nc = nc, varid='myvar', vals=data, start=c(2,2,1), count=c(1,1,1000))
NOTE
I feel the R package hides too much from you. When you create a dimension with ncdim_def
, you can give it values. This is in my mind more of a 3 step process.
- Create the dimension.
- Create a variable associated with that dimension.
- Add data to this variable.
Hope this helps.