I got the following error when trying to write an xarray object to a netCDF file:
"ValueError: chunksize cannot exceed dimension size"
The data is too big for my memory and needs to be chunked.
The routine is basically as follows:
import numpy as np
import xarray as xr

# open the dataset lazily with dask chunks
ds = xr.open_dataset("somefile.nc", chunks={'lat': 72, 'lon': 144})
myds = ds.copy()
# ds is 335 (time) x 720 (lat) x 1440 (lon) and has the variable var

def some_function(x):
    return x * 2

# apply the function along the time axis and store the result as a new variable
myds['newvar'] = xr.DataArray(np.apply_along_axis(some_function, 0, ds['var']))
# drop returns a new dataset, so reassign
myds = myds.drop('var')
myds.to_netcdf("somenewfile.nc")
So basically, I just manipulate the content and write it back out. Nevertheless, the chunks seem to be the problem: the same error occurs when rechunking to a single chunk, and I cannot even rewrite ds unchanged. Any idea how to track down or solve this error?
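For reference, rechunking to a single chunk looked roughly like this (a sketch only, not my exact code; the dimension sizes are taken from the comment above):

# reopen with one chunk spanning each full dimension
ds_single = xr.open_dataset("somefile.nc", chunks={'time': 335, 'lat': 720, 'lon': 1440})
ds_single.to_netcdf("somenewfile.nc")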
netCDF4 version is 1.2.4
xarray (formerly xray) version is 0.8.2
dask version is 0.10.1
It was an issue with the engine in the write command. You need to change the engine from netcdf4 (the default) to scipy when writing a chunked dataset!
The netcdf4 backend is not able to write such files.
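A minimal sketch of the corrected write call, assuming the same dataset and file name as above:

# write with the scipy engine instead of the default netcdf4 engine
myds.to_netcdf("somenewfile.nc", engine="scipy")

Note that the scipy engine writes netCDF3 files, so netCDF4-only features such as compression are not available with it.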