“ValueError: chunksize cannot exceed dimension size”

Posted 2019-07-14 23:39

Question:

I got the following error when trying to write a xarray object to netcdf file:

"ValueError: chunksize cannot exceed dimension size"  

The data is too big for my memory and needs to be chunked.
The routine is basically as follows:

import numpy as np
import xarray as xr

# ds is 335 (time) x 720 (lat) x 1440 (lon) and has variable var
ds = xr.open_dataset("somefile.nc", chunks={'lat': 72, 'lon': 144})
myds = ds.copy()

def some_function(x):
    return x * 2

# apply the function along the time axis and store the result as a new variable
myds['newvar'] = xr.DataArray(np.apply_along_axis(some_function, 0, ds['var']))
myds = myds.drop('var')  # drop returns a new Dataset, so reassign
myds.to_netcdf("somenewfile.nc")

So basically, I just manipulate the content and rewrite the file. Nevertheless, the chunks seem to be the problem. The same error occurs when I rechunk everything into a single chunk, and I cannot even rewrite ds unchanged. Any idea how to track down or solve this error?
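For reference, "rechunking to one array" would look something like the sketch below (a minimal sketch assuming the dimension names and sizes given above; per the question, this still raises the same error):

# merge each variable into a single dask chunk before writing (assumed attempt)
myds = myds.chunk({'time': 335, 'lat': 720, 'lon': 1440})
myds.to_netcdf("somenewfile.nc")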

netCDF4 version is 1.2.4
xarray (former xray) version is 0.8.2
dask version is 0.10.1

Answer 1:

It was an issue with the engine in the write call. You need to change the engine from netcdf4 (the default) to scipy if you are writing chunked data!

myds.to_netcdf("somenewfile.nc",engine='scipy')

The netcdf4 package is NOT capable of writing such files.
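For completeness, a minimal end-to-end sketch of the workaround, using the file names from the question (note that xarray's scipy engine writes netCDF3-format files, so netCDF4-only features are unavailable):

import xarray as xr

# open lazily in chunks so the data never has to fit in memory at once
ds = xr.open_dataset("somefile.nc", chunks={'lat': 72, 'lon': 144})
# ... manipulate ds as needed ...
ds.to_netcdf("somenewfile.nc", engine='scipy')  # scipy engine handles the chunked write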