I would like to extract a spatial subset of a rather large NetCDF file. I am starting from the code in "Loop through netcdf files and run calculations - Python or R":
import netCDF4

f = netCDF4.MFDataset('/usgs/data2/rsignell/models/ncep/narr/air.2m.1989.nc')
# print the names of the variables in the file
print(f.variables.keys())
atemp = f.variables['air']  # TODO: extract spatial subset
How do I extract just the subset of the NetCDF file corresponding to a state (say, Iowa)? Iowa has the following latitude/longitude boundaries:
Longitude: 89° 5' W to 96° 31' W
Latitude: 40° 36' N to 43° 30' N
This is fairly easy: you have to find the indices of the lower and upper bounds in latitude and longitude. You can do that by finding the grid values closest to the ones you're looking for.
Then just subset the variable array with those indices.
A small change needs to be made to the lonbounds part: the longitude values in the data are in degrees east and range from 0 to 359, so negative numbers will not work in this case. The calculations for latli and latui also need to be swapped, because the latitude values run from north to south (89 to -89).
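A minimal sketch of the nearest-value approach with both corrections applied. It runs on synthetic arrays standing in for the file's coordinate variables; the 0.5 degree grid, the array names, and the helper nearest_index are assumptions made for illustration, not taken from the original answer or from the NARR file.

```python
import numpy as np

# Synthetic stand-ins (assumed) for the 1-D coordinate variables and the data.
lats = np.arange(89.0, -89.5, -0.5)   # north to south: 89 ... -89
lons = np.arange(0.0, 359.5, 0.5)     # degrees east: 0 ... 359
air = np.zeros((2, lats.size, lons.size))  # dimensions: (time, lat, lon)

def nearest_index(coords, value):
    """Return the index of the grid point closest to value."""
    return int(np.abs(coords - value).argmin())

# Iowa bounds from the question, in decimal degrees.
lat_south, lat_north = 40.6, 43.5
# Convert west longitudes (negative) to degrees east in 0..360.
lon_west, lon_east = -96.52 % 360, -89.08 % 360

# Latitude is stored north to south, so the northern bound has the smaller index.
lat_first = nearest_index(lats, lat_north)
lat_last = nearest_index(lats, lat_south)
lon_first = nearest_index(lons, lon_west)
lon_last = nearest_index(lons, lon_east)

# The + 1 makes the slice include the last matching grid point.
subset = air[:, lat_first:lat_last + 1, lon_first:lon_last + 1]
print(subset.shape)
```

With a real file, lats, lons and air would come from f.variables instead, using whatever names the coordinate variables actually have there.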
Favo's answer works (I assume; haven't checked). A more direct and idiomatic way is to use numpy's where function to find the necessary indices.
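A hedged sketch of the where-based variant, again on synthetic 1-D coordinate arrays (the 1 degree grid and the variable names are assumptions). It keeps only grid points that fall inside the box, rather than the nearest ones, and then slices between the first and last matching index.

```python
import numpy as np

# Synthetic stand-ins (assumed) for the file's coordinate variables and data.
lats = np.arange(89.0, -90.0, -1.0)   # north to south: 89 ... -89
lons = np.arange(0.0, 360.0, 1.0)     # degrees east: 0 ... 359
air = np.zeros((2, lats.size, lons.size))  # dimensions: (time, lat, lon)

lat_south, lat_north = 40.6, 43.5
lon_west, lon_east = -96.52 % 360, -89.08 % 360  # west longitudes as degrees east

# np.where on a boolean mask returns a tuple; element 0 holds the indices.
lat_inds = np.where((lats >= lat_south) & (lats <= lat_north))[0]
lon_inds = np.where((lons >= lon_west) & (lons <= lon_east))[0]

# The matching indices are contiguous, so a plain slice is enough.
subset = air[:, lat_inds[0]:lat_inds[-1] + 1, lon_inds[0]:lon_inds[-1] + 1]
print(subset.shape)
```

Slicing avoids NumPy's fancy-indexing rules, under which two index arrays are paired element by element instead of forming a rectangular block.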
To mirror the response from N1B4, you can also do it on one line with the Climate Data Operators (cdo):
Thus, to loop over a set of files, I would do this in a bash script, using cdo to process each file and then calling your Python script:
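As a rough sketch of that loop, written here in Python rather than bash: it assumes cdo's sellonlatbox operator, which takes lon1,lon2,lat1,lat2 followed by the input and output file names. The function names, the output naming scheme, and the glob pattern are hypothetical, and cdo is only invoked when subset_all is called.

```python
import glob
import os
import subprocess

def cdo_subset_command(infile, outfile,
                       lon_west=-96.52, lon_east=-89.08,
                       lat_south=40.6, lat_north=43.5):
    """Build the argument list for a cdo sellonlatbox call (defaults: Iowa box)."""
    box = f"sellonlatbox,{lon_west},{lon_east},{lat_south},{lat_north}"
    return ["cdo", box, infile, outfile]

def subset_all(pattern):
    """Cut the box out of every file matching pattern; return the new file names."""
    outfiles = []
    for infile in sorted(glob.glob(pattern)):
        root, ext = os.path.splitext(infile)
        outfile = root + "_subset" + ext   # hypothetical naming scheme
        subprocess.run(cdo_subset_command(infile, outfile), check=True)
        outfiles.append(outfile)           # run your own analysis on these
    return outfiles

print(" ".join(cdo_subset_command("infile.nc", "subset_infile.nc")))
```

Calling subset_all("air.2m.*.nc") would then process each yearly file in turn, provided cdo is installed and on the PATH.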
I always try to do my file processing "offline", as I find it less prone to error. cdo is an alternative to ncks; I'm not saying it is better, I just find it easier to remember the commands. nco in general is more powerful, and I resort to it when cdo can't perform the task I wish to carry out.
Note that this can be accomplished even more quickly on the command line using NCO's ncks.
ncks -v air -d latitude,40.,43. -d longitude,-96.,-89. infile.nc -O subset_infile.nc
If you like pandas, then you should think about checking out xarray.
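A minimal xarray sketch of the same subset, built on a synthetic dataset so that it runs without the original file. The variable name air matches the question, but the dimension names lat and lon, the 1 degree grid, and the helper subset_box are assumptions; a real file may use other names or 2-D coordinates, in which case this label-based selection does not apply directly.

```python
import numpy as np
import xarray as xr

# Synthetic dataset (assumed layout): latitude north to south, longitude in degrees east.
lat = np.arange(50.0, 29.0, -1.0)     # 50 ... 30
lon = np.arange(250.0, 281.0, 1.0)    # 250 ... 280
time = np.arange(3)
ds = xr.Dataset(
    {"air": (("time", "lat", "lon"), np.zeros((time.size, lat.size, lon.size)))},
    coords={"time": time, "lat": lat, "lon": lon},
)

def subset_box(ds, lat_south, lat_north, lon_west, lon_east):
    """Select a lat/lon box by coordinate value; west longitudes may be negative."""
    lon_min, lon_max = lon_west % 360, lon_east % 360
    # Sort latitude ascending so slice(lat_south, lat_north) selects correctly.
    ds = ds.sortby("lat")
    return ds.sel(lat=slice(lat_south, lat_north), lon=slice(lon_min, lon_max))

iowa = subset_box(ds, 40.6, 43.5, -96.52, -89.08)
print(iowa["air"].shape)
```

For a file on disk, ds would come from xr.open_dataset instead of being built by hand, and the selection works on coordinate values rather than on array indices.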