Is there any way to remove a dataset from an hdf5 file, preferably using h5py? Or alternatively, is it possible to overwrite a dataset while keeping the other datasets intact?
To my understanding, h5py can read/write hdf5 files in 5 modes:
f = h5py.File("filename.hdf5", 'mode')
where mode can be r for read, r+ for read-write, a for read-write but creates a new file if it doesn't exist, w for write/overwrite, and w- which is the same as w but fails if the file already exists. I have tried all but none seem to work.
Any suggestions are much appreciated.
I do not understand what your question has to do with the file open modes. For read/write, r+ is the way to go.
To my knowledge, removing a dataset is not easy, if it is possible at all; in particular, no matter what you do, the file size will not shrink.
But overwriting the content of an existing dataset is no problem.
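A minimal sketch of such an in-place overwrite, assuming the new values have the same shape as the existing dataset. The file name and the dataset names "keep" and "data" are made up for illustration and do not come from the question.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and dataset names, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "example.hdf5")

# Build a small file with two datasets to work on.
with h5py.File(path, "w") as f:
    f.create_dataset("keep", data=np.arange(5))
    f.create_dataset("data", data=np.zeros(5))

# r+ opens an existing file for reading and writing without truncating it.
with h5py.File(path, "r+") as f:
    # Slice assignment writes into the existing dataset; the new values
    # must fit its shape and be convertible to its dtype.
    f["data"][...] = np.ones(5)

# Read both datasets back to check the result.
with h5py.File(path, "r") as f:
    overwritten = f["data"][...]
    untouched = f["keep"][...]
```

This only works when the shape stays the same. If the replacement data has a different shape, the dataset has to be deleted and created again.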
Yes, this can be done.
You will need to have the file open in a writeable mode that keeps the existing contents, for example append (a) or read-write (r+). Note that w would truncate the file and discard the other datasets as well.
As noted by @seppo-enarvi in the comments, the purpose of the previously recommended f.__delitem__(datasetname) function is to implement the del operator, so that one can delete a dataset using del f[datasetname].
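A short sketch of deleting a dataset with del and then reusing its name, which is one way to "overwrite" a dataset with data of a different shape. The file and dataset names are hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and dataset names, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "example.hdf5")

# Build a small file with two datasets to work on.
with h5py.File(path, "w") as f:
    f.create_dataset("keep", data=np.arange(5))
    f.create_dataset("unwanted", data=np.zeros(10))

# "a" (or "r+") keeps the existing contents; "w" would truncate the file.
with h5py.File(path, "a") as f:
    del f["unwanted"]  # unlink the dataset from the file
    after_delete = sorted(f.keys())
    # The name is free again, so a dataset of a different shape can reuse it.
    f.create_dataset("unwanted", data=np.arange(3))
    new_shape = f["unwanted"].shape
```

The other datasets in the file are left intact.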
I tried this out, and the only way I could actually reduce the size of the file was by copying everything to a new file and just leaving out the dataset I was not interested in.
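The copy-and-skip approach might look roughly like the sketch below. It is not the original poster's code; the file names, the dataset names, and the skip variable are assumptions made for illustration, and only top-level entries are filtered.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical paths and dataset names, used only for illustration.
tmp = tempfile.mkdtemp()
old_path = os.path.join(tmp, "old.hdf5")
new_path = os.path.join(tmp, "new.hdf5")

# Build a source file with one small and one large dataset.
with h5py.File(old_path, "w") as f:
    f.create_dataset("keep", data=np.arange(5))
    f.create_dataset("unwanted", data=np.zeros(100_000))

skip = "unwanted"  # name of the dataset to leave out

with h5py.File(old_path, "r") as src, h5py.File(new_path, "w") as dst:
    # Attributes on the root group are not copied by the loop below.
    for key, value in src.attrs.items():
        dst.attrs[key] = value
    # Copy every top-level dataset or group except the one being dropped;
    # groups are copied together with their contents.
    for name in src:
        if name != skip:
            src.copy(name, dst, name=name)

with h5py.File(new_path, "r") as f:
    copied = sorted(f.keys())

old_size = os.path.getsize(old_path)
new_size = os.path.getsize(new_path)
```

Here new_size comes out smaller than old_size because the large dataset was never written to the new file. The h5repack command-line tool that ships with HDF5 can rewrite a file in a similar way after a del, if installing it is an option.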