Numpy arrays, being extension types (i.e. defined in C extensions using the C API), declare additional fields outside the scope of the Python interpreter (for example the `data` attribute, which is a buffer structure, as documented in Numpy's array interface).
To be able to serialize them, Python 2 used the `__reduce__` method as part of the pickle protocol, as stated in the doc, and explained here.
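To illustrate the mechanism in question, here is a minimal sketch (with a made-up `Point` class, not anything from Numpy) of how `__reduce__` drives pickling: it returns a callable plus arguments that pickle stores and later calls to rebuild the object.

```python
import pickle

class Point:
    """Toy class showing how __reduce__ participates in pickling."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __reduce__(self):
        # Pickle stores this (callable, args) pair; unpickling then
        # calls Point(self.x, self.y) to reconstruct the object.
        return (Point, (self.x, self.y))

p = pickle.loads(pickle.dumps(Point(1, 2)))
print(p.x, p.y)  # 1 2
```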
But even though `__reduce__` still exists in Python 3, the *Pickle protocol* section (and *Pickling and unpickling extension types* a fortiori) was removed from the doc, so it is unclear what role each piece plays.
Moreover, there are additional entries that relate to pickling extension types:

- `copyreg`, described as a "Pickle interface constructor registration for extension types", but there is no mention of extension types in the `copyreg` module documentation itself.
- PEP 3118 -- Revising the buffer protocol, which introduced a new buffer protocol for Python 3 (and maybe automates pickling for objects exposing this buffer protocol).
- New-style classes: one can assume that new-style classes have an influence on the pickling process.
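For context on the first bullet, `copyreg.pickle` lets you register a reduce function for a type externally, without touching the class itself. A small sketch (the `Extension` class here is a hypothetical stand-in, not a real extension type):

```python
import copyreg
import pickle

class Extension:
    """Hypothetical stand-in for a type whose pickling we register externally."""
    def __init__(self, value):
        self.value = value

def reduce_extension(obj):
    # Same (callable, args) contract as __reduce__, but registered
    # via copyreg instead of defined on the class.
    return (Extension, (obj.value,))

# Pickle consults copyreg's dispatch table before falling back to
# the object's own __reduce_ex__.
copyreg.pickle(Extension, reduce_extension)

e = pickle.loads(pickle.dumps(Extension(42)))
print(e.value)  # 42
```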
So, how does all of this relate to Numpy arrays?

- Does Numpy implement special methods such as `__reduce__` (or register via `copyreg`) to inform Python how to pickle its arrays? Numpy objects still expose a `__reduce__` method, but it may be there only for compatibility reasons.
- Or does Numpy use Python C-API structures that Pickle supports out of the box (like the new buffer protocol), so that nothing supplementary is necessary in order to pickle Numpy arrays?
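For reference, the `__reduce__` method mentioned above is easy to inspect directly (assuming Numpy is installed; the exact shape of the returned tuple is a Numpy implementation detail):

```python
import pickle
import numpy as np

a = np.arange(5)

# ndarray does expose __reduce__; its first element is a callable
# that pickle uses to reconstruct the array.
r = a.__reduce__()
print(r[0])  # a numpy-internal reconstruction helper

# And a round trip through pickle works out of the box:
b = pickle.loads(pickle.dumps(a))
print((a == b).all())  # True
```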