Creating a NumPy array directly from __array_inter

2020-04-08 11:45发布

问题:

Suppose I have an __array_interface__ dictionary and I would like to create a numpy view of this data from the dictionary itself. For example:

buff = {'shape': (3, 3), 'data': (140546686381536, False), 'typestr': '<f8'}
view = np.array(buff, copy=False)

However, this does not work as np.array searches for either the buffer or array interface as attributes. The simple workaround could be the following:

class numpy_holder(object):
    pass

holder = numpy_holder()
holder.__array_interface__ = buff
view = np.array(holder, copy=False)

This seems a bit roundabout. Am I missing a straightforward way to do this?

回答1:

correction - with the right 'data' value your holder works in np.array:

np.array is definitely not going to work since it expects an iterable, some things like a list of lists, and parses the individual values.

There is a low level constructor, np.ndarray that takes a buffer parameter. And a np.frombuffer.

But my impression is that x.__array_interface__['data'][0] is a integer representation of the data buffer location, but not directly a pointer to the buffer. I've only used it to verify that a view shares the same databuffer, not to construct anything from it.

np.lib.stride_tricks.as_strided uses __array_interface__ for default stride and shape data, but gets the data from an array, not the __array_interface__ dictionary.

===========

An example of ndarray with a .data attribute:

In [303]: res
Out[303]: 
array([[ 0, 20, 50, 30],
       [ 0, 50, 50,  0],
       [ 0,  0, 75, 25]])
In [304]: res.__array_interface__
Out[304]: 
{'data': (178919136, False),
 'descr': [('', '<i4')],
 'shape': (3, 4),
 'strides': None,
 'typestr': '<i4',
 'version': 3}
In [305]: res.data
Out[305]: <memory at 0xb13ef72c>
In [306]: np.ndarray(buffer=res.data, shape=(4,3),dtype=int)
Out[306]: 
array([[ 0, 20, 50],
       [30,  0, 50],
       [50,  0,  0],
       [ 0, 75, 25]])
In [324]: np.frombuffer(res.data,dtype=int)
Out[324]: array([ 0, 20, 50, 30,  0, 50, 50,  0,  0,  0, 75, 25])

Both of these arrays are views.

OK, with your holder class, I can make the same thing, using this res.data as the data buffer. Your class creates an object exposing the array interface.

In [379]: holder=numpy_holder()
In [380]: buff={'data':res.data, 'shape':(4,3), 'typestr':'<i4'}
In [381]: holder.__array_interface__ = buff
In [382]: np.array(holder, copy=False)
Out[382]: 
array([[ 0, 20, 50],
       [30,  0, 50],
       [50,  0,  0],
       [ 0, 75, 25]])


回答2:

Here's another approach:

import numpy as np


def arr_from_ptr(pointer, typestr, shape, copy=False,
                 read_only_flag=False):
    """Generates numpy array from memory address
    https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.interface.html

    Parameters
    ----------
    pointer : int
        Memory address

    typestr : str
        A string providing the basic type of the homogenous array The
        basic string format consists of 3 parts: a character
        describing the byteorder of the data (<: little-endian, >:
        big-endian, |: not-relevant), a character code giving the
        basic type of the array, and an integer providing the number
        of bytes the type uses.

        The basic type character codes are:

        - t Bit field (following integer gives the number of bits in the bit field).
        - b Boolean (integer type where all values are only True or False)
        - i Integer
        - u Unsigned integer
        - f Floating point
        - c Complex floating point
        - m Timedelta
        - M Datetime
        - O Object (i.e. the memory contains a pointer to PyObject)
        - S String (fixed-length sequence of char)
        - U Unicode (fixed-length sequence of Py_UNICODE)
        - V Other (void * – each item is a fixed-size chunk of memory)

        See https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.interface.html#__array_interface__

    shape : tuple
        Shape of array.

    copy : bool
        Copy array.  Default False

    read_only_flag : bool
        Read only array.  Default False.
    """
    buff = {'data': (pointer, read_only_flag),
            'typestr': typestr,
            'shape': shape}

    class numpy_holder():
        pass

    holder = numpy_holder()
    holder.__array_interface__ = buff
    return np.array(holder, copy=copy)

Usage:

# create array
arr = np.ones(10)

# grab pointer from array
pointer, read_only_flag = arr.__array_interface__['data']

# constrct numpy array from an int pointer
arr_out = arr_from_ptr(pointer, '<f8', (10,))

# verify it's the same data
arr[0] = 0
assert np.allclose(arr, arr_out)