I am writing Python code to accelerate a region properties function for labeled objects in a binary image. The following code will calculate the number of border pixels of a labeled object in a binary image given the indices of the object. The main() function will cycle through all labeled objects in a binary image 'mask' and calculate the number of border pixels for each one.
I am wondering what the best way is to pass or return my variables in this Cython code. The variables are either NumPy arrays or typed memoryviews. I've experimented with passing and returning the variables in both formats, but cannot work out which is the most efficient. I am new to Cython, so memoryviews are still fairly abstract to me, and whether there is a difference between the two methods remains a mystery. The images I am working with contain 100,000+ labeled objects, so operations such as these need to be fairly efficient.
To summarize:
When, if ever, should I pass or return my variables as typed memoryviews rather than NumPy arrays for very repetitive computations? Is one way best, or are they exactly the same?
%%cython --annotate
import numpy as np
import cython
cimport numpy as np

DTYPE = np.intp
ctypedef np.intp_t DTYPE_t

@cython.boundscheck(False)
@cython.wraparound(False)
def erode(DTYPE_t [:,:] img):
    # Image dimensions
    cdef int height, width, local_min
    height = img.shape[0]
    width = img.shape[1]

    # Padded Array
    padded_np = np.zeros((height+2, width+2), dtype = DTYPE)
    cdef DTYPE_t[:,:] padded = padded_np
    padded[1:height+1,1:width+1] = img

    # Eroded image
    eroded_np = np.zeros((height,width),dtype=DTYPE)
    cdef DTYPE_t[:,:] eroded = eroded_np

    cdef DTYPE_t i,j
    for i in range(height):
        for j in range(width):
            local_min = min(padded[i+1,j+1], padded[i,j+1], padded[i+1,j],padded[i+1,j+2],padded[i+2,j+1])
            eroded[i,j] = local_min
    return eroded_np

@cython.boundscheck(False)
@cython.wraparound(False)
def border_image(slice_np):
    # Memoryview of slice_np
    cdef DTYPE_t [:,:] slice = slice_np

    # Image dimensions
    cdef Py_ssize_t ymax, xmax, y, x
    ymax = slice.shape[0]
    xmax = slice.shape[1]

    # Erode image
    eroded_image_np = erode(slice_np)
    cdef DTYPE_t[:,:] eroded_image = eroded_image_np

    # Border image
    border_image_np = np.zeros((ymax,xmax),dtype=DTYPE)
    cdef DTYPE_t[:,:] border_image = border_image_np
    for y in range(ymax):
        for x in range(xmax):
            border_image[y,x] = slice[y,x]-eroded_image[y,x]
    return border_image_np.sum()

@cython.boundscheck(False)
@cython.wraparound(False)
def main(DTYPE_t[:,:] mask, int numobjects, Py_ssize_t[:,:] indices):
    # Memoryview of boundary pixels
    boundary_pixels_np = np.zeros(numobjects,dtype=DTYPE)
    cdef DTYPE_t[:] boundary_pixels = boundary_pixels_np

    # Loop through each object
    cdef Py_ssize_t y_from, y_to, x_from, x_to, i
    cdef DTYPE_t[:,:] slice
    for i in range(numobjects):
        y_from = indices[i,0]
        y_to = indices[i,1]
        x_from = indices[i,2]
        x_to = indices[i,3]
        slice = mask[y_from:y_to, x_from:x_to]
        boundary_pixels[i] = border_image(slice)
    return boundary_pixels_np
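For checking the Cython output on small inputs, the same border-pixel count can be sketched in plain NumPy. This is only a reference, assuming each slice holds 0/1 values and that the 4-neighbour erosion with zero padding from erode() is what is wanted. The name border_pixel_count is made up for this sketch and is not part of the code above.

```python
import numpy as np

def border_pixel_count(region):
    """Count the pixels removed by a 4-neighbour erosion of a 0/1 region."""
    region = np.asarray(region, dtype=np.intp)
    # Zero padding, as in erode(): pixels outside the slice count as background.
    padded = np.pad(region, 1, mode="constant", constant_values=0)
    # Element-wise minimum over the centre pixel and its four direct neighbours.
    eroded = np.minimum.reduce([
        padded[1:-1, 1:-1],  # centre
        padded[:-2, 1:-1],   # above
        padded[2:, 1:-1],    # below
        padded[1:-1, :-2],   # left
        padded[1:-1, 2:],    # right
    ])
    # Border pixels are those set in the region but cleared by the erosion.
    return int((region - eroded).sum())

square = np.ones((3, 3), dtype=np.intp)
print(border_pixel_count(square))
```

Comparing this against border_image() on a handful of slices is a cheap way to confirm that a change in how arrays are passed or returned has not changed the result.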
Memoryviews are a more recent addition to Cython, designed to be an improvement compared to the original np.ndarray syntax. For this reason they're slightly preferred. It usually doesn't make too much difference which you use though. Here are a few notes:
Speed
For speed it makes very little difference - my experience is that memoryviews as function parameters are marginally slower, but it's hardly worth worrying about.
Generality
Memoryviews are designed to work with any type that has Python's buffer interface (for example the standard library array module). Typing as np.ndarray only works with numpy arrays. In principle memoryviews can support an even wider range of memory layouts, which can make interfacing with C code easier (in practice I've never actually seen this be useful).
As a return value
When returning an array from Cython to Python code, the user will probably be happier with a numpy array than with a memoryview. If you're working with memoryviews you can do either of two things: convert the memoryview with np.asarray(mview) before returning it, or create the numpy array first, take a memoryview of it for the fast element access, and return the array itself (as the question's code does with eroded_np).
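Both points above, accepting any object with the buffer interface and handing a numpy array back to the caller, can be tried from plain Python. The sketch below uses Python's built-in memoryview as a stand-in. A Cython typed memoryview is a different object, but it relies on the same buffer protocol and np.asarray accepts it in the same way.

```python
import array
import numpy as np

# A non-NumPy object that exposes the buffer interface.
values = array.array("d", [1.0, 2.0, 3.0])

# Built-in memoryview, used here as a stand-in for a Cython typed memoryview.
view = memoryview(values)

# np.asarray wraps the existing buffer rather than copying it.
arr = np.asarray(view)
arr[0] = 10.0  # the write is visible through the original object

print(type(arr).__name__, arr.dtype, values[0])
```

Because no copy is made, converting at the return statement costs very little compared with the loops that fill the array.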
Ease of compiling
If you're using np.ndarray you have to set the include directory with np.get_include() in your setup.py file. You don't have to do this with memoryviews, which often means you can skip setup.py and just use the cythonize command line command or pyximport for simpler projects.
Parallelization
This is the big advantage of memoryviews compared to numpy arrays (if you want to use it). Taking a slice of a memoryview does not require the global interpreter lock (GIL), but taking a slice of a numpy array does. This means that a prange loop which takes one slice per iteration and passes it to a nogil function can run in parallel with a memoryview, but not with a numpy array.
If you aren't using Cython's parallel functionality this doesn't apply.
cdef classes
You can use memoryviews as attributes of cdef classes but not np.ndarrays. You can (of course) use numpy arrays as untyped object attributes instead.