Combining PyDIP with NumPy
Contents
The dip.Image
object uses the Python buffer protocol, which makes it very easy to mix
PyDIP with NumPy and other image processing packages such as scikit-image or OpenCV,
which use NumPy arrays to represent their images.
Images and NumPy arrays
A NumPy array can be passed instead of an image to any DIPlib function. In fact, any Python object that uses the buffer interface implicitly casts to an image. The reverse is also true: NumPy treats DIPlib images as an array, you can call any NumPy function on an image. However, some code that accepts a NumPy array calls methods of the array, which would not be defined for a DIPlib image. For example,
arr = np.zeros((10, 11)) dip.Gauss(arr) # OK img = dip.Image((11, 10)) np.amax(img) # OK img.max() # error! np.array method not defined for dip.Image img.shape # error! np.array property not defined for dip.Image
One can “cast” from a NumPy array to a DIPlib image and back, without copying the data:
x = np.asarray(img) y = dip.Image(arr)
The image and the array point to the same memory in these two cases: modifying values in the one cause the other to see the modified values as well.
Note that we used np.asarray()
, not np.array()
. The latter would have copied the data, unless the
copy
argument is set to False
: np.array(img, copy=False)
is the same as np.asarray(img)
.
Dimension ordering
But do note the different interpretation of dimensions and ranges in NumPy and PyDIP, see Indexing:
- DIPlib reverses the dimensions from how they are treated by NumPy, with the first index being the x coordinate, the second y, etc. In NumPy, and consequently in libraries such as scikit-image or OpenCV, the x coordinate is always the last index.
- Ranges are always inclusive:
img[0:10]
represents 11 samples in DIPlib, not 10 as in NumPy, becauseimg[10]
is part of the range.
Thus, the following indexing operations are identical:
arr = np.zeros((10, 11, 12)) img = dip.Image(arr) arr[1, 2, 3] == img[3, 2, 1]
By reversing the indexing, we map an image with normal strides to an array in NumPy’s standard C-ordering.
The following evaluates to True
:
dip.Image( np.zeros((10,11,5,7)) ).HasNormalStrides()
When using a NumPy array as an image in a DIPlib function, it is implicitly cast to a dip.Image
object as above, and passed to the DIPlib function. This means that, whether the input is a NumPy
array or a DIPlib image, other function parameters that identify dimensions are always interpreted
in the same way. For example, the filter sizes are ordered (x, y, z), not (z, y, x) as they would be
ordered in scikit-image or other Python imaging libraries.
By calling dip.ReverseDimensions()
(which one should do only directly after loading the diplib
module to avoid confusing results), PyDIP is configured to reverse dimensions of all DIPlib images.
This means that the NumPy indexing order will be preserved, images will be indexed as img[z,y,x]
.
This has several surprising results, for example the direction of all angles is reversed, with
positive angles being counter-clockwise instead of clockwise. This option is intended to make it
easier to mix DIPlib functions into code that also uses e.g. scikit-image.
The tensor dimension
When casting a tensor image to a NumPy array, the tensor dimension will become the last array dimension.
When casting a NumPy array to a DIPlib image, there is no information about which dimension, if
any, is the tensor dimension. By default, the following heuristic is used: if the array has more than
two dimensions, and if the smaller of the last or the first array dimension has no more than 4 elements,
then that dimension will be the tensor dimension. The tensor will have a column vector shape (this is the
default tensor shape in DIPlib). The threshold of 4 was picked because it will handle correctly all color
images. This threshold can be adjusted using dip.SetTensorConversionThreshold()
. If set to 0, all arrays
will be converted to a scalar image.
For example, here img
is a 2D image of 11x10 pixels, and three samples per pixel:
arr = np.zeros((10, 11, 3)) img = dip.Image(arr)
The dip.Image()
constructor with a NumPy array as input takes an optional argument that determines
which axis, if any, is the tensor dimension. Add None
to force a scalar output image:
img2 = dip.Image(arr, None) img3 = dip.Image(arr, tensor_axis=0)
Here, img2
is a 3D scalar image, and img3
is a 3D image with 3x11 pixels, and 10 samples per pixel.
If the NumPy array represents a color image, you will have to explicitly set this information in
the new dip.Image
object:
img.SetColorSpace('sRGB')
Calling PyDIP functions with an array as output
The out
keyword argument to PyDIP functions is easier to use with a dip.Image
object than with a np.array
object. When passing in an image object, the image can be reforged (meaning its data segment can be reallocated,
changing the size, number of tensor elements, and/or data type). The same is not true for an input of a different
type, which will be converted to a protected dip.Image
object (see The “protect” flag).
This means that it must have the right sizes to receive the output of the function.
arr = np.zeros(10) dip.Gauss(img, out=arr) # error! the output has the wrong sizes, and cannot be reforged
This will raise a dip.ParameterError
: “Image is protected”. We need to make the array have the right sizes
for this to work:
arr = np.zeros((img.Size(1), img.Size(0), img.TensorElements())) dip.Gauss(img, out=arr)
Note that the filter is computed in the output type, 64-bit float in this case.
This is most useful to work in-place, where the feature is easy to use:
dip.Gauss(arr, out=arr)