Image Processing Guide

Introduction

The basic operations on images are the affine image conversions like image scaling, rotation, and cutting. For most of the operations, two ways of executing the functions exist. The easier API simply returns the processed image, but the second version accepts input and output objects (to allow memory reuse).

Scaling images

To compute a scaled version of the image, simply create the image at the desired scale. For instance, in the example below an image is up-scaled by first creating the image and then initializing the larger image:

>>> A = numpy.array( [ [1, 2, 3], [4, 5, 6] ], dtype = numpy.uint8 ) # A small image of size 2x3
>>> print(A)
[[1 2 3]
 [4 5 6]]
>>> B = numpy.ndarray( (3, 5), dtype = numpy.float64 )               # A larger image of size 3x5

the bob.ip.base.scale() function of Bob is then called to up-scale the image:

>>> bob.ip.base.scale(A, B)
>>> numpy.allclose(B, [[ 1.,1.5, 2., 2.5,  3.],[ 2.5, 3.,3.5, 4., 4.5],[ 4.,4.5, 5., 5.5,6. ]])
True

which bi-linearly interpolates image A to image B. Of course, scaling factors can be different in horizontal and vertical direction:

>>> C = numpy.ndarray( (2, 5), dtype = numpy.float64 )
>>> bob.ip.base.scale(A, C)
>>> numpy.allclose(C, [[1., 1.5,  2., 2.5, 3.],[4., 4.5, 5., 5.5, 6. ]])
True

Rotating images

The rotation of an image is slightly more difficult since the resulting image size has to be computed in advance. To facilitate this there is a function bob.ip.base.rotated_output_shape() which can be used:

>>> A = numpy.array( [ [1, 2, 3], [4, 5, 6] ], dtype = numpy.uint8 ) # A small image of size 3x3
>>> print(A)
[[1 2 3]
 [4 5 6]]
>>> rotated_shape = bob.ip.base.rotated_output_shape( A, 90 )
>>> print(rotated_shape)
(3, 2)

After the creation of the image in the desired size, the bob.ip.base.rotate() function can be executed:

>>> A_rotated = numpy.ndarray( rotated_shape, dtype = numpy.float64 ) # A small image of rotated size
>>> bob.ip.base.rotate(A, A_rotated, 90)      # execute the rotation
>>> numpy.allclose(A_rotated, [[ 3.,  6.],[ 2.,  5.],[ 1.,  4.]])
True

Complex image operations

Complex image operations are usually wrapped up by classes. The usual work flow is to first generate an object of the desired class, specifying parameters that are independent on the images to operate, and to second use the class on images. Usually, objects that perform image operations have the __call__ function overloaded, so that one simply can use it as if it were functions. Below we provide some examples.

Image filtering

One simple example of image filtering is to apply a Gaussian blur filter to an image. This can be easily done by first creating an object of the bob.ip.base.Gaussian class:

>>> filter = bob.ip.base.Gaussian(sigma = (3., 3.), radius = (5, 5))

Now, let’s see what happens to a small test image:

import bob.ip.base
import numpy
import math

# create test image
image = numpy.zeros((21,21))
for i in range(21):
  image[i,i] = 255
  image[-i,i] = 255

# perform Gaussian filtering
gaussian = bob.ip.base.Gaussian(sigma = (3., 3.), radius = (5, 5))
smoothed = gaussian(image)

# plot results
from matplotlib import pyplot
pyplot.figure(figsize=(8,4))
pyplot.subplot(121) ; pyplot.imshow(image, cmap='gray')    ; pyplot.title('Image')
pyplot.subplot(122) ; pyplot.imshow(smoothed, cmap='gray') ; pyplot.title('Smoothed')
pyplot.show()

(Source code, png, hires.png, pdf)

_images/gaussian.png

The image of the cross has now been nicely smoothed.

A second example uses Sobel filters to extract edges from an image. Two types of Sobel filters exist: The vertical filter \(S_y\) and the horizontal filter \(S_x\):

\[\begin{split}S_y = \left\lgroup\begin{array}{ccc} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{array}\right\rgroup \qquad S_x = \left\lgroup\begin{array}{ccc} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{array}\right\rgroup\end{split}\]

Both filters can be applied at the same time using the bob.ip.base.sobel() function, where the result of \(S_y\) will be put to the first layer and \(S_x\) to the second layer.

>>> image = numpy.zeros((21,21))
>>> image[5:16, 5:16] = 1
>>> sobel = bob.ip.base.sobel(image)
>>> sobel.shape
(2, 21, 21)

Interestingly, the vertical filter \(S_y\) extracts horizontal edges, while the \(S_x\) extracts vertical edges. In fact, the vector \((s_y, s_x)^T\) contains the gradient information at a given location in the image. To get the direction-independent strength of the edge at that point, simply compute the Euclidean length of the gradient. To compute rotation-dependent results, use the rotation matrix on the gradient vector.

import bob.ip.base
import numpy
import math

# create test image
image = numpy.zeros((21,21))
image[5:16, 5:16] = 1

# perform Sobel filtering
sobel = bob.ip.base.sobel(image)

# compute direction-independent and direction-dependent results
abs_sobel = numpy.sqrt(numpy.square(sobel[0]) + numpy.square(sobel[1]))
angle = 45.
rot_sobel = math.sin(angle*math.pi/180) * sobel[0] + math.cos(angle*math.pi/180) * sobel[1]

# plot results
from matplotlib import pyplot
pyplot.figure(figsize=(20,4))
pyplot.subplot(151) ; pyplot.imshow(image, cmap='gray')     ; pyplot.title('Image')
pyplot.subplot(152) ; pyplot.imshow(sobel[0], cmap='gray')  ; pyplot.title('Sobel - Y')
pyplot.subplot(153) ; pyplot.imshow(sobel[1], cmap='gray')  ; pyplot.title('Sobel - X')
pyplot.subplot(154) ; pyplot.imshow(abs_sobel, cmap='gray') ; pyplot.title('Sobel - Abs')
pyplot.subplot(155) ; pyplot.imshow(rot_sobel, cmap='gray') ; pyplot.title('Sobel - %3.0f$^\circ$'%angle)
pyplot.show()

(Source code, png, hires.png, pdf)

_images/sobel.png

Normalizing face images according to eye positions

For many face biometrics applications, for instance face recognition, the images are geometrically normalized according to the eye positions. In such a case, the first thing to do is to create an object of the bob.ip.base.FaceEyesNorm class defining the image properties of the geometrically normalized image (that will be generated when applying the object):

>>> face_eyes_norm = bob.ip.base.FaceEyesNorm(eyes_distance = 65, crop_size = (128, 128), eyes_center = (32, 63.5))

Now, we have set up our object to generate images of size (128, 128) that will put the left eye at the pixel position (32, 31) and the right eye at the position (32, 96). Afterwards, this object is used to geometrically normalize the face, given the eye positions in the original face image. Note that the left eye usually has a higher x-coordinate than the right eye:

>>> face_image = bob.io.base.load( image_path )
>>> cropped_image = numpy.ndarray( (128, 128), dtype = numpy.float64 )
>>> face_eyes_norm( face_image, cropped_image, right_eye = (66, 47), left_eye = (62, 70) )

Now, let’s have a look at the original and normalized face:

import numpy
import math
import bob.io.base
import bob.ip.base
from bob.io.base.test_utils import datafile

# load a test image
face_image = bob.io.base.load(datafile('image_r10.hdf5', 'bob.ip.base', 'data/affine'))

# create FaceEyesNorm class
face_eyes_norm = bob.ip.base.FaceEyesNorm(eyes_distance = 65, crop_size = (128, 128), eyes_center = (32, 63.5))

# normalize image
normalized_image = face_eyes_norm( face_image, right_eye = (66, 47), left_eye = (62, 70) )

# plot results, including eye locations in original and normalized image
from matplotlib import pyplot
pyplot.figure(figsize=(8,4))
pyplot.subplot(121) ; pyplot.imshow(face_image, cmap='gray')       ; pyplot.plot([47, 70], [66, 62], 'rx', ms=10, mew=2); pyplot.axis('tight'); pyplot.title('Original Image')
pyplot.subplot(122) ; pyplot.imshow(normalized_image, cmap='gray') ; pyplot.plot([31, 96], [32, 32], 'rx', ms=10, mew=2); pyplot.axis('tight'); pyplot.title('Cropped Image')
pyplot.show()

(Source code, png, hires.png, pdf)

_images/face_eyes_norm.png

Simple feature extraction

Some simple feature extraction functionality is also included in the bob.ip.base module. Here is some simple example, how to extract local binary patterns (LBP) with 8 neighbors from an image:

>>> lbp_extractor = bob.ip.base.LBP(8)

You can either get the LBP feature for a single point by specifying the position:

>>> lbp_local = lbp_extractor ( cropped_image, (69, 62) )
>>> # print the binary representation of the LBP
>>> print(bin ( lbp_local ))
0b1111000

or you can extract the LBP features for all pixels in the image. In this case you need to get the required shape of the output image using the bob.ip.base.LBP feature extractor:

>>> lbp_output_image_shape = lbp_extractor.lbp_shape(cropped_image)
>>> print(lbp_output_image_shape)
(126, 126)
>>> lbp_output_image = numpy.ndarray ( lbp_output_image_shape, dtype = numpy.uint16 )
>>> lbp_extractor ( cropped_image,  lbp_output_image )
>>> # print the binary representation of the pixel at the same location as above;
>>> # note that the index is shifted by 1 since the lbp image is smaller than the original
>>> print(bin ( lbp_output_image [ 68, 61 ] ))
0b1111000

LBP-TOP extraction

LBP-TOP [Zhao2007] extraction for temporal texture analysis.

>>> import bob.ip.base
>>> import numpy
>>> numpy.random.seed(10)
>>> #Defining the lbp operator for each plane
>>> lbp_xy = bob.ip.base.LBP(8,1)
>>> lbp_xt = bob.ip.base.LBP(8,1)
>>> lbp_yt = bob.ip.base.LBP(8,1)
>>> lbptop = bob.ip.base.LBPTop(lbp_xy, lbp_xt, lbp_yt)

Defining the test 3D image and creating the containers for the outputs in each plane

>>> img3d = (numpy.random.rand(3,5,5)*100).astype('uint16')
>>> t = int(max(lbp_xt.radius, lbp_yt.radius))
>>> w = int(img3d.shape[1] - lbp_xy.radii[0]*2)
>>> h = int(img3d.shape[2] - lbp_xy.radii[1]*2)
>>> output_xy = numpy.zeros((t,w,h),dtype='uint16')
>>> output_xt = numpy.zeros((t,w,h),dtype='uint16')
>>> output_yt = numpy.zeros((t,w,h),dtype='uint16')

Extracting the bins for each plane

>>> lbptop(img3d,output_xy, output_xt, output_yt)
>>> print(output_xy)
[[[ 89   0 235]
[255  72 255]
[ 40  95   2]]]
>>> print(output_xt)
[[[ 55   2 135]
[223 130 119]
[  0 253  64]]]
>>> print(output_yt)
[[[ 45   0 173]
[247   1 255]
[130 127  64]]]