Image Processing Guide
The basic operations on images are affine transformations such as scaling, rotation, and cropping. Most operations can be called in two ways. The simpler API returns the processed image, while the second version accepts both input and output objects, which allows memory to be reused.
To compute a scaled version of an image, first create an output image of the desired size. In the example below, a small image is created and then a larger output image is allocated:
>>> A = numpy.array( [ [1, 2, 3], [4, 5, 6] ], dtype = numpy.uint8 ) # A small image of size 2x3
>>> print(A)
[[1 2 3]
 [4 5 6]]
>>> B = numpy.ndarray( (3, 5), dtype = numpy.float64 ) # A larger image of size 3x5
The bob.ip.base.scale() function is then called to up-scale the image:
>>> bob.ip.base.scale(A, B)
>>> numpy.allclose(B, [[1., 1.5, 2., 2.5, 3.], [2.5, 3., 3.5, 4., 4.5], [4., 4.5, 5., 5.5, 6.]])
True
This call bi-linearly interpolates image A into image B. The scaling factors can differ between the horizontal and vertical directions:
>>> C = numpy.ndarray( (2, 5), dtype = numpy.float64 )
>>> bob.ip.base.scale(A, C)
>>> numpy.allclose(C, [[1., 1.5, 2., 2.5, 3.], [4., 4.5, 5., 5.5, 6.]])
True
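To make the interpolation concrete, here is a minimal NumPy-only sketch of bilinear resampling. It is not Bob's implementation: the helper name scale_bilinear is hypothetical, and the coordinate mapping (corner pixels of input and output coincide) is an assumption inferred from the values shown above.

```python
import numpy as np

def scale_bilinear(src, out_shape):
    """Bilinear resampling sketch; corner pixels of input and output coincide."""
    src = np.asarray(src, dtype=np.float64)
    in_h, in_w = src.shape
    out_h, out_w = out_shape
    # Fractional source coordinates for every output row and column.
    ys = np.linspace(0.0, in_h - 1, out_h)
    xs = np.linspace(0.0, in_w - 1, out_w)
    # Indices of the neighboring source pixels (clamped at the border).
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    # Interpolation weights, shaped for broadcasting over the output grid.
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = src[np.ix_(y0, x0)] * (1 - wx) + src[np.ix_(y0, x1)] * wx
    bottom = src[np.ix_(y1, x0)] * (1 - wx) + src[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bottom * wy

A = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.uint8)
B = scale_bilinear(A, (3, 5))  # up-scale in both directions
C = scale_bilinear(A, (2, 5))  # up-scale horizontally only
```

For this small example the sketch yields the same values as the outputs above; for other sizes or border conditions, Bob's results may differ.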
The rotation of an image is slightly more difficult since the resulting image
size has to be computed in advance. To facilitate this there is a function
bob.ip.base.rotated_output_shape() which can be used:
>>> A = numpy.array( [ [1, 2, 3], [4, 5, 6] ], dtype = numpy.uint8 ) # A small image of size 2x3
>>> print(A)
[[1 2 3]
 [4 5 6]]
>>> rotated_shape = bob.ip.base.rotated_output_shape( A, 90 )
>>> print(rotated_shape)
(3, 2)
After the creation of the image in the desired size, the
bob.ip.base.rotate() function can be executed:
>>> A_rotated = numpy.ndarray( rotated_shape, dtype = numpy.float64 ) # A small image of rotated size
>>> bob.ip.base.rotate(A, A_rotated, 90) # execute the rotation
>>> numpy.allclose(A_rotated, [[3., 6.], [2., 5.], [1., 4.]])
True
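The output size of a rotation can be estimated from the bounding box of the rotated image rectangle. The sketch below uses a hypothetical helper, rotated_shape_sketch, whose rounding rule is an assumption and may differ from bob.ip.base.rotated_output_shape(). For the special case of 90 degrees, numpy.rot90 gives the same pixel arrangement as the result shown above.

```python
import math
import numpy as np

def rotated_shape_sketch(shape, angle_degrees):
    """Bounding-box size (height, width) of a rectangle rotated by the given angle."""
    height, width = shape
    angle = math.radians(angle_degrees)
    c, s = abs(math.cos(angle)), abs(math.sin(angle))
    # Rounding to the nearest integer is an assumption of this sketch.
    return (int(round(height * c + width * s)), int(round(height * s + width * c)))

A = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.uint8)
shape_90 = rotated_shape_sketch(A.shape, 90)

# Counter-clockwise rotation by exactly 90 degrees needs no interpolation.
rotated_90 = np.rot90(A).astype(np.float64)
```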
Complex image operations
Complex image operations are usually wrapped in classes. The usual workflow is to first create an object of the desired class, specifying parameters that are independent of the images to operate on, and then to apply the object to images. Objects that perform image operations usually overload the __call__ function, so that they can be used as if they were functions. Below we provide some examples.
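The construct-then-call pattern can be sketched with a small hypothetical class. Threshold is not part of Bob; it only illustrates how parameters are fixed at construction time, and how a call can either return a new image or fill a caller-provided output array.

```python
import numpy as np

class Threshold:
    """Hypothetical image operation: binarize an image at a fixed level."""

    def __init__(self, level):
        # Parameters that do not depend on any particular image.
        self.level = level

    def __call__(self, image, output=None):
        # Allocate the result unless the caller provides a buffer to reuse.
        if output is None:
            output = np.empty(image.shape, dtype=np.uint8)
        output[...] = image >= self.level
        return output

threshold = Threshold(level=4)
image = np.array([[1, 2, 3], [4, 5, 6]])

result = threshold(image)                    # simple API: returns a new image
buffer = np.zeros(image.shape, dtype=np.uint8)
threshold(image, buffer)                     # second API: fills an existing array
```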
One simple example of image filtering is to apply a Gaussian blur filter to an image.
This is done by first creating an object of the bob.ip.base.Gaussian class:
>>> filter = bob.ip.base.Gaussian(sigma = (3., 3.), radius = (5, 5))
Now, let’s see what happens to a small test image:
import bob.ip.base
import numpy
import math

# create test image
image = numpy.zeros((21,21))
for i in range(21):
  image[i,i] = 255
  image[-i,i] = 255

# perform Gaussian filtering
gaussian = bob.ip.base.Gaussian(sigma = (3., 3.), radius = (5, 5))
smoothed = gaussian(image)

# plot results
from matplotlib import pyplot
pyplot.figure(figsize=(8,4))
pyplot.subplot(121) ; pyplot.imshow(image, cmap='gray') ; pyplot.title('Image')
pyplot.subplot(122) ; pyplot.imshow(smoothed, cmap='gray') ; pyplot.title('Smoothed')
pyplot.show()
The image of the cross has now been nicely smoothed.
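For intuition about what such a filter computes, the following NumPy-only sketch builds a truncated, normalized 1D Gaussian kernel and applies it separably along both axes. The helper names are hypothetical, and the zero-padded border handling is an assumption; Bob's Gaussian class may treat image borders differently.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius):
    """Normalized Gaussian weights on the integer offsets -radius..radius."""
    offsets = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    return kernel / kernel.sum()

def smooth_separable(image, sigma, radius):
    """Filter along axis 0, then along axis 1 (zero padding at the borders)."""
    kernel_y = gaussian_kernel_1d(sigma[0], radius[0])
    kernel_x = gaussian_kernel_1d(sigma[1], radius[1])
    image = np.asarray(image, dtype=np.float64)
    columns_done = np.apply_along_axis(np.convolve, 0, image, kernel_y, mode="same")
    return np.apply_along_axis(np.convolve, 1, columns_done, kernel_x, mode="same")

# A cross made of the two diagonals of a 21x21 image.
image = np.zeros((21, 21))
for i in range(21):
    image[i, i] = 255
    image[20 - i, i] = 255

kernel = gaussian_kernel_1d(3.0, 5)
smoothed = smooth_separable(image, sigma=(3.0, 3.0), radius=(5, 5))
```

A radius of 5 yields 11 filter taps per axis. Because the weights sum to one, the filter spreads intensity over neighboring pixels without amplifying it.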
A second example uses Sobel filters to extract edges from an image. Two types of Sobel filters exist: the vertical filter \(S_y\) and the horizontal filter \(S_x\).
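A commonly used form of the 3x3 Sobel kernels is the following (the exact sign and normalization conventions used by Bob are an assumption here):

```latex
S_y = \begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{pmatrix}
\qquad
S_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix}
```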
Both filters can be applied at the same time using the
bob.ip.base.sobel() function, which stores the result of \(S_y\) in the first layer and the result of \(S_x\) in the second layer:
>>> image = numpy.zeros((21,21))
>>> image[5:16, 5:16] = 1
>>> sobel = bob.ip.base.sobel(image)
>>> sobel.shape
(2, 21, 21)
Interestingly, the vertical filter \(S_y\) extracts horizontal edges, while the horizontal filter \(S_x\) extracts vertical edges. In fact, the vector \((s_y, s_x)^T\) contains the gradient information at a given location in the image. To get the direction-independent strength of the edge at that point, compute the Euclidean length of the gradient. To compute direction-dependent results, apply a rotation matrix to the gradient vector.
import bob.ip.base
import numpy
import math

# create test image
image = numpy.zeros((21,21))
image[5:16, 5:16] = 1

# perform Sobel filtering; sobel[0] holds the S_y response, sobel[1] the S_x response
sobel = bob.ip.base.sobel(image)

# compute direction-independent and direction-dependent results
abs_sobel = numpy.sqrt(numpy.square(sobel[0]) + numpy.square(sobel[1]))
angle = 45.
rot_sobel = math.sin(angle*math.pi/180) * sobel[0] + math.cos(angle*math.pi/180) * sobel[1]

# plot results
from matplotlib import pyplot
pyplot.figure(figsize=(20,4))
pyplot.subplot(151) ; pyplot.imshow(image, cmap='gray') ; pyplot.title('Image')
pyplot.subplot(152) ; pyplot.imshow(sobel[0], cmap='gray') ; pyplot.title('Sobel - Y')
pyplot.subplot(153) ; pyplot.imshow(sobel[1], cmap='gray') ; pyplot.title('Sobel - X')
pyplot.subplot(154) ; pyplot.imshow(abs_sobel, cmap='gray') ; pyplot.title('Sobel - Abs')
pyplot.subplot(155) ; pyplot.imshow(rot_sobel, cmap='gray') ; pyplot.title(r'Sobel - %3.0f$^\circ$'%angle)
pyplot.show()
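The same quantities can be reproduced without Bob, which makes the roles of the two layers explicit. This is a sketch under stated assumptions: the kernels use the common Sobel form, correlate3x3 is a hypothetical helper, and the output covers only positions where the 3x3 window fits (so it is 19x19, whereas bob.ip.base.sobel() returns layers of the input size).

```python
import math
import numpy as np

# Common Sobel kernels; sign and scaling conventions are assumptions of this sketch.
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float64)
SOBEL_X = SOBEL_Y.T

def correlate3x3(image, kernel):
    """Correlate a 2D image with a 3x3 kernel where the window fits entirely."""
    height, width = image.shape
    out = np.zeros((height - 2, width - 2))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * image[dy:dy + height - 2, dx:dx + width - 2]
    return out

image = np.zeros((21, 21))
image[5:16, 5:16] = 1

grad_y = correlate3x3(image, SOBEL_Y)  # responds to horizontal edges
grad_x = correlate3x3(image, SOBEL_X)  # responds to vertical edges

# Direction-independent edge strength: Euclidean length of the gradient.
magnitude = np.hypot(grad_y, grad_x)

# Response along a direction of 45 degrees: projection of the gradient.
angle = math.radians(45.0)
directional = math.sin(angle) * grad_y + math.cos(angle) * grad_x
```

Output index (i, j) corresponds to input pixel (i + 1, j + 1). On the top edge of the square only grad_y is non-zero, and on the left edge only grad_x is.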
Normalizing face images according to eye positions
For many face biometrics applications, for instance face recognition, the images are
geometrically normalized according to the eye positions. In such a case, the
first thing to do is to create an object of the
bob.ip.base.FaceEyesNorm class defining the image
properties of the geometrically normalized image (that will be generated when
applying the object):
>>> face_eyes_norm = bob.ip.base.FaceEyesNorm(eyes_distance = 65, crop_size = (128, 128), eyes_center = (32, 63.5))
Now, we have set up our object to generate images of size (128, 128) in which both eyes lie on row 32, at the pixel positions (32, 31) and (32, 96), given in (y, x) order. Since the left eye usually has a higher x-coordinate than the right eye, the right eye ends up at (32, 31) and the left eye at (32, 96). Afterwards, this object is used to geometrically normalize the face, given the eye positions in the original face image:
>>> face_image = bob.io.base.load( image_path )
>>> cropped_image = numpy.ndarray( (128, 128), dtype = numpy.float64 )
>>> face_eyes_norm( face_image, cropped_image, right_eye = (66, 47), left_eye = (62, 70) )
Now, let’s have a look at the original and normalized face:
import numpy
import math
import bob.io.base
import bob.ip.base
from bob.io.base.test_utils import datafile

# load a test image
face_image = bob.io.base.load(datafile('image_r10.hdf5', 'bob.ip.base', 'data/affine'))

# create FaceEyesNorm class
face_eyes_norm = bob.ip.base.FaceEyesNorm(eyes_distance = 65, crop_size = (128, 128), eyes_center = (32, 63.5))

# normalize image
normalized_image = face_eyes_norm( face_image, right_eye = (66, 47), left_eye = (62, 70) )

# plot results, including eye locations in original and normalized image
from matplotlib import pyplot
pyplot.figure(figsize=(8,4))
pyplot.subplot(121) ; pyplot.imshow(face_image, cmap='gray') ; pyplot.plot([47, 70], [66, 62], 'rx', ms=10, mew=2); pyplot.axis('tight'); pyplot.title('Original Image')
pyplot.subplot(122) ; pyplot.imshow(normalized_image, cmap='gray') ; pyplot.plot([31, 96], [32, 32], 'rx', ms=10, mew=2); pyplot.axis('tight'); pyplot.title('Cropped Image')
pyplot.show()
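The geometry behind this normalization is a rotation, a scaling, and a translation derived from the two eye positions. The sketch below only computes that point mapping; it does not resample any pixels. The function eye_alignment is hypothetical, and it is an assumption that FaceEyesNorm derives its transform in the same way. All positions are in (y, x) order.

```python
import math

def eye_alignment(right_eye, left_eye, eyes_distance, eyes_center):
    """Return (angle in degrees, scale, point mapping) aligning the eyes horizontally."""
    dy = left_eye[0] - right_eye[0]
    dx = left_eye[1] - right_eye[1]
    angle = math.atan2(dy, dx)                    # tilt of the eye axis
    scale = eyes_distance / math.hypot(dy, dx)    # desired / measured eye distance
    source_center = ((left_eye[0] + right_eye[0]) / 2.0,
                     (left_eye[1] + right_eye[1]) / 2.0)
    cos_a, sin_a = math.cos(angle), math.sin(angle)

    def transform(point):
        # Shift to the eye center, rotate by -angle, scale, shift to the target center.
        y = point[0] - source_center[0]
        x = point[1] - source_center[1]
        x_rot = cos_a * x + sin_a * y
        y_rot = -sin_a * x + cos_a * y
        return (eyes_center[0] + scale * y_rot, eyes_center[1] + scale * x_rot)

    return math.degrees(angle), scale, transform

angle_deg, scale, transform = eye_alignment(
    right_eye=(66, 47), left_eye=(62, 70), eyes_distance=65, eyes_center=(32, 63.5))

new_right = transform((66, 47))
new_left = transform((62, 70))
```

With the eye positions used above, the mapped eyes land on row 32, at x = 31 for the right eye and x = 96 for the left eye, which matches the markers plotted on the cropped image.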
Simple feature extraction
Some simple feature extraction functionality is also included in the
bob.ip.base module. Here is a simple example of how to extract
local binary patterns (LBP) with 8 neighbors from an image:
>>> lbp_extractor = bob.ip.base.LBP(8)
You can either get the LBP feature for a single point by specifying the position:
>>> lbp_local = lbp_extractor ( cropped_image, (69, 62) )
>>> # print the binary representation of the LBP
>>> print(bin ( lbp_local ))
0b1111000
or you can extract the LBP features for all pixels in the image. In this case
you need to get the required shape of the output image using the
bob.ip.base.LBP feature extractor:
>>> lbp_output_image_shape = lbp_extractor.lbp_shape(cropped_image)
>>> print(lbp_output_image_shape)
(126, 126)
>>> lbp_output_image = numpy.ndarray ( lbp_output_image_shape, dtype = numpy.uint16 )
>>> lbp_extractor ( cropped_image, lbp_output_image )
>>> # print the binary representation of the pixel at the same location as above;
>>> # note that the index is shifted by 1 since the lbp image is smaller than the original
>>> print(bin ( lbp_output_image [ 68, 61 ] ))
0b1111000
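To illustrate how an 8-neighbor LBP code is formed, and why the output image loses a one-pixel border, here is a minimal NumPy sketch. It is not Bob's implementation: the helper names are hypothetical, and the neighbor ordering and the use of >= for the comparison are assumptions, so the bit patterns need not match those of bob.ip.base.LBP.

```python
import numpy as np

# Neighbor offsets (dy, dx), clockwise from the top-left pixel.
# This ordering is an assumption of the sketch.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, y, x):
    """8-bit code: one bit per neighbor, set if the neighbor is >= the center."""
    center = image[y, x]
    code = 0
    for dy, dx in OFFSETS:
        code = (code << 1) | int(image[y + dy, x + dx] >= center)
    return code

def lbp_image(image):
    """LBP codes for all pixels that have a complete 3x3 neighborhood."""
    height, width = image.shape
    out = np.zeros((height - 2, width - 2), dtype=np.uint16)
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            out[y - 1, x - 1] = lbp_code(image, y, x)  # index shifted by 1
    return out

image = np.arange(25).reshape(5, 5)
single_code = lbp_code(image, 2, 2)
codes = lbp_image(image)
```

The code computed for pixel (2, 2) of the input is stored at index (1, 1) of the output, which mirrors the shift from (69, 62) to (68, 61) in the example above.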
LBP-TOP [Zhao2007] features for temporal texture analysis can be extracted as well. First, define the LBP operator for each plane:
>>> import bob.ip.base
>>> import numpy
>>> numpy.random.seed(10)
>>> #Defining the lbp operator for each plane
>>> lbp_xy = bob.ip.base.LBP(8,1)
>>> lbp_xt = bob.ip.base.LBP(8,1)
>>> lbp_yt = bob.ip.base.LBP(8,1)
>>> lbptop = bob.ip.base.LBPTop(lbp_xy, lbp_xt, lbp_yt)
Next, define the 3D test image and create the output containers for each plane:
>>> img3d = (numpy.random.rand(3,5,5)*100).astype('uint16')
>>> t = int(max(lbp_xt.radius, lbp_yt.radius))
>>> w = int(img3d.shape[1] - lbp_xy.radii[0]*2)
>>> h = int(img3d.shape[2] - lbp_xy.radii[1]*2)
>>> output_xy = numpy.zeros((t,w,h),dtype='uint16')
>>> output_xt = numpy.zeros((t,w,h),dtype='uint16')
>>> output_yt = numpy.zeros((t,w,h),dtype='uint16')
Finally, extract the LBP codes for each plane:
>>> lbptop(img3d,output_xy, output_xt, output_yt)
>>> print(output_xy)
[[[ 89   0 235]
  [255  72 255]
  [ 40  95   2]]]
>>> print(output_xt)
[[[ 55   2 135]
  [223 130 119]
  [  0 253  64]]]
>>> print(output_yt)
[[[ 45   0 173]
  [247   1 255]
  [130 127  64]]]
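The three planes can be pictured as orthogonal slices through the same voxel of the image sequence. The following NumPy sketch only shows the slicing. It assumes that the first axis is time, and which of the two remaining axes is called x or y is also an assumption; the random values differ from those used above.

```python
import numpy as np

rng = np.random.default_rng(10)
volume = (rng.random((3, 5, 5)) * 100).astype(np.uint16)  # (time, axis 1, axis 2)

# Central voxel of the sequence.
t, i, j = 1, 2, 2

plane_xy = volume[t, :, :]  # spatial plane at time t, shape (5, 5)
plane_xt = volume[:, i, :]  # one spatial axis over time, shape (3, 5)
plane_yt = volume[:, :, j]  # the other spatial axis over time, shape (3, 5)
```

All three slices contain the central voxel, so an LBP operator applied to each of them describes the spatial texture and the temporal change around the same point.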