bob.ip.common.data.transforms

Image transformations for our pipelines

The difference between the methods here and those from torchvision.transforms is that these support multiple simultaneous image inputs, which are required to feed segmentation networks (e.g. image and labels or masks). We also take care of data augmentation: random flipping and rotation must be applied identically to all input images, whereas color jittering, for example, is applied only to the input image.
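
For example, a segmentation pipeline could compose these transforms so that geometric operations touch both the image and its mask, while photometric ones only touch the image. This sketch assumes the transforms are called with the image and mask as separate positional arguments:

   from bob.ip.common.data.transforms import (
       ColorJitter,
       Compose,
       RandomHorizontalFlip,
       RandomRotation,
       Resize,
       ToTensor,
   )

   # Geometric operations are applied identically to every input;
   # ColorJitter only affects the first input (the image).
   transform = Compose(
       [
           Resize(512),
           RandomHorizontalFlip(p=0.5),
           RandomRotation(p=0.5),
           ColorJitter(p=0.5),
           ToTensor(),
       ]
   )

   # image and mask are PIL.Image.Image objects of the same size:
   # image_t, mask_t = transform(image, mask)  # assumed call convention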

Classes

AutoLevel16to8()

Converts multiple 16-bit images to 8-bit representations using "auto-level"

CenterCrop(size)

ColorJitter([p])

Randomly applies a color jitter transformation on the first image

Compose(transforms)

Crop(i, j, h, w)

Crops multiple images at the given coordinates.

GaussianBlur([p])

Randomly applies a Gaussian blur transformation on the first image

GetBoundingBox([image, reference])

Returns the image tensor and its corresponding target dict, given a mask.

Pad(padding[, fill, padding_mode])

RandomHorizontalFlip([p])

Randomly flips all input images horizontally

RandomRotation([p])

Randomly rotates all input images by the same amount

RandomVerticalFlip([p])

Randomly flips all input images vertically

Resize(size[, interpolation, max_size, ...])

ShrinkIntoSquare([reference, threshold])

Crops black borders and then resizes to a square with minimal padding

SingleAutoLevel16to8()

Converts a 16-bit image to 8-bit representation using "auto-level"

SingleCrop(i, j, h, w)

Crops one image at the given coordinates.

SingleToRGB()

Converts from any input format to RGB, using an ADAPTIVE conversion.

ToRGB()

Converts from any input format to RGB, using an ADAPTIVE conversion.

ToTensor()

TupleMixin()

Adds support for tuples of objects to torchvision transforms

class bob.ip.common.data.transforms.TupleMixin[source]

Bases: object

Adds support for tuples of objects to torchvision transforms
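
A minimal sketch of how such a mixin can work (illustrative, not the verbatim implementation): the mixin intercepts the call and delegates to the torchvision transform that follows it in the method resolution order, once per input object.

   import torchvision.transforms

   class TupleMixin:
       """Delegates the wrapped torchvision transform to each input."""

       def __call__(self, *args):
           # super() resolves to the torchvision transform that follows
           # this mixin in the MRO of the combined class below.
           return [super().__call__(k) for k in args]

   class CenterCrop(TupleMixin, torchvision.transforms.CenterCrop):
       pass

   # crops a PIL image and its mask with one call (assumed convention):
   # image_c, mask_c = CenterCrop(224)(image, mask)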

class bob.ip.common.data.transforms.CenterCrop(size)[source]

Bases: TupleMixin, CenterCrop

class bob.ip.common.data.transforms.Pad(padding, fill=0, padding_mode='constant')[source]

Bases: TupleMixin, Pad

class bob.ip.common.data.transforms.Resize(size, interpolation=InterpolationMode.BILINEAR, max_size=None, antialias=None)[source]

Bases: TupleMixin, Resize

class bob.ip.common.data.transforms.ToTensor[source]

Bases: TupleMixin, ToTensor

class bob.ip.common.data.transforms.Compose(transforms)[source]

Bases: Compose

class bob.ip.common.data.transforms.SingleCrop(i, j, h, w)[source]

Bases: object

Crops one image at the given coordinates.

Attributes
  • i (int) – upper pixel coordinate.

  • j (int) – left pixel coordinate.

  • h (int) – height of the cropped image.

  • w (int) – width of the cropped image.

class bob.ip.common.data.transforms.Crop(i, j, h, w)[source]

Bases: TupleMixin, SingleCrop

Crops multiple images at the given coordinates.

Attributes
  • i (int) – upper pixel coordinate.

  • j (int) – left pixel coordinate.

  • h (int) – height of the cropped image.

  • w (int) – width of the cropped image.
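
For example, to keep a 224x224 window whose top-left corner sits at row 10, column 20, of both the image and its mask (coordinates are illustrative, and the call convention is assumed):

   from bob.ip.common.data.transforms import Crop

   crop = Crop(i=10, j=20, h=224, w=224)
   # image_c, mask_c = crop(image, mask)  # assumed call convention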

class bob.ip.common.data.transforms.SingleAutoLevel16to8[source]

Bases: object

Converts a 16-bit image to 8-bit representation using “auto-level”

This transform assumes that the input image is gray-scale.

To auto-level, we calculate the maximum and the minimum of the image, and map that range onto the [0, 255] range of the destination image.
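
In other words, each pixel value v is mapped to round(255 * (v - min) / (max - min)). A minimal numpy sketch of this arithmetic (illustrative only; the transform itself operates on PIL images):

   import numpy

   def autolevel_16to8(img16):
       """Linearly maps the [min, max] range of an array onto [0, 255]."""
       img = img16.astype("float64")
       lo, hi = img.min(), img.max()
       scale = 255.0 / (hi - lo) if hi > lo else 0.0  # guard constant images
       return numpy.round((img - lo) * scale).astype("uint8")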

class bob.ip.common.data.transforms.AutoLevel16to8[source]

Bases: TupleMixin, SingleAutoLevel16to8

Converts multiple 16-bit images to 8-bit representations using “auto-level”

This transform assumes that the input images are gray-scale.

To auto-level, we calculate the maximum and the minimum of each image, and map that range onto the [0, 255] range of the destination image.

class bob.ip.common.data.transforms.SingleToRGB[source]

Bases: object

Converts from any input format to RGB, using an ADAPTIVE conversion.

This transform takes the input image and converts it to RGB using PIL.Image.Image.convert(), with mode='RGB' and all other defaults. This may be aggressive if applied to 16-bit images without further consideration.

class bob.ip.common.data.transforms.ToRGB[source]

Bases: TupleMixin, SingleToRGB

Converts from any input format to RGB, using an ADAPTIVE conversion.

This transform takes the input images and converts them to RGB using PIL.Image.Image.convert(), with mode='RGB' and all other defaults. This may be aggressive if applied to 16-bit images without further consideration.
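
In plain PIL terms, the per-image conversion is roughly equivalent to the following (the file name is hypothetical):

   from PIL import Image

   img = Image.open("input.png")  # may be mode "L", "I;16", "P", ...
   rgb = img.convert(mode="RGB")  # adaptive conversion, PIL defaults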

class bob.ip.common.data.transforms.RandomHorizontalFlip(p=0.5)[source]

Bases: RandomHorizontalFlip

Randomly flips all input images horizontally

class bob.ip.common.data.transforms.RandomVerticalFlip(p=0.5)[source]

Bases: RandomVerticalFlip

Randomly flips all input images vertically

class bob.ip.common.data.transforms.RandomRotation(p=0.5, **kwargs)[source]

Bases: RandomRotation

Randomly rotates all input images by the same amount

Unlike the current torchvision implementation, we also accept a probability for applying the rotation.

Parameters
  • p (float, Optional) – probability at which the operation is applied

  • **kwargs (dict) –

    Passed to the parent class. Notice that, if not set, we use the following defaults for the underlying torchvision transform:

    • degrees: 15

    • interpolation: torchvision.transforms.functional.InterpolationMode.BILINEAR
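
For example (keyword names are those of torchvision's RandomRotation):

   from bob.ip.common.data.transforms import RandomRotation

   rotate = RandomRotation()  # degrees=15, bilinear, applied with p=0.5
   rotate_wide = RandomRotation(p=0.3, degrees=30)  # overrides the defaults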

class bob.ip.common.data.transforms.ColorJitter(p=0.5, **kwargs)[source]

Bases: ColorJitter

Randomly applies a color jitter transformation on the first image

Notice that this transform extension, unlike others in this module, only affects the first image passed as an input argument. Unlike the current torchvision implementation, we also accept a probability for applying the jitter.

Parameters
  • p (float, Optional) – probability at which the operation is applied

  • **kwargs (dict) –

    Passed to the parent class. Notice that, if not set, we use the following defaults for the underlying torchvision transform:

    • brightness: 0.3

    • contrast: 0.3

    • saturation: 0.02

    • hue: 0.02
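
For example (keyword names are those of torchvision's ColorJitter):

   from bob.ip.common.data.transforms import ColorJitter

   jitter = ColorJitter()  # the defaults above, applied with p=0.5
   jitter_strong = ColorJitter(p=0.8, brightness=0.5, contrast=0.5)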

class bob.ip.common.data.transforms.ShrinkIntoSquare(reference=0, threshold=0)[source]

Bases: object

Crops black borders and then resizes to a square with minimal padding

This transform crops all the images by removing black pixels along the width and height until a non-black pixel is found. It then pads the cropped result back into the smallest enclosing square.

Parameters
  • reference (int, Optional) – Which part of the sample to use as the reference for cropping black borders. If not set, uses the first object in the sample (typically, the image).

  • threshold (int, Optional) – Threshold to use when deciding what counts as a “black” border
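
A rough numpy sketch of the idea for a single gray-scale array (illustrative; the actual class operates on the sample's reference object and keeps all images in sync, and its padding strategy may differ):

   import numpy

   def shrink_into_square(img, threshold=0):
       """Crops borders at or below the threshold, then pads to a square."""
       rows = numpy.where((img > threshold).any(axis=1))[0]
       cols = numpy.where((img > threshold).any(axis=0))[0]
       cropped = img[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
       side = max(cropped.shape)
       out = numpy.zeros((side, side), dtype=img.dtype)
       oy = (side - cropped.shape[0]) // 2  # center the content vertically
       ox = (side - cropped.shape[1]) // 2  # ... and horizontally
       out[oy:oy + cropped.shape[0], ox:ox + cropped.shape[1]] = cropped
       return out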

class bob.ip.common.data.transforms.GaussianBlur(p=0.5, **kwargs)[source]

Bases: GaussianBlur

Randomly applies a Gaussian blur transformation on the first image

Notice that this transform extension, unlike others in this module, only affects the first image passed as an input argument. Unlike the current torchvision implementation, we also accept a probability for applying the blur.

Parameters
  • p (float, Optional) – probability at which the operation is applied

  • **kwargs (dict) –

    Passed to the parent class. Notice that, if not set, we use the following defaults for the underlying torchvision transform:

    • kernel_size: (5, 5)

    • sigma: (0.1, 5)
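
For example (keyword names are those of torchvision's GaussianBlur):

   from bob.ip.common.data.transforms import GaussianBlur

   blur = GaussianBlur()  # kernel_size=(5, 5), sigma=(0.1, 5), p=0.5
   blur_heavy = GaussianBlur(p=0.7, kernel_size=(9, 9))  # overrides defaults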

class bob.ip.common.data.transforms.GetBoundingBox(image=0, reference=1)[source]

Bases: object

Returns the image tensor and its corresponding target dict, given a mask.

Parameters
  • image (int, Optional) – Which part of the sample is the image.

  • reference (int, Optional) – Which part of the sample to use for computing the bounding box. If not set, uses the second object in the sample (typically, the mask).
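
A rough sketch of how a bounding box can be derived from a binary mask tensor (illustrative; the exact layout of the returned target dict is not reproduced here):

   import torch

   def bbox_from_mask(mask):
       """Returns [xmin, ymin, xmax, ymax] enclosing non-zero mask pixels."""
       ys, xs = torch.nonzero(mask, as_tuple=True)
       return torch.stack([xs.min(), ys.min(), xs.max(), ys.max()])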