bob.ip.binseg.data.transforms

Image transformations for our pipelines

The difference between the transforms here and those from torchvision.transforms is that ours support multiple simultaneous image inputs, which are required to feed segmentation networks (e.g., an image together with its labels or masks). We also take care of data augmentation: random flipping and rotation must be applied identically across all input images, while color jittering, for example, is applied only to the input image.
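
For instance, a geometric augmentation pipeline can be applied to an image and its ground-truth mask in one call, keeping both aligned. A minimal sketch (the file names and the exact composition are illustrative assumptions, not part of this module):

    import PIL.Image
    from bob.ip.binseg.data.transforms import (
        ColorJitter,
        Compose,
        RandomHorizontalFlip,
        RandomRotation,
        ToTensor,
    )

    # Hypothetical inputs: any pair of same-sized PIL images works.
    image = PIL.Image.open("retina.png").convert("RGB")
    mask = PIL.Image.open("vessels.png").convert("1")

    transform = Compose([
        RandomHorizontalFlip(),  # flips image AND mask together
        RandomRotation(),        # rotates both by the same random angle
        ColorJitter(),           # jitters colors of the FIRST input only
        ToTensor(),              # converts both inputs to tensors
    ])

    # All inputs are passed simultaneously, so random geometric
    # augmentations stay aligned across image and mask.
    image_t, mask_t = transform(image, mask)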

Classes

AutoLevel16to8(): Converts multiple 16-bit images to 8-bit representations using "auto-level"
CenterCrop(size)
ColorJitter([p]): Randomly applies a color jitter transformation to the first image
Compose(transforms)
Crop(i, j, h, w): Crops multiple images at the given coordinates.
Pad(padding[, fill, padding_mode])
RandomHorizontalFlip([p]): Randomly flips all input images horizontally
RandomRotation([p]): Randomly rotates all input images by the same amount
RandomVerticalFlip([p]): Randomly flips all input images vertically
Resize(size[, interpolation, max_size, ...])
ResizeCrop(): Crops black borders from all input images
SingleAutoLevel16to8(): Converts a 16-bit image to an 8-bit representation using "auto-level"
SingleCrop(i, j, h, w): Crops one image at the given coordinates.
SingleToRGB(): Converts from any input format to RGB, using an ADAPTIVE conversion.
ToRGB(): Converts from any input format to RGB, using an ADAPTIVE conversion.
ToTensor()
TupleMixin(): Adds tuple-of-objects support to torchvision transforms

class bob.ip.binseg.data.transforms.TupleMixin[source]

Bases: object

Adds tuple-of-objects support to torchvision transforms, so that one transform instance can be applied to several images at once.
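
Conceptually, the mixin intercepts the call and maps the parent transform over every input. A rough sketch of the idea (not necessarily this module's exact code):

    class TupleMixin:
        """Applies the parent transform to every input, not just one."""

        def __call__(self, *args):
            # Resolve the (torchvision) parent's __call__ through the MRO,
            # then apply it to each input in turn.
            parent_call = super().__call__
            return [parent_call(a) for a in args]

Combined with a torchvision class, e.g. class CenterCrop(TupleMixin, transforms.CenterCrop), the mixin's __call__ runs first and dispatches to the torchvision implementation for each image.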

class bob.ip.binseg.data.transforms.CenterCrop(size)[source]

Bases: bob.ip.binseg.data.transforms.TupleMixin, torchvision.transforms.transforms.CenterCrop

training: bool

class bob.ip.binseg.data.transforms.Pad(padding, fill=0, padding_mode='constant')[source]

Bases: bob.ip.binseg.data.transforms.TupleMixin, torchvision.transforms.transforms.Pad

training: bool

class bob.ip.binseg.data.transforms.Resize(size, interpolation=InterpolationMode.BILINEAR, max_size=None, antialias=None)[source]

Bases: bob.ip.binseg.data.transforms.TupleMixin, torchvision.transforms.transforms.Resize

training: bool

class bob.ip.binseg.data.transforms.ToTensor[source]

Bases: bob.ip.binseg.data.transforms.TupleMixin, torchvision.transforms.transforms.ToTensor

class bob.ip.binseg.data.transforms.Compose(transforms)[source]

Bases: torchvision.transforms.transforms.Compose

class bob.ip.binseg.data.transforms.SingleCrop(i, j, h, w)[source]

Bases: object

Crops one image at the given coordinates.

i (int): upper pixel coordinate.
j (int): left pixel coordinate.
h (int): height of the cropped image.
w (int): width of the cropped image.
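
For example, to extract a fixed window (coordinates below are arbitrary), assuming the transform is applied by calling it on a PIL image:

    import PIL.Image
    from bob.ip.binseg.data.transforms import SingleCrop

    image = PIL.Image.open("fundus.png")  # hypothetical input

    # Keep a 512 (height) x 768 (width) window whose top-left corner
    # sits at row 30, column 50 of the original image.
    crop = SingleCrop(i=30, j=50, h=512, w=768)
    patch = crop(image)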

class bob.ip.binseg.data.transforms.Crop(i, j, h, w)[source]

Bases: bob.ip.binseg.data.transforms.TupleMixin, bob.ip.binseg.data.transforms.SingleCrop

Crops multiple images at the given coordinates.

i (int): upper pixel coordinate.
j (int): left pixel coordinate.
h (int): height of the cropped image.
w (int): width of the cropped image.

class bob.ip.binseg.data.transforms.SingleAutoLevel16to8[source]

Bases: object

Converts a 16-bit image to 8-bit representation using “auto-level”

This transform assumes that the input image is grayscale.

To auto-level, we calculate the minimum and the maximum of the image and linearly map that range onto the [0, 255] range of the destination image.
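
In other words, every pixel value v becomes v' = (v - min) * 255 / (max - min). A NumPy sketch of this mapping (an illustration of the idea, not necessarily this class's exact code):

    import numpy
    import PIL.Image

    def autolevel_16to8(img):
        """Linearly maps a 16-bit grayscale image onto the full 8-bit range."""
        arr = numpy.asarray(img).astype(numpy.float64)
        lo, hi = arr.min(), arr.max()
        # Assumes hi > lo, i.e. the image is not completely flat.
        out = (arr - lo) * 255.0 / (hi - lo)
        return PIL.Image.fromarray(out.round().astype(numpy.uint8), mode="L")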

class bob.ip.binseg.data.transforms.AutoLevel16to8[source]

Bases: bob.ip.binseg.data.transforms.TupleMixin, bob.ip.binseg.data.transforms.SingleAutoLevel16to8

Converts multiple 16-bit images to 8-bit representations using “auto-level”

This transform assumes that the input images are grayscale.

To auto-level, we calculate the minimum and the maximum of each image and linearly map that range onto the [0, 255] range of the destination image.

class bob.ip.binseg.data.transforms.SingleToRGB[source]

Bases: object

Converts from any input format to RGB, using an ADAPTIVE conversion.

This transform takes the input image and converts it to RGB using PIL.Image.Image.convert(), with mode='RGB' and all other defaults. This may be aggressive if applied to 16-bit images without further considerations.
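
The conversion boils down to a single PIL call; a minimal illustration:

    import PIL.Image

    image = PIL.Image.open("input.png")  # hypothetical input, any mode
    rgb = image.convert(mode="RGB")      # adaptive conversion done by PIL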

class bob.ip.binseg.data.transforms.ToRGB[source]

Bases: bob.ip.binseg.data.transforms.TupleMixin, bob.ip.binseg.data.transforms.SingleToRGB

Converts from any input format to RGB, using an ADAPTIVE conversion.

This transform takes the input images and converts them to RGB using PIL.Image.Image.convert(), with mode='RGB' and all other defaults. This may be aggressive if applied to 16-bit images without further considerations.

class bob.ip.binseg.data.transforms.RandomHorizontalFlip(p=0.5)[source]

Bases: torchvision.transforms.transforms.RandomHorizontalFlip

Randomly flips all input images horizontally

training: bool

class bob.ip.binseg.data.transforms.RandomVerticalFlip(p=0.5)[source]

Bases: torchvision.transforms.transforms.RandomVerticalFlip

Randomly flips all input images vertically

training: bool

class bob.ip.binseg.data.transforms.RandomRotation(p=0.5, **kwargs)[source]

Bases: torchvision.transforms.transforms.RandomRotation

Randomly rotates all input images by the same amount

Unlike the current torchvision implementation, we also accept a probability for applying the rotation.

Parameters
  • p (float, Optional) – probability at which the operation is applied

  • **kwargs (dict) –

passed to the parent class. If not set, we use the following defaults for the underlying torchvision transform:

    • degrees: 15

    • interpolation: torchvision.transforms.functional.InterpolationMode.BILINEAR
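
A usage sketch (parameter values are illustrative):

    import PIL.Image
    from bob.ip.binseg.data.transforms import RandomRotation

    image = PIL.Image.new("RGB", (64, 64))  # stand-in inputs
    mask = PIL.Image.new("1", (64, 64))

    # With probability p=0.3, rotate all inputs by the same random angle
    # drawn from [-10, +10] degrees; otherwise return them unchanged.
    rotate = RandomRotation(p=0.3, degrees=10)
    image_r, mask_r = rotate(image, mask)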

training: bool

class bob.ip.binseg.data.transforms.ColorJitter(p=0.5, **kwargs)[source]

Bases: torchvision.transforms.transforms.ColorJitter

Randomly applies a color jitter transformation to the first image

Unlike the other transforms in this module, this extension affects only the first image passed as input. It also differs from the current torchvision implementation in that it accepts a probability for applying the jitter.

Parameters
  • p (float, Optional) – probability at which the operation is applied

  • **kwargs (dict) –

passed to the parent class. If not set, we use the following defaults for the underlying torchvision transform:

    • brightness: 0.3

    • contrast: 0.3

    • saturation: 0.02

    • hue: 0.02
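
A usage sketch (parameter values are illustrative):

    import PIL.Image
    from bob.ip.binseg.data.transforms import ColorJitter

    image = PIL.Image.new("RGB", (64, 64))  # stand-in inputs
    mask = PIL.Image.new("1", (64, 64))

    # With probability p=0.8, jitter brightness and contrast of the image;
    # the mask (and any further inputs) pass through untouched.
    jitter = ColorJitter(p=0.8, brightness=0.2, contrast=0.2)
    image_j, mask_j = jitter(image, mask)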

training: bool

class bob.ip.binseg.data.transforms.ResizeCrop[source]

Bases: object

Crops black borders from all input images by trimming edge rows and columns until a non-black pixel is found.
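
One way to compute such a bounding box, sketched with NumPy (an illustration of the idea, not necessarily this class's exact implementation):

    import numpy
    import PIL.Image

    def black_border_bbox(img):
        """Returns (left, upper, right, lower) around all non-black pixels."""
        arr = numpy.asarray(img.convert("L"))
        rows = numpy.where(arr.any(axis=1))[0]  # rows with a non-black pixel
        cols = numpy.where(arr.any(axis=0))[0]  # columns with a non-black pixel
        # Assumes at least one non-black pixel exists in the image.
        return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1

    # Cropping image and mask with the same box keeps them aligned.
    image = PIL.Image.open("fundus.png")  # hypothetical input
    cropped = image.crop(black_border_bbox(image))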