mednet.config.data.hivtb.datamodule

HIV-TB dataset for computer-aided diagnosis (only BMP files).

Database reference: [HIV-TB-2019]

Module Attributes

CONFIGURATION_KEY_DATADIR

Key to search for in the configuration file for the root directory of this database.

Functions

make_split(basename)

Return a database split for the HIV-TB database.

Classes

DataModule(split_filename)

HIV-TB dataset for computer-aided diagnosis (only BMP files).

RawDataLoader()

A specialized raw-data-loader for the HIV-TB dataset.

mednet.config.data.hivtb.datamodule.CONFIGURATION_KEY_DATADIR = 'datadir.hivtb'

Key to search for in the configuration file for the root directory of this database.
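As a rough illustration of how a dotted key such as `datadir.hivtb` can address an entry in a parsed configuration, the sketch below walks a nested mapping one key segment at a time. The nested layout, the placeholder path, and the `lookup_dotted_key` helper are assumptions made for this example; they are not part of the mednet API.

```python
# Value documented for this module attribute.
CONFIGURATION_KEY_DATADIR = "datadir.hivtb"

# Hypothetical parsed configuration; the path is a placeholder.
example_config = {"datadir": {"hivtb": "/path/to/hivtb"}}


def lookup_dotted_key(config, key):
    """Resolve a dotted key such as "datadir.hivtb" in nested mappings."""
    value = config
    for part in key.split("."):
        value = value[part]  # raises KeyError if a segment is missing
    return value


datadir = lookup_dotted_key(example_config, CONFIGURATION_KEY_DATADIR)
print(datadir)
```

A missing segment raises `KeyError`, which makes an unset database directory easy to spot in this sketch.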

class mednet.config.data.hivtb.datamodule.RawDataLoader

Bases: RawDataLoader

A specialized raw-data-loader for the HIV-TB dataset.

datadir: Path

This variable contains the base directory where the database raw data is stored.

sample(sample)

Load a single image sample from the disk.

Parameters:

sample (tuple[str, int]) – A tuple containing the path of the image to load, relative to the dataset root folder, and an integer representing the sample label.

Return type:

tuple[Tensor, Mapping[str, Any]]

Returns:

The sample representation: a tuple with the image as a tensor and a mapping of associated metadata.

label(sample)

Load a single image sample label from the disk.

Parameters:

sample (tuple[str, int]) – A tuple containing the path of the image to load, relative to the dataset root folder, and an integer representing the sample label.

Returns:

The integer label associated with the sample.

Return type:

int
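To make the shape of the `sample` argument concrete, here is a minimal sketch of how a `(path suffix, label)` tuple could be combined with `datadir`. `SketchLoader`, its `image_path` method, and the example paths are hypothetical stand-ins, not the actual `RawDataLoader` implementation, and no image is read from disk.

```python
from pathlib import Path


class SketchLoader:
    """Hypothetical stand-in for a raw-data loader (not the mednet class)."""

    def __init__(self, datadir):
        # Base directory where the raw data is assumed to be stored.
        self.datadir = Path(datadir)

    def image_path(self, sample):
        """Join the dataset root with the path suffix in the sample."""
        return self.datadir / sample[0]

    def label(self, sample):
        """Return the integer label carried by the sample tuple."""
        return sample[1]


loader = SketchLoader("/path/to/hivtb")   # placeholder root directory
sample = ("images/example.bmp", 1)        # hypothetical (suffix, label) pair
full_path = loader.image_path(sample)
label = loader.label(sample)
print(full_path, label)
```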

mednet.config.data.hivtb.datamodule.make_split(basename)

Return a database split for the HIV-TB database.

Parameters:

basename (str) – Name of the .json file containing the split to load.

Return type:

Mapping[str, Sequence[Any]]

Returns:

An instance of DatabaseSplit.
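Given the documented return type `Mapping[str, Sequence[Any]]` and the `tuple[str, int]` samples accepted by the loader, a split can be pictured as a JSON object mapping subset names to lists of `[path, label]` pairs. The sketch below parses such a document with the standard `json` module. The subset names and file paths are invented for illustration and may not match the split files shipped with the library.

```python
import json

# Hypothetical split contents; real split files may use other subset names.
EXAMPLE_SPLIT_JSON = """
{
  "train": [["images/a.bmp", 0], ["images/b.bmp", 1]],
  "validation": [["images/c.bmp", 0]],
  "test": [["images/d.bmp", 1]]
}
"""

split = json.loads(EXAMPLE_SPLIT_JSON)

# Number of samples in each subset.
counts = {name: len(samples) for name, samples in split.items()}

# Number of samples labelled 1 (active tuberculosis) in each subset.
positives = {
    name: sum(label for _, label in samples)
    for name, samples in split.items()
}

print(counts)
print(positives)
```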

class mednet.config.data.hivtb.datamodule.DataModule(split_filename)

Bases: CachingDataModule

HIV-TB dataset for computer-aided diagnosis (only BMP files).

  • Database reference: [HIV-TB-2019]

  • Original resolution: varies; most images are 2048 x 2500 or 2500 x 2048 pixels.

Data specifications:

  • Raw data input (on disk):

    • BMP (BMP3) and JPEG grayscale images encoded as 8-bit RGB, with varying resolution

  • Output image:

    • Transforms:

      • Load raw BMP or JPEG with PIL

      • Remove black borders

      • Convert to torch tensor

      • Torch center cropping to get square image

  • Final specifications

    • Grayscale, encoded as a single plane tensor, 32-bit floats, square at 2048 x 2048 pixels

    • Labels: 0 (healthy), 1 (active tuberculosis)

Parameters:

split_filename (str) – Name of the .json file containing the split to load.
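The border-removal and center-cropping steps listed under "Transforms" can be sketched without any dependencies on a small grayscale image stored as nested lists. This is only an illustration of the idea: the documented pipeline loads images with PIL and crops torch tensors, and the threshold rule used here for detecting black borders is an assumption. Resizing to the final resolution and conversion to 32-bit floats are not shown.

```python
def remove_black_borders(image, threshold=0):
    """Drop outer rows and columns whose pixels are all <= threshold."""
    rows = [i for i, row in enumerate(image) if any(p > threshold for p in row)]
    cols = [
        j for j in range(len(image[0]))
        if any(row[j] > threshold for row in image)
    ]
    if not rows or not cols:
        return image  # fully black image: leave it unchanged
    top, bottom = rows[0], rows[-1] + 1
    left, right = cols[0], cols[-1] + 1
    return [row[left:right] for row in image[top:bottom]]


def center_crop_square(image):
    """Crop the largest centered square from a 2D image."""
    height, width = len(image), len(image[0])
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return [row[left:left + side] for row in image[top:top + side]]


# Tiny 4 x 6 example with a one-pixel black border.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 10, 20, 30, 40, 0],
    [0, 50, 60, 70, 80, 0],
    [0, 0, 0, 0, 0, 0],
]

trimmed = remove_black_borders(image)  # 2 x 4 after removing borders
square = center_crop_square(trimmed)   # 2 x 2 centered crop
print(square)
```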