mednet.config.data.nih_cxr14.datamodule

NIH CXR14 (relabeled) DataModule for computer-aided diagnosis.

Database reference: [NIH-CXR14-2017]

Module Attributes

CONFIGURATION_KEY_DATADIR

Key to search for in the configuration file for the root directory of this database.

CONFIGURATION_KEY_IDIAP_FILESTRUCTURE

Key to search for in the configuration file indicating whether the loader should use the standard or the Idiap-based file organisation structure.

Functions

make_split(basename)

Return a database split for the NIH CXR-14 database.

Classes

DataModule(split_filename)

NIH CXR14 (relabeled) DataModule for computer-aided diagnosis.

RawDataLoader()

A specialized raw-data-loader for the NIH CXR-14 dataset.

mednet.config.data.nih_cxr14.datamodule.CONFIGURATION_KEY_DATADIR = 'datadir.nih_cxr14'

Key to search for in the configuration file for the root directory of this database.

mednet.config.data.nih_cxr14.datamodule.CONFIGURATION_KEY_IDIAP_FILESTRUCTURE = 'nih_cxr14.idiap_folder_structure'

Key to search for in the configuration file indicating whether the loader should use the standard or the Idiap-based file organisation structure.

When set, it causes the internal loader to search for files in a slightly different folder structure, adapted to Idiap’s requirements (fewer than 10k files per folder).
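As an illustration of how dotted keys such as these could address entries in a nested configuration, the sketch below resolves them against a plain dictionary. The `lookup` helper, the example path and the dictionary layout are assumptions made for this example only; the library's actual configuration loader is not shown here and may behave differently.

```python
from collections.abc import Mapping
from typing import Any

CONFIGURATION_KEY_DATADIR = "datadir.nih_cxr14"
CONFIGURATION_KEY_IDIAP_FILESTRUCTURE = "nih_cxr14.idiap_folder_structure"


def lookup(config: Mapping[str, Any], dotted_key: str, default: Any = None) -> Any:
    """Resolve a dotted key in a nested mapping (hypothetical helper)."""
    node: Any = config
    for part in dotted_key.split("."):
        # Stop as soon as one level of the key is missing.
        if not isinstance(node, Mapping) or part not in node:
            return default
        node = node[part]
    return node


# Hypothetical configuration contents, e.g. as parsed from a configuration file.
config = {
    "datadir": {"nih_cxr14": "/path/to/nih-cxr14"},
    "nih_cxr14": {"idiap_folder_structure": True},
}

datadir = lookup(config, CONFIGURATION_KEY_DATADIR)
use_idiap_layout = lookup(config, CONFIGURATION_KEY_IDIAP_FILESTRUCTURE, False)
```

Defaulting the second lookup to `False` mirrors the idea that the standard file organisation applies unless the user opts in.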

class mednet.config.data.nih_cxr14.datamodule.RawDataLoader

Bases: RawDataLoader

A specialized raw-data-loader for the NIH CXR-14 dataset.

datadir: Path

This variable contains the base directory where the database raw data is stored.

idiap_file_organisation: bool

Whether to use Idiap’s filesystem organisation when looking up data.

This variable is True if the user has set the configuration parameter nih_cxr14.idiap_folder_structure (the value of CONFIGURATION_KEY_IDIAP_FILESTRUCTURE) in the global configuration file. It causes the internal loader to search for files in a slightly different folder structure, adapted to Idiap’s requirements (fewer than 10k files per folder).

sample(sample)

Load a single image sample from the disk.

Parameters:

sample (tuple[str, list[int]]) – A tuple containing the path suffix, relative to the dataset root folder, of the image to be loaded, and a list of integers representing the sample labels.

Return type:

tuple[Tensor, Mapping[str, Any]]

Returns:

The sample representation.

label(sample)

Load the labels of a single image sample.

Parameters:

sample (tuple[str, list[int]]) – A tuple containing the path suffix, relative to the dataset root folder, of the image to be loaded, and a list of integers representing the sample labels.

Returns:

The integer labels associated with the sample.

Return type:

list[int]
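To make the shape of the `sample` argument concrete, the following sketch unpacks one such tuple and builds the image path for the standard file organisation. It is not the library's implementation: the dataset root, the file name and the simple path join are assumptions, and the Idiap-specific folder layout is deliberately left out.

```python
from pathlib import Path

# Hypothetical dataset root and sample; real values come from the
# configuration file and from the split being loaded.
datadir = Path("/path/to/nih-cxr14")
sample = ("images/00000001_000.png", [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

# A sample is a (path suffix, labels) pair.
path_suffix, labels = sample

# Assumption: with the standard organisation, the image is located by
# joining the dataset root with the path suffix.
image_path = datadir / path_suffix

print(image_path)
print(labels)
```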

mednet.config.data.nih_cxr14.datamodule.make_split(basename)

Return a database split for the NIH CXR-14 database.

Parameters:

basename (str) – Name of the .json file containing the split to load.

Return type:

Mapping[str, Sequence[Any]]

Returns:

An instance of DatabaseSplit.
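The return type above suggests a mapping from subset names to sequences of samples. As a rough sketch of what such a split might look like once its .json file is parsed, the example below loads a small, made-up document with the standard `json` module. The subset names (`train`, `validation`, `test`) and the file names are assumptions for illustration and may not match the split files shipped with the library.

```python
import json

# Hypothetical split contents; the subset names and file names are
# assumptions made for this example.
split_json = """
{
  "train": [["images/a.png", [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]],
  "validation": [],
  "test": [["images/b.png", [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]]]
}
"""

# Each subset maps to a sequence of [path suffix, labels] samples.
split = json.loads(split_json)

for subset, samples in split.items():
    print(f"{subset}: {len(samples)} sample(s)")
```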

class mednet.config.data.nih_cxr14.datamodule.DataModule(split_filename)

Bases: CachingDataModule

NIH CXR14 (relabeled) DataModule for computer-aided diagnosis.

This dataset was extracted from the clinical PACS database at the National Institutes of Health Clinical Center (USA) and represents 60% of all their radiographs. It contains labels for 14 common radiological signs in this order: cardiomegaly, emphysema, effusion, hernia, infiltration, mass, nodule, atelectasis, pneumothorax, pleural thickening, pneumonia, fibrosis, edema and consolidation. This is the relabeled version created in the CheXNeXt study.

  • Reference: [NIH-CXR14-2017]

  • Raw data input (on disk):

    • PNG RGB 8-bit depth images

    • Resolution: 1024 x 1024 pixels

  • Labels: [CHEXNEXT-2018]

  • Split reference: [CHEXNEXT-2018]

  • Output image:

    • Transforms:

      • Load raw PNG with PIL

      • Convert to torch tensor

    • Final specifications:

      • RGB, encoded as a 3-plane tensor, 32-bit floats, square (1024x1024 px)

      • Labels in order:

        • cardiomegaly

        • emphysema

        • effusion

        • hernia

        • infiltration

        • mass

        • nodule

        • atelectasis

        • pneumothorax

        • pleural thickening

        • pneumonia

        • fibrosis

        • edema

        • consolidation
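Because the label order is fixed, a label list can be mapped back to sign names by position. The sketch below assumes each label is a binary flag (1 meaning the sign is present), which the documentation above does not state explicitly; `LABEL_NAMES` and `decode_labels` are hypothetical names introduced for this example.

```python
from typing import Sequence

# Sign names in the order documented for this DataModule.
LABEL_NAMES = (
    "cardiomegaly",
    "emphysema",
    "effusion",
    "hernia",
    "infiltration",
    "mass",
    "nodule",
    "atelectasis",
    "pneumothorax",
    "pleural thickening",
    "pneumonia",
    "fibrosis",
    "edema",
    "consolidation",
)


def decode_labels(labels: Sequence[int]) -> list[str]:
    """Return the names of the signs flagged in ``labels`` (hypothetical helper)."""
    if len(labels) != len(LABEL_NAMES):
        raise ValueError(f"expected {len(LABEL_NAMES)} labels, got {len(labels)}")
    # Assumption: a non-zero entry marks the corresponding sign as present.
    return [name for name, flag in zip(LABEL_NAMES, labels) if flag]


example = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(decode_labels(example))
```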

Parameters:

split_filename (str) – Name of the .json file containing the split to load.