Dataset Description
The dataset in this repository is re-used under the license terms of the CoronaHack Chest X-Ray Dataset on Kaggle and was modified as indicated in Dataset_Description.md.
Original citation for the majority of the images: https://data.mendeley.com/datasets/rscbjbr9sj/2
This work is licensed under a Creative Commons Attribution 4.0 International License.
The following modifications were made to create a dataset of frontal chest X-Rays of children and adolescents:
- All adult images (identified by ossification status and/or degenerative changes to the spine) were removed.
- All non-frontal (i.e. not PA or AP) images were removed.
- All CT images were removed.
- All non-grayscale images were removed.
- All images considered to be of non-diagnostic quality were removed.
The final training set consists of 5163 images and the test set of 624 images.
The original metadata file was modified to represent the new dataset. Labels were encoded as follows:
| Label | Description |
|---|---|
| 0 | Normal |
| 1 | Bacterial Pneumonia |
| 2 | Viral Pneumonia |
The original metadata file is data/Chest_xray_Corona_Metadata.csv and the new one is data/Labels.csv.
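As a rough illustration of the label encoding, the following Python sketch maps the integer codes back to their descriptions and counts the images per class. The column name `Label` and the helper names `decode_label` and `count_labels` are assumptions made for this example and are not taken from the repository; check the header of data/Labels.csv before using it. The sample rows are made up.

```python
import csv
import io

# Label encoding as listed in the table of this description.
LABEL_NAMES = {0: "Normal", 1: "Bacterial Pneumonia", 2: "Viral Pneumonia"}


def decode_label(code):
    """Return the description for an integer (or numeric string) label code."""
    return LABEL_NAMES[int(code)]


def count_labels(csv_file, label_column="Label"):
    """Count the rows per class in an open CSV file.

    `label_column` is an assumed column name, not confirmed by the repository.
    """
    counts = {name: 0 for name in LABEL_NAMES.values()}
    for row in csv.DictReader(csv_file):
        counts[decode_label(row[label_column])] += 1
    return counts


# Tiny in-memory stand-in for data/Labels.csv (made-up rows, not real data).
sample = io.StringIO("Image,Label\na.jpeg,0\nb.jpeg,2\nc.jpeg,2\n")
print(count_labels(sample))
```

For the real file, the same function could be called as `count_labels(open("data/Labels.csv", newline=""))`, provided the label column name matches.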