Library Documentation
Video walkthrough
A 50-minute video walkthrough for PriMIA can be found here
Documentation for the individual components
This section describes what the individual components of PriMIA do.
-
configs
This folder contains the configuration files.torch
contains theconfig.ini
files which are used for configuring the training process.websetting
contains files for usage with WebSocket/HTTP training over the network. Documentation for these files can be found here -
data
This folder contains the dataset used in our publication and case study. After runningmake <symbolic/minimal>_server folders
, aserver_simulation
folder appears here which holds the data for the VirtualWorker nodes. Thetrain
andtest
folders hold the original dataset. The two.csv
files contain metadata and labels. -
docs
This folder contains the documentation source and is not needed by the end user. -
figure_scripts
This folder contains scripts for re-creating the figures for our publication. It is not needed by the end user. -
model_weights
This folder does not always exist. It will be generated after a training run and stores model weights and a.csv
file with the training metrics. -
Node
This folder contains the code for setting up a PyGrid node. Documentation for how to use this can be found here -
site
This folder contains the HTML and CSS for the documentation. It should not be changed by the end user -
syft
This folder contains the PySyft source code. It should not be altered unless you really know what you are doing. -
torchlib
This folder contains all the utility modules (the engine room) for PriMIA.dataloader.py
contains theCombinedLoader
class which is the main dataloader for PriMIA and loads image and medical image files. It also contains various utility functions required for PriMIA's operation and ways to load both the supplied paediatric pneumonia dataset and the MNIST dataset which can be used for experimentation.dicomtools.py
contains the logic for loading DICOM files. Documentation for the module can be found in the module's source.find_config.py
contains the logic to run hyperparameter optimisation with Optuna. Its documentation can be found heremodels.py
contains the models available in PriMIA, a simple CNN (calledsimpleconv
), a VGG-style CNN and the ResNet18 used in our publicationrun_websocket_server.py
is a utility script to quickly run a set of WebSocket/HTTP servers to train models. It is not required for operation since servers can be run directly from theNode
module.utils.py
contains the entire logic for federated learning. If you are interested in the internals of PriMIA, this is where to look.
-
DATASET_LICENSE
This is the license for the dataset used, which is different from the source code license. Please read the license and use PriMIA and the data only in accordance to its terms. -
doc_requirements.txt
This is the requirements file to build the documentation. It is normally not needed by the end user. -
environment_torch.yml
This is the preferred way to install dependencies. See here for more information on setting up dependencies and an environment to run PriMIA. -
inference.py
This is the entrypoint to perform (encrypted inference) with PriMIA and implements the core logic. If you are interested in the internals of encrypted inference, this is the file you are looking for. -
LICENSE
This is the license for the source code, which is different from the dataset license. Please read the license and use PriMIA and the data only in accordance to its terms. -
Makefile
This file contains a large amount of pre-made recipes which canmake
(pun intended) your life much easier. See here for how to use these recipes. -
mkdocs.yml
This is a configuration file for the documentation. It is normally not needed by the end user -
README.md
This contains the GitHub readme. It is not needed by the end user. -
requirements.txt
This is the other (and not preferred) way to install dependencies. See here for more information on setting up dependencies and an environment to run PriMIA. -
test.py
This is the entrypoint to test a trained algorithm on data which is already set up in named folders identical to the ones in our dataset. It is a convenience script to obtain model metrics and does not represent the main use-case for PriMIA. -
train.py
This is the entrypoint to federated or local model training and contains all the logic of the training loop. If you are interested in how training in PriMIA is carried out, this is the script to look at.