Training
Overview
In total, we support RandomForest, XGBoost, and LightGBM.
DetectorTraining is the main entry point for fitting any of these models.
After initialisation, it supports the following data sets:
- all: Includes all available data sets
- cic: Train on the CICBellDNS2021 data set
- dgta: Train on the DGTA Benchmarking data set
- dgarchive: Train on the DGArchive data set
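The data set keys above could be resolved to concrete data sets roughly as follows. This is only a minimal sketch: `DATASETS` and `resolve_datasets` are hypothetical names introduced for illustration and are not part of the documented API.

```python
# Hypothetical mapping from a data set key to the data set it refers to.
DATASETS = {
    "cic": "CICBellDNS2021",
    "dgta": "DGTA Benchmarking",
    "dgarchive": "DGArchive",
}


def resolve_datasets(key: str) -> list[str]:
    """Return the data set keys selected by `key`; "all" selects every one."""
    if key == "all":
        return list(DATASETS)
    if key not in DATASETS:
        valid = ", ".join(["all", *DATASETS])
        raise ValueError(f"unknown data set {key!r}; expected one of: {valid}")
    return [key]
```

Rejecting unknown keys early gives a clear error message before any data is loaded.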
Hyperparameter optimisation is done with Optuna.
The search for the best parameters can make use of GPU support.
Training Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | | Data set to train model, choose between all available datasets. |
| | | | Dataset path, follow folder structure. |
| | | | Maximum rows to load from each dataset. |
| | | | Model to train, choose between XGBoost, RandomForest, or GBM. |
| | | | Path to store model. Output is |
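The training parameters described above could be exposed on the command line along these lines. This is a sketch only: every flag name, choice value, and default below is a hypothetical placeholder, since the actual parameter names are not shown in this section.

```python
import argparse


def build_train_parser() -> argparse.ArgumentParser:
    # All flag names, choices, and defaults are illustrative assumptions.
    parser = argparse.ArgumentParser(description="Train a detector model")
    parser.add_argument(
        "--dataset",
        choices=["all", "cic", "dgta", "dgarchive"],
        default="all",
        help="Data set to train the model on",
    )
    parser.add_argument(
        "--dataset-path",
        required=True,
        help="Root folder that contains the data sets",
    )
    parser.add_argument(
        "--max-rows",
        type=int,
        default=None,
        help="Maximum rows to load from each data set (default: no limit)",
    )
    parser.add_argument(
        "--model",
        choices=["xgboost", "rf", "gbm"],
        default="xgboost",
        help="Model to train",
    )
    parser.add_argument(
        "--model-output-path",
        default="./model",
        help="Where to store the trained model",
    )
    return parser


# Parse an explicit argument list so the example runs without a real CLI call.
args = build_train_parser().parse_args(
    ["--dataset", "cic", "--dataset-path", "./data", "--model", "gbm"]
)
```

Restricting `--dataset` and `--model` with `choices` lets the parser reject unsupported values before any data is read.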
Testing Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | | Data set to test model. |
| | | | Dataset path, follow folder structure. |
| | | | Maximum rows to load from each dataset. |
| | | | Model architecture to test. |
| | | | Path to trained model. |
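The maximum-rows parameter caps how much of each data set is loaded. One way such a cap can work is sketched below, assuming CSV input; the `load_rows` helper and the `query` and `label` column names are hypothetical and only serve the illustration.

```python
import csv
import io
from itertools import islice
from typing import Optional, TextIO


def load_rows(handle: TextIO, max_rows: Optional[int] = None) -> list[dict[str, str]]:
    """Read at most `max_rows` data rows from a CSV file; None reads all rows."""
    reader = csv.DictReader(handle)
    # islice stops early, so rows beyond the cap are never parsed.
    return list(islice(reader, max_rows))


# In-memory sample standing in for a data set file (hypothetical columns).
sample = io.StringIO(
    "query,label\n"
    "example.com,benign\n"
    "xkqzvbn.net,malicious\n"
    "mail.example.org,benign\n"
)
rows = load_rows(sample, max_rows=2)
```

Because the cap is applied while reading, large data sets do not have to be loaded in full just to be truncated afterwards.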
Explanation Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | | Data set to explain model predictions. |
| | | | Dataset path, follow folder structure. |
| | | | Maximum rows to load from each dataset. |
| | | | Model architecture to explain. |
| | | | Path to trained model. |