UTIL CLASS_CONFIG
A Class Config CSV allows class names to be combined, renamed, and excluded at model training runtime. This is useful for a user who wishes to combine or exclude certain classes without manually maintaining multiple iterations of a baseline dataset. Class configurations save the user from time-consuming, disk-space-consuming, error-prone copy/rename/delete processes when training with variations on a dataset.
A Class Config CSV is used by specifying the CSV filename followed by one of its configuration names after the --class-config flag of neuston_net.py TRAIN. Example usage is shown below.
Consider the following example Dataset Directory D1 and the Class Config CSV D1_classconfig.csv.
path/to/
└─ D1/
   ├─ amoeba/
   ├─ Diatom_sp1/
   ├─ Diatom_sp2/
   ├─ unknown1/
   └─ bad/
path/to/D1, CONFIG1, CONFIG2
amoeba, Amoeba, 1
Diatom_sp1, 1, diatom
Diatom_sp2, 1, diatom
unknown1, 1, 0
bad, 0, 0
neuston_net.py TRAIN path/to/D1 inception_v3 YourTrainingID --class-config D1_classconfig.csv CONFIG2
Note the following properties common to all Class Config CSVs:
- The first column is a list of all available classes in the baseline dataset
- Class names are case sensitive and duplicate names are invalid
- The very first cell contains the baseline dataset, though it may be left blank without consequence
- Subsequent columns each represent a particular class configuration
- Each configuration column header must be uniquely named. This is the Configuration Name
- There may be any number of configuration columns
- A 0 in a configuration column cell indicates that the corresponding class should be excluded for that configuration
- A 1 in a configuration column cell indicates that the corresponding class should be included for that configuration
- Text in a configuration column cell renames and includes the class, possibly allowing it to be combined with another class available in the baseline dataset
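The naming rules above can be checked mechanically before training. The following is a minimal sketch and not part of neuston_net: validate_class_config is a hypothetical helper, and the check that every row has as many cells as the header is an assumption that the rules above do not state.

```python
import csv
import io


def validate_class_config(csv_text):
    """Return a list of rule violations found in a Class Config CSV (empty if none)."""
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(csv_text))
        if row  # skip blank lines
    ]
    header, body = rows[0], rows[1:]
    problems = []

    # The first header cell is the baseline dataset (may be blank); the rest
    # are Configuration Names and must be unique.
    config_names = header[1:]
    if len(set(config_names)) != len(config_names):
        problems.append("configuration names must be unique")

    # Class names are compared exactly, so "amoeba" and "Amoeba" are distinct.
    class_names = [row[0] for row in body]
    if len(set(class_names)) != len(class_names):
        problems.append("duplicate class names in first column")

    # Assumption (not stated in the rules): every row has one cell per column.
    for row in body:
        if len(row) != len(header):
            problems.append(f"row {row[0]!r} has {len(row)} cells, expected {len(header)}")

    return problems


EXAMPLE = """path/to/D1, CONFIG1, CONFIG2
amoeba, Amoeba, 1
Diatom_sp1, 1, diatom
Diatom_sp2, 1, diatom
unknown1, 1, 0
bad, 0, 0
"""

print(validate_class_config(EXAMPLE))  # prints []
```

Appending a second amoeba row to EXAMPLE would make the function report a duplicate class name.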
The example Class Config CSV above has two configurations:
- CONFIG1
  - the class bad is excluded
  - the class amoeba is renamed to be capitalized for aesthetic purposes
  - A model trained with this configuration would have four output classes: Amoeba, Diatom_sp1, Diatom_sp2, unknown1
- CONFIG2
  - Diatom_sp1 and Diatom_sp2 are combined into a single diatom class
  - bad and unknown1 are excluded
  - A model trained with this configuration would have only two output classes: amoeba and diatom.
In the example command above, CONFIG2 is used, which will result in a model with only two output classes. The --class-config flag accepts exactly two positional arguments: the Class Config CSV and one configuration name from that CSV. See Dataset Params or the following excerpt.
--class-config CSV COL Skip and combine classes as defined by column COL of a CSV configuration file.
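To illustrate how a configuration column might be interpreted, here is a minimal Python sketch. It is not the neuston_net implementation: class_mapping and output_classes are hypothetical helper names, and the CSV text is the D1 example from this page.

```python
import csv
import io

EXAMPLE = """path/to/D1, CONFIG1, CONFIG2
amoeba, Amoeba, 1
Diatom_sp1, 1, diatom
Diatom_sp2, 1, diatom
unknown1, 1, 0
bad, 0, 0
"""


def class_mapping(csv_text, config_name):
    """Map each baseline class to its output class name, or None if excluded."""
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(csv_text))
        if row  # skip blank lines
    ]
    header, body = rows[0], rows[1:]
    # Search from index 1 so the first (baseline dataset) cell is never matched.
    col = header.index(config_name, 1)

    mapping = {}
    for row in body:
        name, cell = row[0], row[col]
        if cell == "0":
            mapping[name] = None   # excluded
        elif cell == "1":
            mapping[name] = name   # included under its own name
        else:
            mapping[name] = cell   # included and renamed (possibly combined)
    return mapping


def output_classes(mapping):
    """Distinct output class names, sorted, with excluded classes dropped."""
    return sorted({target for target in mapping.values() if target is not None})


print(output_classes(class_mapping(EXAMPLE, "CONFIG1")))
# prints ['Amoeba', 'Diatom_sp1', 'Diatom_sp2', 'unknown1']
print(output_classes(class_mapping(EXAMPLE, "CONFIG2")))
# prints ['amoeba', 'diatom']
```

Because Diatom_sp1 and Diatom_sp2 both map to diatom under CONFIG2, they collapse into one output class. An unknown configuration name raises ValueError from header.index.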
A baseline Class Config CSV for a given dataset can be generated using neuston_util.py MAKE_CLASS_CONFIG. The command automatically generates a first column listing all classes available in the dataset, and a second column with a generic all-inclusive configuration of all 1's. It is up to the user to further edit the CSV in order to rename/combine classes, exclude classes, change the column header to a meaningful configuration name, or create additional (uniquely named) configuration columns.
neuston_util.py MAKE_CLASS_CONFIG path/to/D1 -o D1_classconfig.csv
usage: neuston_util.py MAKE_CLASS_CONFIG [-h] [-o OUTFILE] PATH

positional arguments:
  PATH        path to a dataset directory or dataset configuration csv file.

optional arguments:
  -h, --help  show this help message and exit
  -o OUTFILE  Specify an output file. If unset, outputs to stdout.
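The baseline file described above could be produced by something like the following sketch. It is an illustration only, not the actual MAKE_CLASS_CONFIG code: make_baseline_class_config is a hypothetical function, it assumes that each immediate subdirectory of the dataset directory is one class, it does not handle a Dataset Configuration CSV as input, and the CONFIG1 header is a placeholder name.

```python
import csv
import sys
from pathlib import Path


def make_baseline_class_config(dataset_dir, out=sys.stdout, config_name="CONFIG1"):
    """Write a baseline Class Config CSV with one all-inclusive column of 1's."""
    dataset_dir = Path(dataset_dir)
    # Assumption: every immediate subdirectory is a class.
    classes = sorted(p.name for p in dataset_dir.iterdir() if p.is_dir())

    writer = csv.writer(out)
    # Header row: baseline dataset in the first cell, then the configuration name.
    writer.writerow([str(dataset_dir), config_name])
    for name in classes:
        writer.writerow([name, 1])  # 1 = include the class unchanged
```

Calling make_baseline_class_config("path/to/D1") would print the CSV to stdout, mirroring the behavior of an unset -o; passing an open file object as out writes to a file instead.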
A Dataset Configuration CSV and Class Config CSV may be used in conjunction. First, the Dataset Configuration CSV should be generated and edited as desired; it represents a novel Dataset. Then, use the Dataset Configuration CSV (instead of a Dataset Directory) to generate the baseline Class Config CSV.
eg: neuston_util.py MAKE_CLASS_CONFIG path/to/D1D2_config.csv -o D1D2_classconfig.csv
From there, edit the Class Config CSV as usual. The final training command would look as follows:
eg: neuston_net.py TRAIN path/to/D1D2_config.csv inception_v3 SomeTrainingID --class-config D1D2_classconfig.csv CONFIG1