This repo contains the source code and pipeline configuration needed to automate retraining of the test rig forecaster. The code uses:

- Cloud Functions to trigger retraining on the Object finalized event (see triggers),
- Cloud Build to automate testing and building the new Docker image and publishing it to the Google Container Registry,
- Vertex AI Pipelines with the Kubeflow Pipelines SDK to orchestrate execution of the pipeline steps,
- Vertex AI Experiments to track parameters and metrics during training,
- a TensorBoard callback within the training step to record the time series of losses and metrics over the course of training,
- Vertex AI Model Registry to store the champion models.
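As a rough orientation, the snippet below sketches what the Object finalized trigger can look like: a 2nd-gen, event-driven Cloud Function that submits the compiled Vertex AI pipeline whenever a new raw data file lands in the bucket. The project, location, bucket and template names are placeholders, not the repo's actual values; see triggers for the real implementation.

```python
# Hedged sketch only: an event-driven Cloud Function that launches the retraining
# pipeline on a GCS "object finalized" event. All names/paths are placeholders.
import functions_framework
from google.cloud import aiplatform


@functions_framework.cloud_event
def trigger_retraining(cloud_event):
    """Submits the compiled Vertex AI pipeline for the uploaded raw data file."""
    data = cloud_event.data
    raw_data_uri = f"gs://{data['bucket']}/{data['name']}"

    aiplatform.init(project="my-project", location="europe-west1")

    job = aiplatform.PipelineJob(
        display_name="test-rig-forecaster-retraining",
        template_path="gs://my-bucket/pipelines/forecaster_pipeline.json",  # compiled KFP spec
        pipeline_root="gs://my-bucket/pipeline-root",
        parameter_values={"raw_data_path": raw_data_uri},
    )
    job.submit()  # non-blocking, so the function returns quickly
```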
Step | Description | Inputs | Outputs | Parameters |
---|---|---|---|---|
read-raw-data | Reads the raw data files from the GCS raw data bucket and uploads the combined data frame to the interim data directory in the GCS bucket | | interim_data, all_features | raw_data_path, features_path, interim_data_path |
importer | Imports the interim features | | interim_features | artifact_uri |
build-features | Reads the interim data, builds the features (downcasts floats, removes NaNs and the step-zero data, adds the power and time features), and saves the processed data | interim_features, interim_data | processed_data, processed_features | features_path, processed_data_path |
split-data | Splits the processed data into train and test sets | processed_data | train_data, test_data | train_data_size |
import-forecast-features | Imports the forecast features | | forecast_features | features_path |
train | Instantiates and trains the RNN model on the train dataset, saves the trained scaler and the Keras model to the metadata store, and logs the training metrics and the TensorBoard event file (see the tracking sketch below the table) | train_data | scaler_model, keras_model, train_metrics, parameters | project_id, region, feature, lookback, lstm_units, learning_rate, epochs, batch_size, patience, timestamp, train_data_size, pipelines_path |
evaluate | Evaluates the trained Keras model and saves the evaluation metrics to the metadata store | test_data, scaler_model, keras_model | eval_metrics | project_id, region, feature, lookback, batch_size, timestamp |
import-champion-metrics | Imports the champion metrics | | champion_metrics | features_path |
compare-models | Compares the evaluation metrics of the trained (challenger) model against those of the champion model (the one in the model registry); see the component sketch below the table | eval_metrics, champion_metrics | evaluation_metric, absolute_difference | |
upload-model-to-registry | Uploads the scaler and Keras models to the model registry, together with the model's parameters and metrics | parameters, scaler_model, keras_model, eval_metrics | | feature, project_id, region, deploy_image, models_path |
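To illustrate the experiment tracking used by the train step, here is a minimal, self-contained sketch of the pattern: parameters and metrics are logged to Vertex AI Experiments, and the training curves are recorded through a Keras TensorBoard callback. The experiment name, log directory, model architecture and toy data are assumptions for illustration, not the repo's actual training code.

```python
# Minimal sketch of the train step's tracking pattern; the model, data and names
# are placeholders, not the actual forecaster.
import numpy as np
import tensorflow as tf
from google.cloud import aiplatform

LOOKBACK, N_FEATURES = 24, 3

# Toy stand-in data; the real step consumes the train_data artifact.
x = np.random.rand(256, LOOKBACK, N_FEATURES).astype("float32")
y = np.random.rand(256, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(LOOKBACK, N_FEATURES)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mae")

aiplatform.init(project="my-project", location="europe-west1",
                experiment="test-rig-forecaster")

with aiplatform.start_run(run="train-20240101-000000") as run:
    run.log_params({"lookback": LOOKBACK, "lstm_units": 64, "learning_rate": 1e-3})
    history = model.fit(
        x, y, epochs=3, batch_size=32, validation_split=0.2,
        callbacks=[
            # Writes the TensorBoard event file consumed by Vertex AI TensorBoard.
            tf.keras.callbacks.TensorBoard(log_dir="gs://my-bucket/tb-logs"),
            tf.keras.callbacks.EarlyStopping(patience=2, restore_best_weights=True),
        ],
    )
    run.log_metrics({"val_loss": float(history.history["val_loss"][-1])})
```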
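Likewise, a hedged sketch of how a step such as compare-models can be expressed as a KFP v2 lightweight component, mirroring the Inputs/Outputs listed in the table; the metric key (`mae`) and the exact comparison logic are assumptions.

```python
# Illustrative KFP v2 component mirroring the compare-models row; the metric key
# and comparison logic are assumptions, not the repo's exact implementation.
from typing import NamedTuple

from kfp import dsl
from kfp.dsl import Input, Metrics


@dsl.component(base_image="python:3.10")
def compare_models(
    eval_metrics: Input[Metrics],
    champion_metrics: Input[Metrics],
) -> NamedTuple("Outputs", evaluation_metric=float, absolute_difference=float):
    """Returns the challenger's metric and its absolute gap to the champion."""
    from typing import NamedTuple  # imports must live inside lightweight components

    Outputs = NamedTuple("Outputs", evaluation_metric=float, absolute_difference=float)
    challenger = float(eval_metrics.metadata["mae"])   # assumed metric key
    champion = float(champion_metrics.metadata["mae"])
    return Outputs(challenger, abs(challenger - champion))
```

Downstream, the upload-model-to-registry step can then be gated on these outputs (for example with `dsl.Condition`) so that only a challenger that beats the champion is pushed to the Vertex AI Model Registry.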