This repo contains the source code and pipeline configuration needed to automate retraining of the test rig forecaster. The code uses:

- Cloud Functions to trigger retraining on the Object finalized event (see triggers),
- Cloud Build to automate testing and building the new Docker image and publishing it to the Google Container Registry,
- Vertex AI Pipelines with the Kubeflow Pipelines SDK to orchestrate execution of the pipeline steps,
- Vertex AI Experiments to track parameters and metrics during training,
- a TensorBoard callback within the training step to record the time series of losses and metrics over the course of training,
- Vertex AI Model Registry to store the champion models.
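As a rough orientation, the snippet below sketches what the Object finalized trigger can look like: a 2nd-gen, event-driven Cloud Function that submits the compiled Vertex AI pipeline whenever a new raw data file lands in the bucket. The project, location, bucket and template names are placeholders, not the repo's actual values; see triggers for the real implementation.

```python
# Hedged sketch only: an event-driven Cloud Function that launches the retraining
# pipeline on a GCS "object finalized" event. All names/paths are placeholders.
import functions_framework
from google.cloud import aiplatform


@functions_framework.cloud_event
def trigger_retraining(cloud_event):
    """Submits the compiled Vertex AI pipeline for the uploaded raw data file."""
    data = cloud_event.data
    raw_data_uri = f"gs://{data['bucket']}/{data['name']}"

    aiplatform.init(project="my-project", location="europe-west1")

    job = aiplatform.PipelineJob(
        display_name="test-rig-forecaster-retraining",
        template_path="gs://my-bucket/pipelines/forecaster_pipeline.json",  # compiled KFP spec
        pipeline_root="gs://my-bucket/pipeline-root",
        parameter_values={"raw_data_path": raw_data_uri},
    )
    job.submit()  # non-blocking, so the function returns quickly
```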
Step | Description | Inputs | Outputs | Parameters |
---|---|---|---|---|
read-raw-data | Reads the raw data files from the GCS raw data bucket and uploads the combined data frame to the interim data directory in the GCS bucket | | interim_data, all_features | raw_data_path, features_path, interim_data_path |
importer | Imports the interim features | | interim_features | artifact_uri |
build-features | Reads the interim data, builds the features (downcasts floats, removes NaNs and the step-zero data, adds the power and time features), and saves the processed data | interim_features, interim_data | processed_data, processed_features | features_path, processed_data_path |
split-data | Splits the processed data into train and test sets | processed_data | train_data, test_data | train_data_size |
import-forecast-features | Imports the forecast features | | forecast_features | features_path |
train | Instantiates and trains the RNN model on the train dataset, saves the trained scaler and the Keras model to the metadata store, and logs the training metrics and the TensorBoard event file (see the tracking sketch below the table) | train_data | scaler_model, keras_model, train_metrics, parameters | project_id, region, feature, lookback, lstm_units, learning_rate, epochs, batch_size, patience, timestamp, train_data_size, pipelines_path |
evaluate | Evaluates the trained Keras model and saves the evaluation metrics to the metadata store | test_data, scaler_model, keras_model | eval_metrics | project_id, region, feature, lookback, batch_size, timestamp |
import-champion-metrics | Imports the champion metrics | | champion_metrics | features_path |
compare-models | Compares the evaluation metrics of the trained (challenger) model against those of the champion model (the one in the model registry); see the component sketch below the table | eval_metrics, champion_metrics | evaluation_metric, absolute_difference | |
upload-model-to-registry | Uploads the scaler and Keras models to the model registry, together with the model's parameters and metrics | parameters, scaler_model, keras_model, eval_metrics | | feature, project_id, region, deploy_image, models_path |
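To illustrate the experiment tracking used by the train step, here is a minimal, self-contained sketch of the pattern: parameters and metrics are logged to Vertex AI Experiments, and the training curves are recorded through a Keras TensorBoard callback. The experiment name, log directory, model architecture and toy data are assumptions for illustration, not the repo's actual training code.

```python
# Minimal sketch of the train step's tracking pattern; the model, data and names
# are placeholders, not the actual forecaster.
import numpy as np
import tensorflow as tf
from google.cloud import aiplatform

LOOKBACK, N_FEATURES = 24, 3

# Toy stand-in data; the real step consumes the train_data artifact.
x = np.random.rand(256, LOOKBACK, N_FEATURES).astype("float32")
y = np.random.rand(256, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(LOOKBACK, N_FEATURES)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mae")

aiplatform.init(project="my-project", location="europe-west1",
                experiment="test-rig-forecaster")

with aiplatform.start_run(run="train-20240101-000000") as run:
    run.log_params({"lookback": LOOKBACK, "lstm_units": 64, "learning_rate": 1e-3})
    history = model.fit(
        x, y, epochs=3, batch_size=32, validation_split=0.2,
        callbacks=[
            # Writes the TensorBoard event file consumed by Vertex AI TensorBoard.
            tf.keras.callbacks.TensorBoard(log_dir="gs://my-bucket/tb-logs"),
            tf.keras.callbacks.EarlyStopping(patience=2, restore_best_weights=True),
        ],
    )
    run.log_metrics({"val_loss": float(history.history["val_loss"][-1])})
```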
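Likewise, a hedged sketch of how a step such as compare-models can be expressed as a KFP v2 lightweight component, mirroring the Inputs/Outputs listed in the table; the metric key (`mae`) and the exact comparison logic are assumptions.

```python
# Illustrative KFP v2 component mirroring the compare-models row; the metric key
# and comparison logic are assumptions, not the repo's exact implementation.
from typing import NamedTuple

from kfp import dsl
from kfp.dsl import Input, Metrics


@dsl.component(base_image="python:3.10")
def compare_models(
    eval_metrics: Input[Metrics],
    champion_metrics: Input[Metrics],
) -> NamedTuple("Outputs", evaluation_metric=float, absolute_difference=float):
    """Returns the challenger's metric and its absolute gap to the champion."""
    from typing import NamedTuple  # imports must live inside lightweight components

    Outputs = NamedTuple("Outputs", evaluation_metric=float, absolute_difference=float)
    challenger = float(eval_metrics.metadata["mae"])   # assumed metric key
    champion = float(champion_metrics.metadata["mae"])
    return Outputs(challenger, abs(challenger - champion))
```

Downstream, the upload-model-to-registry step can then be gated on these outputs (for example with `dsl.Condition`) so that only a challenger that beats the champion is pushed to the Vertex AI Model Registry.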