This repository contains source code for the NeedForHeat dynamic heat balance and mass balance models and analysis tools, based on GEKKO Python. This analysis software can be regarded as a particular form of physics-informed machine learning for automated estimation of crucial parameters of buildings, installations and comfort needs in individual homes and utility buildings, based on time-series monitoring data.
This repository contains the GEKKO Python-based implementation of physics-informed machine learning (f.k.a. inverse grey-box analysis). The purpose of this software is to (help) speed up the energy transition, in particular the heating transition.
This software was developed in the context of multiple projects:
- Twomes, our first research project aimed at building digital twins that used inverse grey-box modelling to learn physical building parameters;
- Brains4Buildings, a research project in which we explored the relation between occupancy, ventilation rates and CO2 concentration;
- REDUCEDHEATCARB, a research project in which we extend the models from these earlier projects with models for ventilation heat loss, wind-dependent infiltration heat loss and a model that separates heat generation (in a boiler and/or heat pump) from heat distribution (e.g. via hydronic radiators).
This repository also contains synthetic home data and synthetic room data that we generated to verify the proper implementation of the GEKKO models.
Field data, including metadata descriptions, can be found in these related repositories:
We recommend installing and using JupyterLab in a Docker container on a server.
As an alternative, you can use a JupyterLab environment on your local machine. Other environments, such as PyCharm and Visual Studio Code, may work as well, but we do not include documentation for this here.
To use JupyterLab on your local machine, make sure you have the following software properly installed and configured on your machine:
If you haven't already installed it, go to Python and install at least version 3.8, which includes pip, the package manager you need in the steps below.
If you haven't already installed JupyterLab, you can install JupyterLab with the following pip command in your terminal:
pip install jupyterlab
To add support for git from within the JupyterLab environment, issue the following command in your terminal:
pip install jupyterlab-git
Once you've installed JupyterLab, you can launch JupyterLab by running the following command in your terminal:
Note: In the step after this, we explain how to clone this repository from within JupyterLab. If you are working locally and are familiar with using git, you can manually clone this repository and open a terminal in the cloned directory before running the command below. You can then skip step 3.
jupyter-lab
This will open JupyterLab in your default web browser.
In JupyterLab, navigate to the folder where you would like to clone this repository, select the git-icon in the left pane, select Clone a Repository and paste the URI for this repository, which is available via the green <> Code button on the GitHub page of this repository.
After cloning this repository, install the requirements: open a terminal in JupyterLab (available via the JupyterLab Launcher, via the + tab), navigate to the root folder of your clone of this repository and enter this command:
pip install -r requirements.txt
This will install all the required dependencies listed in the requirements.txt file.
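After installing, a quick check from a notebook cell or a Python prompt can confirm that the environment is usable. The sketch below is illustrative only: the helper names are hypothetical, and the module names gekko and pandas are assumptions about what requirements.txt installs, so adjust them to the actual contents of that file.

```python
import importlib.util
import sys


def python_is_recent_enough(minimum=(3, 8)):
    """Return True when the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed module names; replace them with the packages from requirements.txt.
print("Python >= 3.8:", python_is_recent_enough())
print("Missing modules:", missing_modules(["gekko", "pandas"]))
```

An empty list of missing modules suggests that the pip install step completed for the packages you checked.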
This section describes how you can use the IPython notebooks without changing the Python code. After installing JupyterLab as described above, you can run the software by opening the .ipynb files in the examples folder and running their contents. We've created example files based on our work in multiple projects:
- <Project>ExtractionBackup.ipynb files contain code you can run to extract measurement data from a NeedForHeat DataGear server and save it as parquet files. These .ipynb files only work when you run the code in a JupyterLab environment that has access to the MariaDB database on a NeedForHeat DataGear server.
- <Project>_to_CSV.ipynb files contain code you can run to convert a parquet file containing DataFrames to multiple zipped csv files: a single file containing all measurements and one zipped csv file per id. Parquet files load faster and are smaller than zipped csv files. Nevertheless, for backward compatibility with data analytics tools that are not yet able to process parquet files, we used the code in these .ipynb files to create the contents for the open data repositories. You can find more information about the formatting of DataFrames with measurements and DataFrames with properties, as well as the open data itself, in the repositories twomes-dataset-assendorp2021 and brains4buildings-dataset-windesheim2022.
- <Project>_analysis_virtual_ds.ipynb files contain code you can run to verify whether a mathematical model is properly implemented in the GEKKO code of the functions learn_home_parameters() or learn_room_parameters() from inversegreyboxmodel.py. To perform the validation, we created 'virtual data', i.e. time series data for a virtual home or virtual room that behaves exactly according to the mathematical model and has neither measurement errors nor measurement hiccups. This 'virtual data' was generated using an Excel implementation of the same mathematical model. You can find the virtual data in the /data/<project>_virtual_ds/ folders.
- <Project>_PlotTest.ipynb files contain example code that demonstrates the various ways you can plot (parts of) a DataFrame containing properties or preprocessed data, using the functions dataframe_properties_plot() and dataframe_preprocessed_plot(), respectively, from plotter.py.
- <Project>_analysis_real_ds.ipynb files contain code that uses the functions learn_home_parameters() or learn_room_parameters() from inversegreyboxmodel.py to perform grey-box analysis on datasets with real measurements. Currently, we set up these analysis functions to perform various steps:
  - Read the parquet files produced by <Project>ExtractionBackup.ipynb, which contain a properties DataFrame.
  - Preprocess the data to make it suitable for analysis, which involves both outlier removal and time-based interpolation and which ultimately results in a preprocessed DataFrame.
  - Perform the analysis over consecutive learning periods, e.g. 7 days or 3 days, resulting in both a DataFrame with learned variables and error metrics per id per learning period and a DataFrame with the resulting optimal time series for the property used as the fitting objective and the values of any learned time-dependent properties.
  - Visualize the analysis results in graphs.
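The preprocessing step in the list above (outlier removal followed by time-based interpolation) can be sketched with pandas as follows. This is a minimal sketch, not the repository's implementation: the function name preprocess_measurements, the fixed lower and upper bounds and the 15-minute interval are assumptions made for illustration.

```python
import pandas as pd


def preprocess_measurements(series, lower, upper, interval="15min"):
    """Drop out-of-range values, then interpolate onto a regular time grid.

    `series` is assumed to be a numeric pandas Series with a DatetimeIndex.
    """
    series = series.sort_index()
    # Static outlier removal: keep only values within [lower, upper].
    cleaned = series.where(series.between(lower, upper)).dropna()
    # Regular grid that lies within the span of the remaining measurements.
    grid = pd.date_range(cleaned.index.min().ceil(interval),
                         cleaned.index.max().floor(interval),
                         freq=interval)
    # Time-weighted linear interpolation between the irregular measurements.
    combined = cleaned.reindex(cleaned.index.union(grid))
    return combined.interpolate(method="time").reindex(grid)


# Hypothetical indoor temperature readings with one implausible value.
timestamps = pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:10",
                             "2024-01-01 00:20", "2024-01-01 00:30"])
raw = pd.Series([20.0, 21.0, 500.0, 23.0], index=timestamps)
preprocessed = preprocess_measurements(raw, lower=-40.0, upper=60.0)
print(preprocessed)
```

Interpolating with method="time" weights neighbouring measurements by their distance in time, which matters when measurements are not equidistant.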
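The consecutive learning periods mentioned above (e.g. 7 days or 3 days) can be illustrated by splitting a preprocessed DataFrame into fixed-duration chunks. The sketch below assumes a DataFrame with a sorted DatetimeIndex; the function name split_into_learning_periods and the column name temp_in__degC are hypothetical, and the repository's analysis functions may organise learning periods differently.

```python
import numpy as np
import pandas as pd


def split_into_learning_periods(df, period="7D"):
    """Split a time-indexed DataFrame into consecutive fixed-duration chunks.

    Periods are counted from the first timestamp in `df`.
    """
    elapsed = df.index - df.index[0]
    period_number = (elapsed // pd.Timedelta(period)).to_numpy()
    return [chunk for _, chunk in df.groupby(period_number)]


# Two weeks of hypothetical hourly data, split into two 7-day periods.
index = pd.date_range("2024-01-01", periods=14 * 24, freq="h")
data = pd.DataFrame({"temp_in__degC": np.linspace(18.0, 21.0, len(index))},
                    index=index)
periods = split_into_learning_periods(data, period="7D")
for chunk in periods:
    print(chunk.index[0], "to", chunk.index[-1], "rows:", len(chunk))
```

Each chunk can then be analysed on its own, which yields one set of learned parameters per learning period.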
This section describes how you can change the source code. You can do this using JupyterLab, as described in the section Deploying. Other development environments, such as PyCharm and Visual Studio Code, may work as well, but we do not include documentation for this here.
Should you find any issues or bugs in our code, please report them via the issues tab of this repository.
To change the code, we recommend:
- Try out your changes using the various .ipynb files from the examples folder. The section Deploying contains a high-level description of these files.
- Migrate stable code to functions in Python files.
- Should you have extensions or bug fixes that could be useful for other users of the repository as well, please fork this repository and make a Pull Request on this repository.
Current features include:
- data extraction;
- data preprocessing: measurement outlier removal and interpolation;
- learn_home_parameters() function in inversegreyboxmodel.py that uses a GEKKO model and code to learn building model parameters such as specific heat loss [W/K], thermal inertia [h], thermal mass [Wh/K] and apparent solar aperture [m2] of a building;
- learn_room_parameters() function in inversegreyboxmodel.py that uses a GEKKO model and code to learn:
  - apparent infiltration area [m2] and ventilation rates [m3/h] from CO2 concentration [ppm] and occupancy [p] time series data;
  - apparent infiltration area [m2] and occupancy [p] from CO2 concentration [ppm] and ventilation rates [m3/h] time series data.
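To give a feel for how the building parameters listed above relate to each other, the sketch below generates noise-free 'virtual' data from a simplified first-order heat balance and recovers the parameters from that data. Several things here are assumptions for illustration: the model C·dT_in/dt = P_heat + A_sol·I_sol − H·(T_in − T_out) is a simplification and may differ from the repository's GEKKO model, the function and variable names are hypothetical, and ordinary least squares with NumPy is used in place of GEKKO. In this model, thermal inertia [h] equals thermal mass [Wh/K] divided by specific heat loss [W/K].

```python
import numpy as np


def simulate_indoor_temperature(H, C, A_sol, T_out, P_heat, I_sol, T0, dt_h):
    """Forward Euler simulation of C*dT/dt = P_heat + A_sol*I_sol - H*(T_in - T_out)."""
    T_in = np.empty(len(T_out) + 1)
    T_in[0] = T0
    for k in range(len(T_out)):
        net_gain_W = P_heat[k] + A_sol * I_sol[k] - H * (T_in[k] - T_out[k])
        T_in[k + 1] = T_in[k] + dt_h * net_gain_W / C
    return T_in


def fit_heat_balance(T_in, T_out, P_heat, I_sol, dt_h):
    """Least-squares fit of dT/dt = a*P_heat + b*I_sol + c*(T_out - T_in),
    where a = 1/C, b = A_sol/C and c = H/C."""
    dT_dt = np.diff(T_in) / dt_h
    X = np.column_stack([P_heat, I_sol, T_out - T_in[:-1]])
    coefficients, *_ = np.linalg.lstsq(X, dT_dt, rcond=None)
    a, b, c = coefficients
    return {"H_W_per_K": c / a, "C_Wh_per_K": 1.0 / a,
            "A_sol_m2": b / a, "tau_h": 1.0 / c}


# One week of hypothetical virtual data at 15-minute resolution.
rng = np.random.default_rng(0)
dt_h = 0.25
t_h = np.arange(7 * 96) * dt_h
T_out = 5.0 + 3.0 * np.sin(2 * np.pi * t_h / 24)                         # [degC]
I_sol = np.clip(400.0 * np.sin(2 * np.pi * (t_h - 6) / 24), 0.0, None)   # [W/m2]
P_heat = rng.choice([0.0, 4000.0], size=len(t_h))                        # [W]

T_in = simulate_indoor_temperature(H=200.0, C=10000.0, A_sol=5.0, T_out=T_out,
                                   P_heat=P_heat, I_sol=I_sol, T0=18.0, dt_h=dt_h)
learned = fit_heat_balance(T_in, T_out, P_heat, I_sol, dt_h)
print(learned)
```

Because the virtual data follows the model exactly, the fit should reproduce the parameters used in the simulation; with real measurements, noise and unmodelled effects would make the learned values approximate.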
To-do:
- update code in the learn_home_parameters() function to align with the newer code and preprocessing tools used in the learn_room_parameters() function;
- extend the GEKKO model in learn_home_parameters() with installation model details to learn installation parameters;
- add 'dynamic' measurement outlier removal for measurement time series before interpolation, i.e. a rolling window outlier removal procedure, similar to a Hampel filter but working on non-equidistant time-series data and using a duration as a time window;
- combine the models in learn_home_parameters() and learn_room_parameters() and apply them to a suitable dataset to figure out whether adding CO2 concentration and occupancy [p] time series data helps to learn ventilation losses and other heat losses separately;
- add time series data about wind to the model and figure out whether (wind-induced) infiltration losses, ventilation losses and other heat losses of a building can be learned separately.
Project is: in progress
This software is available under the Apache 2.0 license, Copyright 2021 Research group Energy Transition, Windesheim University of Applied Sciences
This software is a collaborative effort of:
- Henri ter Hofte · @henriterhofte · Twitter @HeNRGi
- Hossein Rahmani · @HosseinRahmani64
It is partially based on earlier work by the following students:
- Casper Bloemendaal · @Bloemendaal
- Briyan Kleijn · @BriyanKleijn
- Nathan Snippe · @nsrid
- Jeroen Matser · @Spudra
- Steven de Ronde · @SteviosSDR
- Joery Grolleman · @joerygrolleman
Product owner:
- Henri ter Hofte · @henriterhofte · Twitter @HeNRGi
We use and gratefully acknowledge the efforts of the makers of the following source code and libraries:
- GEKKO Python, by Advanced Process Solutions, LLC., licensed under an MIT-style licence
- Twomes Analysis Pipeline, v1, by Research group Energy Transition, Windesheim University of Applied Sciences, licensed under Apache-2.0 License
- HourlyHistoricWeather, by @stephanpcpeters, licensed under an MIT-style licence