POOPy = Pollution discharge monitoring with Object Oriented Python
This is a Python package for interfacing with live data from Event Duration Monitoring (EDM) devices maintained by English and Welsh Water Companies. This package was primarily developed to provide the back-end for SewageMap.co.uk but may be generically useful for those exploring the impact of sewage discharges on rivers. Currently, POOPy supports data from all of the major water and sewerage companies:
Water Company | WaterCompany Object Name |
---|---|
Thames Water | ThamesWater |
Welsh Water/Dŵr Cymru | WelshWater |
Southern Water | SouthernWater |
Anglian Water | AnglianWater |
United Utilities | UnitedUtilities |
Severn Trent | SevernTrent |
Yorkshire Water | YorkshireWater |
Northumbrian Water | NorthumbrianWater |
South West Water | SouthWestWater |
Wessex Water | WessexWater |
Different water companies share their live EDM data via APIs with different formats. This makes it hard to access national data simultaneously and, ultimately, to understand the potential impact of discharges on the environment. POOPy solves this problem by encapsulating relevant information about EDM monitors maintained by different water companies into a standardised interface. This interface (represented by the WaterCompany and Monitor classes) makes it easy to, for instance, quickly identify monitors that are discharging, have discharged in the last 48 hours or are offline. POOPy combines this information with key metadata about the monitor, such as its location and the watercourse it discharges into. Additionally, POOPy provides a basic approach to exploring the 'impact' of discharges on the environment, using a simple hydrological model to identify river sections downstream of sewage discharges in real time. POOPy could easily be extended to consider more complicated ways of exploring the 'impact' of sewage spills (e.g., dynamic river flow).
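To illustrate the kind of query a standardised interface makes simple, here is a minimal, self-contained sketch that groups monitors into "discharging", "discharged in the last 48 hours" and "offline". It does not use POOPy itself: the MonitorSnapshot class, its fields, the status strings and the summarise function are all hypothetical names invented for this example, not part of the POOPy API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class MonitorSnapshot:
    """Hypothetical, simplified record of one monitor (not POOPy's Monitor class)."""

    name: str
    status: str  # assumed values: "discharging", "not discharging" or "offline"
    last_discharge_end: Optional[datetime] = None


def summarise(snapshots, now, window=timedelta(hours=48)):
    """Group monitor names by current state, using a 48 hour 'recent' window by default."""
    summary = {"discharging": [], "recently_discharged": [], "offline": []}
    for snap in snapshots:
        if snap.status == "discharging":
            summary["discharging"].append(snap.name)
        elif snap.status == "offline":
            summary["offline"].append(snap.name)
        elif (
            snap.last_discharge_end is not None
            and now - snap.last_discharge_end <= window
        ):
            # Not discharging now, but the last discharge ended within the window
            summary["recently_discharged"].append(snap.name)
    return summary


# Made-up example data
now = datetime(2024, 1, 10, 12, 0)
snapshots = [
    MonitorSnapshot("Site A", "discharging"),
    MonitorSnapshot("Site B", "not discharging", datetime(2024, 1, 9, 18, 0)),
    MonitorSnapshot("Site C", "not discharging", datetime(2024, 1, 1, 0, 0)),
    MonitorSnapshot("Site D", "offline"),
]
summary = summarise(snapshots, now)
print(summary)
```

Because every company's data would be reduced to the same record shape, the same function works regardless of which API the data originally came from.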
Where historical information on CSO discharges is available (currently only provided as an API by Thames Water), POOPy processes this information, making it easy to query the spill history of a particular monitor, for instance to calculate the total hours of sewage discharge from a given monitor over a given timeframe. Experimentally, POOPy can also 'build' histories of sewage spills from repeated queries to the current status of a monitor, even where (as for most water companies) this information is not made readily accessible.
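As a rough illustration of the "total hours of discharge over a timeframe" calculation, the sketch below sums discharge events clipped to a time window. It is a standalone example rather than POOPy's own implementation: the function name total_discharge_hours and the representation of events as (start, end) datetime pairs are assumptions made for this example, and the events are assumed not to overlap one another.

```python
from datetime import datetime, timedelta


def total_discharge_hours(events, since, until):
    """Return the hours of discharge that fall between `since` and `until`.

    `events` is an iterable of (start, end) datetime pairs, assumed non-overlapping.
    """
    total = timedelta(0)
    for start, end in events:
        # Clip each event to the requested window
        clipped_start = max(start, since)
        clipped_end = min(end, until)
        if clipped_end > clipped_start:
            total += clipped_end - clipped_start
    return total.total_seconds() / 3600.0


# Made-up example events
events = [
    # 5 hours long, but only 3 hours fall inside the window below
    (datetime(2024, 1, 1, 22, 0), datetime(2024, 1, 2, 3, 0)),
    # 1.5 hours, entirely inside the window
    (datetime(2024, 1, 5, 6, 0), datetime(2024, 1, 5, 7, 30)),
]
hours = total_discharge_hours(events, datetime(2024, 1, 2), datetime(2024, 1, 8))
print(hours)  # 3 + 1.5 = 4.5
```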
Install this package by running the following commands (replacing [LOCAL DIRECTORY] with the directory you wish to install the package into):
git clone https://github.com/AlexLipp/POOPy.git [LOCAL DIRECTORY]
cd [LOCAL DIRECTORY]
pip install .
The package requires standard scientific Python packages (e.g. numpy, pandas, matplotlib) as well as the following packages:
- GDAL - Required to manipulate geospatial datasets.
- pytest - For running the test suite [optional, see Testing].
To access the data for the following water companies, you will need to obtain API keys from the relevant water company by registering with their developer portal:
From these portals you will obtain client_id and client_secret keys, which are required to access the datasets. POOPy will look for these keys in the environment variables of your system. Specifically, the following variables must be set:
Key | Environment Variable |
---|---|
Thames Water client ID | TW_CLIENT_ID |
Thames Water 'secret' ID | TW_CLIENT_SECRET |
How to set these environment variables will depend on your operating system. For example, on a Unix-based system, you could add the following lines to your .bashrc or .bash_profile file:
export TW_CLIENT_ID="your_client_id"
export TW_CLIENT_SECRET="your_client_secret"
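Before using the package, it can be useful to confirm that Python can actually see these variables (for example, after opening a new terminal). The following is a small standalone check using only the standard library. The helper name missing_credentials is hypothetical and is not part of POOPy.

```python
import os

# Environment variable names taken from the table above
REQUIRED_VARS = ("TW_CLIENT_ID", "TW_CLIENT_SECRET")


def missing_credentials(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Thames Water credentials found.")
```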
A test script is provided in the tests folder. To run the tests, you will need to install the pytest package. Once it is installed, the tests can be run from the command line by navigating to the folder in which the package is installed and running the command:
pytest --disable-warnings
This will run the tests and provide a summary of the results. If all tests pass, the package has been installed correctly and is behaving as expected. Note that the --disable-warnings flag is used to suppress the many warnings that POOPy generates. These are mostly informative rather than disastrous (e.g., indicating when an input data-stream is ambiguous), but can be overwhelming en masse.
Once installed, the package can be imported into Python scripts using standard import commands, such as:
import poopy
or
from poopy.companies import ThamesWater
Examples of how to use the package (using the ThamesWater class as an example) are given in the examples folder in the form of interactive Jupyter notebooks. Note that whilst ThamesWater is used as an example, the same operations apply to all of the water companies supported by POOPy (with the exception of the historical data operations, which are currently only supported for Thames Water):
- Investigating the current status of sewer overflow spilling
- Investigating the historical status of sewer overflow spilling
POOPy can be used to easily make figures like...
... this one showing the stretches of the Thames downstream of active sewage discharges at the shown time...
from poopy.companies import ThamesWater
tw = ThamesWater()
tw.plot_current_status()
...or this one which shows the discharge history of a specific monitor...
from poopy.companies import ThamesWater
tw = ThamesWater()
monitor = tw.active_monitors["Bourton-On-The_water"]
monitor.plot_history()
...or this one which shows the number of live monitors deployed by Thames Water through time and whether they were discharging...
If you use these scripts, or the data, please reference their source. For instance:
"Data generated using the POOPy software (github.com/AlexLipp/POOPy)"
Whilst every effort has been made to ensure accuracy, this is experimental software and errors may occur; I assume no responsibility or liability for any such errors. If you see any issues, please contact me or raise an 'Issue' on the GitHub repository. This code is licensed under the GNU General Public License 3.0.