Skip to content

Latest commit

 

History

History
360 lines (267 loc) · 12.6 KB

README.md

File metadata and controls

360 lines (267 loc) · 12.6 KB

ecoinvent_interface

PyPI Status Python Version

This is an unofficial and unsupported Python library to get ecoinvent data.

Warning

😯 Versions older than 3.0 of ecoinvent-interface will stop working as of February 2025. Please update as soon as possible ❗

Quickstart

from ecoinvent_interface import Settings, EcoinventRelease, ReleaseType
my_settings = Settings(username="John.Doe", password="example")
release = EcoinventRelease(my_settings)
release.list_versions()
>>> ['3.9.1', '3.9', '3.8', '3.7.1', ...]
release.list_system_models('3.7.1')
>>> ['cutoff', 'consequential', 'apos']
release.get_release(version='3.7.1', system_model='apos', release_type=ReleaseType.ecospold)
>>> PosixPath('/Users/JohnDoe/Library/Application Support/'
              'EcoinventRelease/cache/ecoinvent 3.7.1_apos_ecoSpold02')

The ecospold files are downloaded and extracted automatically.

Usage

Authentication via Settings object

Authentication is done via the Settings object. Accessing ecoinvent requires supplying a username and password.

Note that you must accept the ecoinvent license and personal identifying information agreement on the website before using your user account via this library.

You can provide credentials in three ways:

  • Manually, via arguments to the Settings object instantiation:
from ecoinvent_interface import Settings
my_settings = Settings(username="bob", password="example")
  • Via the EI_PASSWORD and EI_USERNAME environment variables
export EI_USERNAME=bob
export EI_PASSWORD=example

If your environment variable values have special characters, using single quotes should work, e.g. export EI_PASSWORD='compl\!cat$d'.

Followed by:

from ecoinvent_interface import Settings
# Environment variables read automatically
my_settings = Settings()
from ecoinvent_interface import Settings, permanent_setting
permanent_setting("username", "bob")
permanent_setting("password", "example")
# Secrets files read automatically
my_settings = Settings()

Secrets files are stored in ecoinvent_interface.storage.secrets_dir.

For each value, manually set values always take precedence over environment variables, which in turn take precendence over secrets files.

A reasonable guide for choosing between the three is to use secrets on your private, local machine, and to use environment variables on servers or containers.

EcoinventRelease interface

To interact with the ecoinvent website, instantiate EcoinventRelease.

from ecoinvent_interface import EcoinventRelease, Settings, ReleaseType
my_settings = Settings()
ei = EcoinventRelease(my_settings)

Database releases

To get a database release, we need to make three selections. First, the version:

ei.list_versions()
>>> ['3.9.1', '3.9', '3.8', '3.7.1', ...]

Second, the system model:

ei.list_system_models('3.7.1')
>>> ['cutoff', 'consequential', 'apos']

The ecoinvent API uses a short and long form of the system model names; you can get the longer names by passing translate=False. You can use either form in all EcoinventRelease methods.

ei.list_system_models('3.7.1', translate=False)
>>> [
  'Allocation cut-off by classification',
  'Substitution, consequential, long-term',
  'Allocation at the Point of Substitution'
]

Finally, the type of release. These are stored in an Enum. There are six release types; if you just want the database to do calculations choose the ecospold type.

  • ReleaseType.ecospold: The single-output unit process files in ecospold2 XML format
  • ReleaseType.matrix: The so-called "universal matrix export"
  • ReleaseType.lci: LCI data in ecospold2 XML format
  • ReleaseType.lcia: LCIA data in ecospold2 XML format
  • ReleaseType.cumulative_lci: LCI data in Excel
  • ReleaseType.cumulative_lcia: LCIA data in Excel

See the ecoinvent website for more information on what these values mean.

Once we have made a selection for all three choices, we can get the release files. They are saved to a cache directory and extracted by default.

ei.get_release(version='3.7.1', system_model='apos', release_type=ReleaseType.matrix)
>>> PosixPath('/Users/JohnDoe/Library/Application Support/'
              'EcoinventRelease/cache/universal_matrix_export_3.7.1_apos')

The default cache uses platformdirs, and the directory location is OS-dependent. You can use a custom cache directory with by specifying output_dir when creating the Settings class instance.

You can work with the cache when offline:

cs = CachedStorage()
list(cs.catalogue)
>>> ['ecoinvent 3.7.1_LCIA_implementation.7z']
cs.catalogue['ecoinvent 3.7.1_LCIA_implementation.7z']
>>> {
  'path': '/Users/<your username>/Library/Application Support/'
          'EcoinventRelease/cache/ecoinvent 3.7.1_LCIA_implementation',
  'extracted': True,
  'created': '2023-09-03T20:23:57.186519'
}

EcoinventRelease extra files

There are two other kinds of files available: reports, and what we call extra files. Let's see the extra files for version '3.7.1':

ei.list_extra_files('3.7.1')
>>>  {'ecoinvent 3.7.1_LCIA_implementation.7z': {
    'uuid': ...,
    'size': ...,
    'modified': datetime.datetime(2023, 4, 25, 0, 0)
  },
  ...
}

This returns a dictionary of filenames and metadata. We can download the ecoinvent 3.7.1_LCIA_implementation.7z file; by default it will automatically be extracted.

ei.get_extra(version='3.7.1', filename='ecoinvent 3.7.1_LCIA_implementation.7z')
>>> PosixPath('/Users/<your username>/Library/Application Support'
              '/EcoinventRelease/cache/ecoinvent 3.7.1_LCIA_implementation')

EcoinventRelease reports

Reports require a login but not a version number:

ei.list_report_files()
>>> {
  'Allocation, cut-off, EN15804_documentation.pdf': {
    'uuid': ...,
    'size': ...,
    'modified': datetime.datetime(2021, 10, 1, 0, 0),
    'description': ('This document provides a documentation on the calculation '
                    'of the indicators in the “Allocation, cut-off, EN15804” '
                    'system model.')
  }
}

Downloading follows the same pattern as before:

ei.get_report('Allocation, cut-off, EN15804_documentation.pdf')
>>> PosixPath('/Users/<your username>/Library/Application Support/EcoinventRelease/cache/Allocation, cut-off, EN15804_documentation.pdf')

Zip and 7z files are extracted by default.

EcoinventProcess interface

This class gets data and reports for specific processes. It first needs to know what release version and system model to work with:

from ecoinvent_interface import EcoinventProcess, Settings
my_settings = Settings()
ep = EcoinventProcess(my_settings)
ep.set_release(version="3.7.1", system_model="apos")

Finding a dataset id

The ecoinvent API uses integer indices, and these values aren't found in the release values. We have cached these indices for versions 3.5, 3.6, 3.7.1, 3.8, 3.9.1, and 3.10. If you already know the integer index, you can use that:

ep.select_process(dataset_id="1")

You can also use the filename, if you know it:

F = "b0eb27dd-b87f-4ae9-9f69-57d811443a30_66c93e71-f32b-4591-901c-55395db5c132.spold"
ep.select_process(filename=F)
ep.dataset_id
>>> "1"

Finally, you can pass in a set of attributes. You should use the name, reference product, and/or location to uniquely identify a process. You don't need to give all attributes, but will get an error if the attributes aren't specific enough.

attributes is a dictionary, and can take the following keys:

  • name or activity_name
  • reference product or reference_product
  • location or geography

The system will adapt the names as needed to find a match.

ep.select_process(
    attributes={
        "name": "rye seed production, Swiss integrated production, for sowing",
        "location": "CH",
        "reference product": "rye seed, Swiss integrated production, for sowing",
    }
)
ep.dataset_id
>>> "40"

Dataset id values are the same across all system models and versions; however, not every system model or version will have all datasets.

Basic process information

Once you have selected the process, you can get basic information about that process:

ep.get_basic_info()
>>> {
  'index': 1,
  'version': '3.7.1',
  'system_model': 'apos',
  'activity_name': 'electricity production, nuclear, boiling water reactor',
  'geography': 'FI',
  'reference_product': 'electricity, high voltage',
  'has_access': True
}

You can also call ep.get_documentation() to get a representation of the ecospold2 XML file in Python.

Process documents

You can use ep.get_file with one of the following file types to download process files:

  • ProcessFileType.upr: Unit Process ecospold XML
  • ProcessFileType.lci: Life Cycle Inventory ecospold XML
  • ProcessFileType.lcia: Life Cycle Impact Assessment ecospold XML
  • ProcessFileType.pdf: PDF Dataset Report
  • ProcessFileType.undefined: Undefined (unlinked and multi-output) Dataset PDF Report

For example, running the following would download the life cycle impact assessment ecospold XML file to the current working directory. The get_file method requires specifying the directory.

from ecoinvent_interface import ProcessFileType
from pathlib import Path
ep.get_file(file_type=ProcessFileType.lcia, directory=Path.cwd())

Here is a complete example for downloading a PDF dataset report:

from ecoinvent_interface import EcoinventProcess, Settings, ProcessFileType
from pathlib import Path
my_settings = Settings()
ep = EcoinventProcess(my_settings)
ep.set_release(version="3.7.1", system_model="apos")
ep.select_process(
    attributes={
        "name": "rye seed production, Swiss integrated production, for sowing",
        "location": "CH",
        "reference product": "rye seed, Swiss integrated production, for sowing",
    }
)
ep.get_file(file_type=ProcessFileType.pdf, directory=Path.cwd())

Relationship to EIDL

This library initially started as a fork of EIDL, the ecoinvent downloader. As of version 2.0, it has been completely rewritten. Currently only the authentication code comes from EIDL.

Differences with EIDL:

  • Designed to be a lower-level infrastructure library. All user and web browser interaction was removed.
  • Username and password can be specified using pydantic_settings.
  • Can download all release files, plus reports and "extra" files.
  • Will autocorrect filenames when possible for ecoinvent inconsistencies.
  • Can download data on inventory processes.
  • Can find inventory processes using their filename or attributes.
  • Uses a more robust caching and cache validation strategy.
  • More reasonable token refresh strategy.
  • No HTML parsing or filename string hacks.
  • Streaming downloads.
  • Descriptive logging and error messages.
  • No shortcuts for Brightway or other LCA software.
  • Custom library headers are set to allow users of this library to be identified. No user information is transmitted.
  • Comprehensive tests.

Contributing

Contributions are very welcome, but please note the following:

  • This library consumes and unpublished an under development API
  • Extensions of the current API to get process LCI or LCIA data or LCIA scores won't be included
  • Brightway-specific code won't be included

To learn more, see the Contributor Guide.

License

Distributed under the terms of the MIT license, ecoinvent_interface is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.