From b8822dd6e25bc778479a165d8c61b46dd8593539 Mon Sep 17 00:00:00 2001
From: Paul Natsuo Kishimoto
Date: Thu, 5 Sep 2024 11:36:30 +0200
Subject: [PATCH 1/4] Add/update doc/repro.rst content from message_data

- Update and improve text to match current practice.
- Add a list and scheme for external model names (#224)
---
 doc/repro.rst | 255 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 249 insertions(+), 6 deletions(-)

diff --git a/doc/repro.rst b/doc/repro.rst
index f86d3a5dca..9caa2ea766 100644
--- a/doc/repro.rst
+++ b/doc/repro.rst
@@ -12,8 +12,142 @@ Elsewhere:
 - A `high-level introduction `_, to how testing supports validity, reproducibility, interoperability, and reusability, in :mod:`message_ix_models` and related packages.
 - :doc:`api/testing` (:mod:`message_ix_models.testing`), on a separate page.
 
-Strategy
-========
+.. _repro-doc:
+
+Documentation
+=============
+
+Documentation serves different purposes for completed vs. ongoing work:
+
+- For *completed* work, the documentation **must** allow a reader to understand what was done, replicate or reproduce results, and/or reuse code and/or data.
+- For *ongoing* work, the documentation **must** allow colleagues to locate and understand the state of current and planned work.
+
+- Every distinct project for which MESSAGEix-GLOBIOM scenarios and outputs are used **must** be included in the documentation.
+  The contents **may** be brief or extensive; read on.
+
+- Docs **must** be placed in one of the following locations:
+
+  - :file:`doc/model/{variant}.rst`, or :file:`doc/model/{variant}/index.rst`, or :file:`doc/{variant}/index.rst` if there will be multiple documentation pages for the model variant.
+  - :file:`doc/project/{name}.rst`, or :file:`doc/project/{name}/index.rst` or :file:`doc/{name}/index.rst` if there will be multiple documentation pages for the project.
+
+  In :mod:`message_data`, some docs have been placed ‘inline’ with the code, for example in:
+
+  - :file:`message_data/model/{variant}/doc.rst`
+  - :file:`message_data/model/{variant}/doc/index.rst`
+  - :file:`message_data/project/{name}/doc.rst`
+  - :file:`message_data/project/{name}/doc/index.rst`
+
+  When code is :doc:`migrated ` from :mod:`message_data`, these files **should** be moved to the :file:`/doc/` directory.
+
+- One file, usually the main or index file, **must** be included in the :code:`.. toctree::` directive in :file:`doc/index.rst`.
+
+- Extensive documentation for a project or model variant **should** be organized with headings, tables of contents, and if necessary split into several files.
+
+Ongoing projects
+----------------
+
+Documentation pages for ongoing projects **must** include a :code:`.. warning::` Sphinx directive at the top of the file indicating the code is under development.
+See e.g. :doc:`/transport/index`.
+This section **should** contain one or all of:
+
+- Link(s) to GitHub, including:
+
+  - A current tracking issue, which in turn can link to:
+
+    - Other issues and PRs where work occurs.
+    - Any of the items below.
+
+  - A project board, if any.
+  - A label for issues/PRs, if any.
+
+- Reference to all other locations where work is occurring, including any:
+
+  - Branch(es)—``main``, ``dev``, or any others—in :mod:`message-ix-models` or :mod:`message_data`.
+  - Fork(s) of these repos.
+  - Other repository/-ies separate from :mod:`message-ix-models` or :mod:`message_data`
+
+  Note that this **does not** imply those should be made public, for instance prior to publication, if there are reasons not to; only that their existence and contents should be mentioned.
+
+This directive **must** be kept current, and removed once work is complete.
+
+Documentation for ongoing projects **should** be added to :mod:`message_ix_models`, even if some of the code or linked resources are in :mod:`message_data` or are otherwise private.
+
+Completed projects
+------------------
+
+Doc pages for completed projects **must** specify:
+
+- location(s) of scenarios, e.g.
+
+  - ixmp URLS giving the platform (‘database’), model name, scenario name, *and* version for any scenarios.
+    These **must** allow a reader to distinguish between ‘main’ or meaningful scenarios and other extras that should not be used.
+  - Specific external databases, Scenario Explorer instances, etc.
+
+- data sources,
+- reference to code used to prepare data,
+- any special parametrization or structure that is different from the RES or a referenced project, and
+- complete instructions to run all scenarios related to the project.
+
+Doc pages for completed projects **should** include a “Summary” section with all relevant items from the following list.
+This allows quick/at-a-glance understanding of the model configuration used for a completed project.
+These can be described *directly*, or by *reference*, for the latter, write “same as ” and add a ReST link to a full description elsewhere.
+
+Example summary section
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Versions
+   See :ref:`versioning` for a complete discussion of the information to be recorded here.
+
+Regions
+   The regional aggregation used in the project.
+   Refer to one of the :doc:`message_ix_models:pkg-data/node`.
+
+Structure
+   The set of technologies, constraints, and other parametrizations.
+
+Demands
+   The projected demand for energy and other commodities.
+
+Trade
+   International trade.
+   Mention any special treatment of electricity trade across regions.
+
+Other items
+   Include these and add explanatory text if the configuration differs from the base global model:
+
+   - Fossil resources
+   - Renewable resources and technologies
+   - Electricity —representation of the electric power sector.
+   - Other conversion technologies
+   - Carbon capture and storage (CCS)
+   - Transport
+   - Buildings
+   - Industry
+   - Land use (GLOBIOM)
+   - Non-CO₂ GHGs
+
+Comments
+   Additional comments or description not fitting into the other fields.
+
+Publications
+   Add entries to :file:`doc/main.bib` and use the ``:cite:`` ReST role.
+
+Other code
+----------
+
+- Docstrings for general-purpose code and functions **should** explain clearly to which data (including scenario(s)) the code is *or* is not applicable.
+  Code **may** also check explicitly and raise informative Python exceptions if the target data/scenario is not supported.
+
+  These allow others to understand when the code:
+
+  - can be (re)used without modification,
+  - can be modified or extended to support new uses, or
+  - can or should not be used.
+
+.. _repro-testing:
+
+Testing
+=======
 
 The code in :mod:`.model.bare` generates a “bare” reference energy system.
 This is a Scenario that has the same *structure* (ixmp 'sets') as actual instances of the MESSAGEix-GLOBIOM global model, but contains no *data* (ixmp 'parameter' values).
@@ -23,7 +157,7 @@ Such tests are faster and lighter than testing on fully-populated scenarios, and
 .. _test-suite:
 
 Test suite (:mod:`message_ix_models.tests`)
-===========================================
+-------------------------------------------
 
 :mod:`message_ix_models.tests` contains a suite of tests written using `Pytest `_.
 
@@ -40,7 +174,7 @@ Each test **should** have a docstring explaining what it checks.
    tests
 
 Running the test suite
-======================
+----------------------
 
 Some notes for running the test suite.
 
@@ -63,7 +197,7 @@ In either case:
 .. _ci:
 
 Continuous testing
-==================
+------------------
 
 The test suite (:mod:`message_ix_models.tests`) is run using GitHub Actions for new commits on the ``main`` branch; new commits on any branch associated with a pull request; and on a daily schedule.
 These ensure that the code is functional and produces expected outputs, even as upstream dependencies evolve.
@@ -74,7 +208,7 @@ Because it is closed-source and requires access to internal IIASA resources, inc
 .. _export-test-data:
 
 Prepare data for testing
-========================
+------------------------
 
 Use the ``export-test-data`` CLI command::
 
@@ -82,3 +216,112 @@ Use the ``export-test-data`` CLI command::
 
 See also the documentation for :func:`export_test_data`.
 Use the :command:`--exclude`, :command:`--nodes`, and :command:`--techs` options to control the content of the resulting file.
+
+.. _versioning:
+
+Versioning and naming
+=====================
+
+The :mod:`message_ix_models` code, as of any commit, can generate many different :mod:`message_ix` scenarios with different structure, parametrization, etc. and solve them in different ways, yielding different results.
+In order to uniquely identify scenarios and enable reproduction of their results, users **must** record:
+
+1. A specific commit of the :mod:`message_ix_models` *code* and (if it is used) the :mod:`message_data` code.
+
+   - These are most easily checked using the command :program:`message-ix show-versions`; copy and store the entire result in a text file.
+   - Specific `releases of message-ix-models `_ always correspond exactly to a particular commit; giving a release version is sufficient to identify a commit. [1]_
+   - Commits **may** be on a branch other than ``main``; however commits on ``main`` receive the most active maintenance and **should** be preferred.
+
+2. Exact *CLI command(s)* or Python function(s) that is/are run to generate the scenario(s), and
+3. Optional *configuration file(s)*.
+
+   These **should** be committed to the repository (1) and **should** be mentioned as command-line arguments (2) as necessary.
+   Input data sources, and versions thereof, **must** be specified in the same way.
+4. The system on which the command(s) (2) were run.
+
+For example:
+
+   “Using ``message_ix_models 2020.6.21.dev0+g7e59382`` (see also [file] with complete output from :program:`message-ix show-versions`), the command :program:`mix-models --url="ixmp://ene-ixmp/CD_LINKS_SSP2/baseline" transport build` was run on ``hpg914.iiasa.ac.at``.”
+
+This specifies (1), (2), and (4); since no configuration file is mentioned, then for (3) it is implied that the default configuration file(s) as of this version of :mod:`message_ix_models` are used.
+
+Any ‘base’ scenario used as a starting point to build other scenarios **must** be specified via one of (3), (2), or (1)—in that order of preference.
+
+.. [1] Unlike :mod:`ixmp` and :mod:`message_ix`, the packages :mod:`message_ix_models` and :mod:`message_data` *do not* use semantic versioning.
+   This is because the notion of “(dis)similarity of different MESSAGEix-GLOBIOM parametrizations” does not map to the notion of “software API compatibility” that is the basis of semantic versioning.
+
+External model names
+--------------------
+
+The :mod:`ixmp` data model uniquely identifies scenarios by the triple of (model name, scenario name, version).
+
+In other contexts, “external” model names are used; for instance, in data submitted to model comparison projects using the IAMC data structure—‘version’ is omitted, or not accepted/reassigned by the receiving system.
+In these cases, the “external” name:
+
+- may be different from the ‘internally’ name used in IIASA ECE :mod:`ixmp` databases.
+- serves to label and identify MESSAGEix-GLOBIOM model data in contexts where it is compared with other scenarios.
+- *does not*, on its own, suffice to identify the materials and steps to reproduce a scenario.
+
+External model names **must** be recorded as corresponding to specific internal (model name, scenario name, version) identifiers.
+This **should** be done by recording scenario URLs.
+
+External model names **should** follow the scheme ``{name} {version}{postfix}``, for example ``MESSAGEix-GLOBIOM 2.0-R17-BT``, wherein the parts are:
+
+name
+   “MESSAGEix-GLOBIOM” by default.
+   In some cases, certain variants which involve extensive changes to model structure may establish alternate names, for instance :mod:`.model.material`, :mod:`.model.transport`, or :mod:`.model.water`.
+
+version
+   is *distinct* from the commit ID, :mod:`message_ix_models` release, etc.
+   This serves to identify distinct generations of the model or variant identified by ``name``. [2]_
+
+   The format of this part follows a loose, reduced form of `semantic versioning `_, such that the first part is incremented for “major” changes and the latter for “minor” changes.
+   There is no established rule, guideline, or heuristic for what kinds of changes are “minor” or “major”.
+   Developers **must**:
+
+   - initiate a discussion with colleagues about when to increment either the major or minor part, and
+   - record (below, or on a variant-specific documentation page) changes associated with an incremented version part.
+
+postfix
+   This **should** be omitted if the model structure does not differ from the structure given below for the corresponding ``{name} {version}``.
+
+   It consists of one or more parts, each prefixed with a hyphen.
+   In order, these **may** include:
+
+   - Node code list, for example ``-R17``, to indicate spatial scope and resolution
+   - Year (period) list, for example ``-A``, to indicate temporal scope and resolution.
+   - Variants included, for example ``-M`` if using :mod:`.model.material`, or ``-MT`` if using :mod:`.model.material` and :mod:`.model.transport`.
+   - Further parts for other structural changes that are not part of a variant with a one-letter code, for example ``-DACCS``.
+
+   Project names or acronyms **should not** be used in the postfix.
+   The postfix conveys information about the model structure, not about the purpose to which it is applied.
+   Project information **may** be added to the scenario name.
+
+.. [2] This means the sequence of ``version`` parts may be different for each ``name``.
+   For instance, ``MESSAGEix-Materials 1.1`` does not necessarily have any correspondence to ``MESSAGEix-GLOBIOM 1.1``, ``MESSAGEix-GLOBIOM 2.0``, etc.
+
+
+.. _model-names:
+
+Some external model names include:
+
+MESSAGEix-GLOBIOM 1.0
+   .. todo:: Expand with a list of cases in which this model name has been used.
+
+MESSAGEix-GLOBIOM 1.1
+   This version is published as a :doc:`model-snapshot`, and uses:
+
+   - R11 node list.
+   - B year (period) list.
+
+   .. todo:: Expand with a list of cases in which this model name has been used.
+
+MESSAGEix-GLOBIOM 2.0
+   Used for the 2023–2024 ScenarioMIP/SSP process.
+   This configuration uses, by default:
+
+   - R12 node list.
+   - B year (period) list.
+   - :mod:`.model.material`.
+
+MESSAGEix-GLOBIOM 2.0-M-R12-NGFS
+   Used for the NGFS project round in 2024.

From 0e02e3ed1ea7dcf6f42f6230eb872cd983f7afdc Mon Sep 17 00:00:00 2001
From: Paul Natsuo Kishimoto
Date: Thu, 5 Sep 2024 12:00:01 +0200
Subject: [PATCH 2/4] Copyedit doc/repro

- Add link to doc/data.
- Unroll 1-item list in "Other code" section.
- Correct link to model-snapshot.
- Link to node and year code lists.
---
 doc/repro.rst | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/doc/repro.rst b/doc/repro.rst
index 9caa2ea766..bc7792fef6 100644
--- a/doc/repro.rst
+++ b/doc/repro.rst
@@ -11,6 +11,7 @@ Elsewhere:
 
 - A `high-level introduction `_, to how testing supports validity, reproducibility, interoperability, and reusability, in :mod:`message_ix_models` and related packages.
 - :doc:`api/testing` (:mod:`message_ix_models.testing`), on a separate page.
+- :doc:`data` for information about reproducible handling of data, both private and public.
 
 .. _repro-doc:
 
@@ -112,7 +113,7 @@ Trade
    International trade.
    Mention any special treatment of electricity trade across regions.
 
-Other items
+[other items]
    Include these and add explanatory text if the configuration differs from the base global model:
 
    - Fossil resources
@@ -135,14 +136,14 @@ Publications
 Other code
 ----------
 
-- Docstrings for general-purpose code and functions **should** explain clearly to which data (including scenario(s)) the code is *or* is not applicable.
-  Code **may** also check explicitly and raise informative Python exceptions if the target data/scenario is not supported.
+Docstrings for general-purpose code and functions **should** explain clearly to which data (including scenario(s)) the code is *or* is not applicable.
+Code **may** also check explicitly and raise informative Python exceptions if the target data/scenario is not supported.
 
-  These allow others to understand when the code:
+These allow others to understand when the code:
 
-  - can be (re)used without modification,
-  - can be modified or extended to support new uses, or
-  - can or should not be used.
+- can be (re)used without modification,
+- can be modified or extended to support new uses, or
+- can or should not be used.
 
 .. _repro-testing:
 
@@ -308,10 +309,10 @@ MESSAGEix-GLOBIOM 1.0
    .. todo:: Expand with a list of cases in which this model name has been used.
 
 MESSAGEix-GLOBIOM 1.1
-   This version is published as a :doc:`model-snapshot`, and uses:
+   This version is published as a :doc:`data snapshot `, and uses:
 
-   - R11 node list.
-   - B year (period) list.
+   - R11 :doc:`node list `.
+   - B :doc:`year (period) list `.
 
    .. todo:: Expand with a list of cases in which this model name has been used.

From 13246d6475544c82460346516eb5d2a54483ccfc Mon Sep 17 00:00:00 2001
From: Paul Natsuo Kishimoto
Date: Thu, 5 Sep 2024 11:54:07 +0200
Subject: [PATCH 3/4] Add #226 to doc/whatsnew

---
 doc/whatsnew.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/whatsnew.rst b/doc/whatsnew.rst
index 65be8eacfa..8f4900b91e 100644
--- a/doc/whatsnew.rst
+++ b/doc/whatsnew.rst
@@ -4,6 +4,7 @@ What's new
 Next release
 ============
 
+- Expand :doc:`repro` with sections on :ref:`repro-doc` and :ref:`versioning`, including :ref:`a list of external model names and ‘versions’ ` like “MESSAGEix-GLOBIOM 2.0” (:issue:`224`, :pull:`226`).
 - Fix and update :doc:`/api/tools-costs` (:pull:`219`)
 
   - Fix naming of GDP and population columns in SSP data aggregation.

From 6cebe684cb86db6c99988ee392e7baf914e85b21 Mon Sep 17 00:00:00 2001
From: Paul Natsuo Kishimoto
Date: Wed, 11 Sep 2024 10:21:26 +0200
Subject: [PATCH 4/4] Copyedit doc/repro.rst per review comments

---
 doc/repro.rst | 60 +++++++++++++++++++++++++++++----------------------
 1 file changed, 34 insertions(+), 26 deletions(-)

diff --git a/doc/repro.rst b/doc/repro.rst
index bc7792fef6..c2f2f328cf 100644
--- a/doc/repro.rst
+++ b/doc/repro.rst
@@ -9,8 +9,8 @@ On this page:
 
 Elsewhere:
 
-- A `high-level introduction `_, to how testing supports validity, reproducibility, interoperability, and reusability, in :mod:`message_ix_models` and related packages.
-- :doc:`api/testing` (:mod:`message_ix_models.testing`), on a separate page.
+- A `high-level introduction `_ to how testing supports validity, reproducibility, interoperability, and reusability in :mod:`message_ix_models` and related packages.
+- :doc:`api/testing` (:mod:`message_ix_models.testing`).
 - :doc:`data` for information about reproducible handling of data, both private and public.
 
 .. _repro-doc:
@@ -28,8 +28,12 @@ Documentation serves different purposes for completed vs. ongoing work:
 
 - Docs **must** be placed in one of the following locations:
 
-  - :file:`doc/model/{variant}.rst`, or :file:`doc/model/{variant}/index.rst`, or :file:`doc/{variant}/index.rst` if there will be multiple documentation pages for the model variant.
-  - :file:`doc/project/{name}.rst`, or :file:`doc/project/{name}/index.rst` or :file:`doc/{name}/index.rst` if there will be multiple documentation pages for the project.
+  - For a model variant that can be documented on a single page: :file:`doc/model/{variant}.rst` or :file:`doc/{variant}.rst`.
+  - For a model variant with multiple documentation pages: :file:`doc/model/{variant}/index.rst` or :file:`doc/{variant}/index.rst`
+  - For a project that can be documented on a single page: :file:`doc/project/{name}.rst` or :file:`doc/{name}.rst`
+  - For a project with multiple documentation pages: :file:`doc/project/{name}/index.rst` or :file:`doc/{name}/index.rst`.
+
+  In either case, the ``{variant}`` or ``{name}`` **must** match the corresponding Python model name, except for the substitution of hyphens for underscores.
 
   In :mod:`message_data`, some docs have been placed ‘inline’ with the code, for example in:
 
@@ -47,9 +51,9 @@ Documentation serves different purposes for completed vs. ongoing work:
 Ongoing projects
 ----------------
 
-Documentation pages for ongoing projects **must** include a :code:`.. warning::` Sphinx directive at the top of the file indicating the code is under development.
-See e.g. :doc:`/transport/index`.
-This section **should** contain one or all of:
+Documentation pages for ongoing projects **must** include a :code:`.. warning::` Sphinx directive at the top of the file indicating the code is under active development.
+See for instance :doc:`/transport/index`.
+This directive **should** contain one or all of:
 
 - Link(s) to GitHub, including:
 
@@ -76,22 +80,22 @@ Documentation for ongoing projects **should** be added to :mod:`message_ix_model
 Completed projects
 ------------------
 
-Doc pages for completed projects **must** specify:
+Documentation pages for completed projects **must** specify all of the following.
 
-- location(s) of scenarios, e.g.
+- Location(s) of scenario data, e.g.
 
-  - ixmp URLS giving the platform (‘database’), model name, scenario name, *and* version for any scenarios.
+  - :mod:`ixmp` URLs giving the platform (‘database’), model name, scenario name, *and* version for any scenarios.
     These **must** allow a reader to distinguish between ‘main’ or meaningful scenarios and other extras that should not be used.
   - Specific external databases, Scenario Explorer instances, etc.
 
-- data sources,
-- reference to code used to prepare data,
-- any special parametrization or structure that is different from the RES or a referenced project, and
-- complete instructions to run all scenarios related to the project.
+- Data sources,
+- Reference to code used to prepare data,
+- Any special parametrization or structure that is different from the RES or a referenced project, and
+- Complete instructions to run all workflow(s) and/or scenarios related to the project.
 
-Doc pages for completed projects **should** include a “Summary” section with all relevant items from the following list.
+The pages **should** also include a “Summary” section with all relevant items from the following list.
 This allows quick/at-a-glance understanding of the model configuration used for a completed project.
-These can be described *directly*, or by *reference*, for the latter, write “same as ” and add a ReST link to a full description elsewhere.
+These can be described *directly*, or by *reference*; for the latter, write “same as ” and add a ReST link to a full description elsewhere.
 
 Example summary section
 ~~~~~~~~~~~~~~~~~~~~~~~
@@ -101,7 +105,7 @@ Versions
 
 Regions
    The regional aggregation used in the project.
-   Refer to one of the :doc:`message_ix_models:pkg-data/node`.
+   Refer to one of the :doc:`pkg-data/node`.
 
 Structure
    The set of technologies, constraints, and other parametrizations.
@@ -150,10 +154,14 @@ These allow others to understand when the code:
 Testing
 =======
 
-The code in :mod:`.model.bare` generates a “bare” reference energy system.
-This is a Scenario that has the same *structure* (ixmp 'sets') as actual instances of the MESSAGEix-GLOBIOM global model, but contains no *data* (ixmp 'parameter' values).
-Code that operates on the global model can be tested on the bare RES; if it works on that scenario, this is one indication (necessary, but not always sufficient) that it should work on fully-populated scenarios.
-Such tests are faster and lighter than testing on fully-populated scenarios, and make it easier to isolate errors in the code that is being tested.
+In addition to atomic/unit tests of individual functions, multiple strategies **may** be used to ensure code works on intended target MESSAGEix-GLOBIOM base scenarios.
+
+- The code in :mod:`.model.bare` generates a **“bare” reference energy system**.
+  This is a Scenario that has the same *structure* (ixmp 'sets') as actual instances of the MESSAGEix-GLOBIOM global model, but contains no *data* (ixmp 'parameter' values).
+  Code that operates on the global model can be tested on this bare RES; if it works on that scenario, this is one indication (necessary, but not always sufficient) that it should work on fully-populated scenarios.
+- :doc:`model/snapshot` can be used as a target for tests.
+
+Such tests are faster and lighter than testing on fully-populated scenarios and make it easier to isolate errors in the code that is being tested.
 
 .. _test-suite:
 
@@ -258,9 +266,9 @@ The :mod:`ixmp` data model uniquely identifies scenarios by the triple of (model
 In other contexts, “external” model names are used; for instance, in data submitted to model comparison projects using the IAMC data structure—‘version’ is omitted, or not accepted/reassigned by the receiving system.
 In these cases, the “external” name:
 
-- may be different from the ‘internally’ name used in IIASA ECE :mod:`ixmp` databases.
-- serves to label and identify MESSAGEix-GLOBIOM model data in contexts where it is compared with other scenarios.
-- *does not*, on its own, suffice to identify the materials and steps to reproduce a scenario.
+- May be different from the ‘internal’ name used in IIASA ECE :mod:`ixmp` databases.
+- Serves to label and identify MESSAGEix-GLOBIOM model data in contexts where it is compared with other scenarios.
+- *Does not*, on its own, suffice to identify the materials and steps to reproduce a scenario.
 
 External model names **must** be recorded as corresponding to specific internal (model name, scenario name, version) identifiers.
 This **should** be done by recording scenario URLs.
@@ -279,8 +287,8 @@ version
    There is no established rule, guideline, or heuristic for what kinds of changes are “minor” or “major”.
    Developers **must**:
 
-   - initiate a discussion with colleagues about when to increment either the major or minor part, and
-   - record (below, or on a variant-specific documentation page) changes associated with an incremented version part.
+   1. Initiate a discussion with colleagues about when to increment either the major or minor part.
+   2. Record (below, or on a variant-specific documentation page) changes associated with an incremented version part.
 
 postfix
    This **should** be omitted if the model structure does not differ from the structure given below for the corresponding ``{name} {version}``.
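
The ``{name} {version}{postfix}`` scheme documented in this patch series can be illustrated with a small parser. This is only a sketch under stated assumptions, not code from :mod:`message_ix_models`: ``ExternalName`` and ``parse_external_name`` are hypothetical names, and the pattern assumes that ``name`` contains no spaces, that ``version`` has exactly a major and a minor part, and that each postfix part is alphanumeric.

```python
import re
from typing import NamedTuple, Tuple


class ExternalName(NamedTuple):
    """Parts of an external model name (hypothetical helper, for illustration)."""

    name: str
    version: Tuple[int, int]  # (major, minor)
    postfix: Tuple[str, ...]  # hyphen-prefixed parts, stored without the hyphens


# Assumptions (not from the patch): no spaces in `name`; `version` is
# "major.minor"; every postfix part is alphanumeric and starts with a hyphen.
_PATTERN = re.compile(
    r"(?P<name>\S+) (?P<major>\d+)\.(?P<minor>\d+)(?P<postfix>(?:-[A-Za-z0-9]+)*)"
)


def parse_external_name(value: str) -> ExternalName:
    """Split `value` into its name, version, and postfix parts."""
    match = _PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(
            f"Not of the form '{{name}} {{version}}{{postfix}}': {value!r}"
        )
    # "-R17-BT".split("-") gives ["", "R17", "BT"]; drop the empty first item.
    parts = tuple(match["postfix"].split("-")[1:])
    return ExternalName(
        match["name"], (int(match["major"]), int(match["minor"])), parts
    )
```

For the example name used in the patch, ``parse_external_name("MESSAGEix-GLOBIOM 2.0-R17-BT")`` gives the name ``MESSAGEix-GLOBIOM``, the version ``(2, 0)`` and the postfix parts ``("R17", "BT")``. The sketch only splits the string. It does not check that postfix parts appear in the documented order or refer to existing node or year code lists, and it does not replace recording the scenario URL.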