Reproducibility and shareability of notebooks are essential if you want others to be able to repeat your experiments and avoid issues caused by dependency management.
When you use pip install <package_name>, it is not possible to verify which software stack was used to run the notebook, and therefore another user cannot repeat the same experiment.
Dependency management is one of the most important requirements for reproducibility. Having dependencies clearly stated allows portability of notebooks, so they can be shared safely with others, reused in other projects or simply reproduced. If you want to know more about this issue in the data science domain, have a look at this article or this video.
In order to help developers (including data scientists), dependencies for Jupyter notebooks in this tutorial are managed using the JupyterLab extension jupyterlab-requirements.
You can use this extension for each of your notebooks to guarantee they have the correct dependencies. The extension can add and remove dependencies, lock them, and store them in the notebook metadata. In this way, all the dependency information required to recreate the environment is shipped with the notebook.
In particular, the following notebook metadata is created for you when you use Thoth's dependency management tool:
- requirements (Pipfile);
- requirements lock with all versions and hashes (Pipfile.lock);
- dependency resolution engine used (Thoth or Pipenv);
- .thoth.yaml configuration file (only for Thoth resolution engine).
Together, this information makes the notebook reproducible and shareable.
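As a rough illustration of what "shipped with the notebook" means, the sketch below checks whether a notebook's metadata contains dependency entries. The key names (requirements, requirements_lock, dependency_resolution_engine, thoth_config) are assumptions made for this example and should be verified against a notebook actually saved by the extension; the helper names are hypothetical and not part of jupyterlab-requirements.

```python
import json

# Assumed metadata key names (not confirmed by this tutorial); verify them
# against a notebook saved with jupyterlab-requirements before relying on them.
EXPECTED_KEYS = (
    "requirements",                  # Pipfile content
    "requirements_lock",             # Pipfile.lock content
    "dependency_resolution_engine",  # "thoth" or "pipenv"
    "thoth_config",                  # .thoth.yaml content (Thoth engine only)
)


def load_notebook(path: str) -> dict:
    """Read an .ipynb file, which is plain JSON, into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def dependency_metadata_report(notebook: dict) -> dict:
    """Map each expected key to True if it appears in the notebook metadata."""
    metadata = notebook.get("metadata", {})
    return {key: key in metadata for key in EXPECTED_KEYS}


if __name__ == "__main__":
    # Hypothetical in-memory notebook, used instead of a real file.
    example = {
        "cells": [],
        "metadata": {
            "requirements": "{}",
            "dependency_resolution_engine": "pipenv",
        },
    }
    for key, present in dependency_metadata_report(example).items():
        print(f"{key}: {'present' if present else 'missing'}")
```

To inspect a real file, pass the result of load_notebook("<your notebook>.ipynb") to dependency_metadata_report. A notebook with missing entries has not had its dependencies stored yet.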
There are three ways to interact with the jupyterlab-requirements JupyterLab extension:
- using %horus magic commands directly in your notebook's cells (preferred approach). To learn more about how to use the %horus magic commands, check out the guide here or the video here.
- using the horus CLI directly from the terminal or integrated in pipelines (check the video or this link if you want to know more about it).
- using the Manage Dependencies button that appears in the notebook when it is opened (check the link if you want to know more about it):
NOTE: In this tutorial we will focus on %horus magic commands.
Choose the dependency management use case you are interested in: