Introduce ADR for dependencies management in Jupyter notebooks #282
Conversation
Signed-off-by: Francesco Murdaca <[email protected]>
Signed-off-by: Francesco Murdaca <[email protected]>
Signed-off-by: Francesco Murdaca <[email protected]>
This seems to be a good approach to deal with dependencies across the Pipfile and the notebook JSON.
This seems to be a good decision, as it supports humans and cyborgs, and links experiments in notebooks with the corresponding software stacks.
/lgtm
In order to allow any user to re-run the notebook with similar behaviour, it's important that each notebook is shipped with dependency requirements that include direct and transitive dependencies. This would also enforce and support security, reproducibility, and traceability.

Each notebook should be treated as a single component/service that uses its own dependencies; therefore, when storing notebooks, each should be stored in a specific repo.
Is this suggesting that each notebook should be stored in its own repo? This does not sound like a practical approach to me for many projects. But let me know if I'm misunderstanding the use case here : )
Is the assumption being made here that each notebook in a project would have such different dependencies that loading the shared set for the entire project for each notebook would be wasteful or that there would be some incompatibilities when building an image from it? In my experience most of the packages are shared and used across multiple notebooks in a project. Perhaps notebooks would have unique dependencies if we had one notebook for collecting and processing data, one for training a model, and one for serving inferences and reporting metrics. (but there would still be data and model dependencies they would need to share, and I don't think there would be any incompatibilities).
And I don't disagree that this approach can ensure reproducibility per notebook, but I'm not convinced the complexity associated with breaking a project up into multiple repos per notebook outweighs the benefit of absolute reproducibility.
How is this dependency issue managed for pure python projects? I assume there is not a unique repo and requirements for each *.py file. : )
> Is this suggesting that each notebook should be stored in its own repo? This does not sound like a practical approach to me for many projects. But let me know if I'm misunderstanding the use case here : )

Maybe I should say directory. Sorry, not repo; what I meant is, if we use aicoe-aiops templates, in the notebooks directory, instead of having all notebooks directly, there are subdirectories, each with a notebook and its dependencies.

This would enforce reproducibility and also limit complexity in the notebook. Think of having optimized images for each step: for example, pre-processing using Spark, training with TensorFlow on GPU, deployment with Seldon on an edge device. Similar to Python projects, it is better to break things into simple pieces for maintainability, readability, and testing of single parts, and we can also enforce this for optimization purposes. The software stack for each notebook would be smaller, build time would be shorter, images would contain fewer dependencies, and there would be less risk of incompatibilities. Instead of having one large monolithic AI project, we have separated the different tasks into different software stacks.

> Is the assumption being made here that each notebook in a project would have such different dependencies that loading the shared set for the entire project for each notebook would be wasteful or that there would be some incompatibilities when building an image from it? In my experience, most of the packages are shared and used across multiple notebooks in a project.
> Perhaps notebooks would have unique dependencies if we had one notebook for collecting and processing data, one for training a model, and one for serving inferences and reporting metrics. (But there would still be data and model dependencies they would need to share, and I don't think there would be any incompatibilities.)

We should enforce creating a context for each notebook/step. Once you have finished processing, you store the inputs for training, for example. In some cases you don't need the processing library any more (just assuming cases where this can be done easily); you just create another step for post-processing. This would also be enforced by template notebooks, as we discussed before.

> And I don't disagree that this approach can ensure reproducibility per notebook, but I'm not convinced the complexity associated with breaking a project up into multiple repos per notebook outweighs the benefit of absolute reproducibility.

I think it can be double-sided, although the consideration of splitting software stacks into smaller pieces can also be important, as mentioned above. Maybe we can let the user decide: with jupyter-nbrequirements we could have a parameter setting the default place for dependencies that can be changed by the user if necessary: #276

> How is this dependency issue managed for pure python projects? I assume there is not a unique repo and requirements for each *.py file. : )

Right, usually one expects a single Pipfile/Pipfile.lock, as we try to enforce in all repositories. But in some cases, like the ML case, it depends on how complex the application is and how complex each task is; in general, for the reasons mentioned above, it might be more interesting to split and maintain them separately. thoth-station/thamos#464

But that's just one perspective :) Thanks for the reviews @MichaelClifford
> Sorry, not repo; what I meant is, if we use aicoe-aiops templates, in the notebooks directory, instead of having all notebooks directly, there are subdirectories, each with a notebook and its dependencies.

I'm also not very happy about having a dir per notebook. It can easily explode into unnecessary dir traversals and a hard-to-maintain git structure.

> Maybe we can let the user decide: with jupyter-nbrequirements we could have a parameter setting the default place for dependencies that can be changed by the user if necessary: #276

Note that splitting software stacks does not necessarily result in lower complexity. Pre-built container images with all the dependencies shipped (even though the software stack is not minimal) might result in faster response time and less user time spent on installing dependencies when opening a Jupyter notebook.

Hence I see two aspects of this:

- using jupyter-nbrequirements for managing dependencies in notebooks - this is easy to bootstrap and easy to start an experiment with. I, as a data scientist, open a Jupyter notebook and start my experiments; I install whatever software is needed to experiment with the data. jupyter-nbrequirements should keep track of these dependencies.
- maintaining base container images with pre-built software stacks - this is something we can operate on: we can build and provide container images with a specific set of dependencies. Users can select a notebook with specific software (e.g. tensorflow+cuda) and run experiments (as done now on ODH).

The first story will need some work to integrate easily. Managing dependencies directly in Jupyter notebook JSON files is not a nice solution, and managing Pipfile/Pipfile.lock + .thoth.yaml in a separate directory per notebook does not sound like nice UX either.

jupyter-nbrequirements can still work with notebook requirements as done now. The story I see here: if I, as a data scientist, open a notebook, I use jupyter-nbrequirements to manage my dependencies. jupyter-nbrequirements keeps track of dependencies inside Jupyter notebooks as metadata for reproducibility. It can export them to Pipfile/Pipfile.lock if the user requests so, but explicitly. Otherwise it should act just as a thin client that talks to PyPI/Thoth to resolve and install software. Once the work is done, the dependencies can be exported. To save devs time, a container image can be built with the desired set of dependencies (using the exported Pipfile + .thoth.yaml that is managed inside a git repo).
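To make the embedded-metadata idea concrete, here is a minimal sketch of what such metadata could look like inside the notebook JSON. The `requirements` key and its layout are assumptions for illustration only, not the actual jupyter-nbrequirements schema.

```python
import json

# Hypothetical notebook with a Pipfile-like mapping embedded in its metadata;
# the real jupyter-nbrequirements layout may differ.
notebook = {
    "cells": [],
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {
        "requirements": {
            "source": [
                {"name": "pypi", "url": "https://pypi.org/simple", "verify_ssl": True}
            ],
            "packages": {"pandas": "*", "matplotlib": ">=3.3"},
            "requires": {"python_version": "3.8"},
        }
    },
}

# Exporting to Pipfile/Pipfile.lock would then be an explicit action that
# serializes this mapping, rather than something kept in sync automatically.
print(json.dumps(notebook["metadata"]["requirements"], indent=2))
```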
> maintaining base container images with pre-built software stacks - this is something we can operate on: we can build and provide container images with a specific set of dependencies. Users can select a notebook with specific software (e.g. tensorflow+cuda) and run experiments (as done now on ODH).

If using JupyterHub, yes, but with Elyra things will be different: you have an AI pipeline where each notebook or Python script is a step, and for each step you need to select a runtime to be used when running the pipeline. In that case I can choose not only images existing on ODH but also my own images, created and available on some registry, to run a specific step. Maybe we built an optimized image for deployment, or an image optimized by Thoth for performance for the training step, which may conflict with a step that requires Dask for heavy data processing on huge datasets on a remote cluster (something that can be done with Elyra and Kubeflow Pipelines as well, or with Jupyter Enterprise Gateway if that is planned).
> If using JupyterHub, yes, but with Elyra things will be different: you have an AI pipeline where each notebook or Python script is a step, and for each step you need to select a runtime to be used when running the pipeline.

Does the runtime environment need to be specified in the Jupyter notebook itself? Can we use runtime autodiscovery for this, as done in thamos?

BTW, handling requirements could also be discussed with the Jupyter upstream. They could be interested in this functionality to provide a better notebook experience.
> Does the runtime environment need to be specified in the Jupyter notebook itself? Can we use runtime autodiscovery for this, as done in thamos?

You can create runtimes to be used in Kubeflow Pipelines using the Elyra command line from the console or from the UI button for runtimes. Once a runtime exists, you can basically submit a notebook.

> BTW, handling requirements could also be discussed with the Jupyter upstream. They could be interested in this functionality to provide a better notebook experience.

Thanks @fridex, we will open an issue.
Signed-off-by: Francesco Murdaca <[email protected]>
Updated
* 1. Jupyter notebook without dependencies (no reproducibility)
* 2. Jupyter notebook without dependencies embedded in the JSON file, but with Pipfile/Pipfile.lock always present (Jupyter notebook and requirements are decoupled)
* 3. Jupyter notebook with dependencies embedded in the JSON file of the notebook, and Pipfile/Pipfile.lock present
Does this mean that for a repo, each notebook will live in its own dir with its own Pipfile/Pipfile.lock, as well as having its dependencies embedded in JSON?
I'll admit I don't fully understand the dependency management process (and @fridex can probably answer this question better 😄 ), but isn't it redundant and potentially error-prone to maintain dependencies both in the notebook and as a Pipfile? Shouldn't the decision be one or the other? In which case, I think embedded would be the way to go for each notebook, with a single overarching project Pipfile for the whole repo (kind of how projects are set up currently). Or is that what Option 3 is saying already?
+1
We should keep dependencies embedded in the notebook all the time. Having them aside is an action that should be triggered explicitly when exporting them, or when importing dependency listing from Pipfile/Pipfile.lock.
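As a rough illustration of that explicit export step, here is a minimal sketch of reading dependency metadata embedded in a notebook's JSON and writing it out as a Pipfile. The `requirements` metadata key, its shape, and the notebook path are assumptions for the example, not the actual jupyter-nbrequirements format or API.

```python
import json

import toml  # third-party: pip install toml


def export_pipfile(notebook_path: str, pipfile_path: str = "Pipfile") -> None:
    """Export dependency metadata embedded in a notebook to a Pipfile.

    Assumes the notebook keeps a Pipfile-like mapping under
    metadata["requirements"]; the key actually used by jupyter-nbrequirements
    may differ.
    """
    with open(notebook_path) as f:
        notebook = json.load(f)

    requirements = notebook.get("metadata", {}).get("requirements")
    if requirements is None:
        raise ValueError(f"{notebook_path} has no embedded requirements metadata")

    # The metadata may be stored as a JSON string or as a plain dict;
    # handle both shapes defensively.
    if isinstance(requirements, str):
        requirements = json.loads(requirements)

    # A Pipfile is TOML, so the embedded mapping can be dumped directly.
    with open(pipfile_path, "w") as f:
        toml.dump(requirements, f)


export_pipfile("notebooks/data-exploration.ipynb")
```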
> Does this mean that for a repo, each notebook will live in its own dir with its own Pipfile/Pipfile.lock, as well as having its dependencies embedded in JSON?

No; as we discussed at the last DS meetup, we decided not to consider the option of one repo per notebook. But thinking of what you and @fridex said, maybe we can restructure it as: Jupyter notebook with dependencies embedded in the JSON file of the notebook that can be optionally extracted.

But what about the main Pipfile/Pipfile.lock? If I work on three different notebooks, creating dependencies for each, they will be different.
If we want to create an image to run those notebooks, we need a single Pipfile/Pipfile.lock with the dependencies from all notebooks.
How do we deal with having one single Pipfile/Pipfile.lock and different notebooks, each with their own dependencies?
Maybe notebook 1 requires only numpy, pandas, and matplotlib, but notebook 2 only tensorflow.
Do we need some way to merge them, syncing a common Pipfile/Pipfile.lock that can be used to run them all?
> How do we deal with having one single Pipfile/Pipfile.lock and different notebooks, each with their own dependencies? Do we need some way to merge them, syncing a common Pipfile/Pipfile.lock that can be used to run them all?

Yes, these files are just TOML and JSON files. We have tooling in thoth-python that can merge these files and keep consistency (e.g. check the computed hash, avoid duplicates, ...). The workflow should include Thoth: a single Pipfile is created out of all the notebooks and Thoth resolves the Pipfile.lock. The Thoth part is required, as these dependencies can have issues between them.
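As a rough sketch of what such a merge could look like: this is not the thoth-python API, just an illustration with the third-party `toml` package and illustrative paths; real merging would also handle sources, dev-packages, markers, and lock hashes, and the merged Pipfile would still be resolved into a Pipfile.lock by Thoth.

```python
import toml  # third-party: pip install toml


def merge_pipfiles(paths, output="Pipfile"):
    """Naively merge the [packages] sections of several Pipfiles into one.

    Conflicting version specifiers are reported rather than silently
    overwritten; resolving them is left to the resolver (e.g. Thoth).
    """
    merged = {
        "source": [
            {"name": "pypi", "url": "https://pypi.org/simple", "verify_ssl": True}
        ],
        "packages": {},
    }

    for path in paths:
        pipfile = toml.load(path)
        for name, spec in pipfile.get("packages", {}).items():
            existing = merged["packages"].get(name)
            if existing is not None and existing != spec:
                raise ValueError(
                    f"conflicting specifiers for {name}: {existing!r} vs {spec!r}"
                )
            merged["packages"][name] = spec

    with open(output, "w") as f:
        toml.dump(merged, f)


# Example: per-notebook Pipfiles merged into one repo-level Pipfile.
merge_pipfiles(["notebooks/preprocessing/Pipfile", "notebooks/training/Pipfile"])
```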
Thanks @fridex, I will proceed in this way! I will update the ADR.
> Jupyter notebook with dependencies embedded in the JSON file of the notebook that can be optionally extracted.
Sounds good to me.
Maybe add a bit more specificity to it? "Jupyter notebook with dependencies embedded in json file of the notebook that can be optionally extracted as a merged Pipfile via Thoth"
I think we will have two options:

- One in the notebook itself, to extract Pipfile/Pipfile.lock from the notebook.
- One other button, maybe in the menu under the Kernel tab, that would look at all notebooks and create a merged Pipfile and Pipfile.lock.

Jupyter notebook with dependencies embedded in the JSON file of the notebook that can be optionally extracted if the user wants.
If more notebooks are present, a common Pipfile can be created with a button that automatically extracts the dependencies from all notebooks, and a new common Pipfile.lock will be created. This would allow the creation of an image that can run all the notebooks.
WDYT?
👍
Thanks @MichaelClifford @fridex!
Signed-off-by: Francesco Murdaca <[email protected]>
I will add this ADR also to the jupyterlab extension if it's all good for you @MichaelClifford @fridex. See: thoth-station/jupyterlab-requirements#5
LGTM
/approve
[APPROVALNOTIFIER] This PR is APPROVED.
This pull request has been approved by: MichaelClifford, pacospace. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing /approve in a comment.
Signed-off-by: Francesco Murdaca [email protected]
@MichaelClifford @sophwats