Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure dependencies / image building #104

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

TomAugspurger
Copy link
Contributor

Description

Updates pctasks' packaging to facilitate easier image building for downstream tasks. Currently, building a Docker image for a dataset requires copy-pasting the "base" dockerfile around, which is cumbersome.

A possible alternative was to have the dataset Dockerfile use a FROM pctasks-base to add its additional dependencies. But this would require multiple pip solves / installs, which can be challenging to ensure you end up with a consistent, valid environment. It's best to do that all at once if possible.

So this adds a new scripts/generate-requirements that uses pip-compile to find all the requirements.

To help avoid breaking the cache too many times, we split that output into dependencies and pctasks modules.

Now our dataset tasks just need to build an image using the same definition. See https://github.com/microsoft/planetary-computer-tasks/compare/tom/fix/packaging-dev?expand=1#diff-49a5b2bd27cc6579e5bc27e44e1b033d8239d83131d28ce42b89a3d26013974aR29

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

Checklist:

Please delete options that are not relevant.

  • I have performed a self-review
  • Documentation has been updated
  • Unit tests pass locally (./scripts/test)
  • Code is linted and styled (./scripts/format)

@TomAugspurger TomAugspurger changed the title Restructure image building Restructure dependencies / image building Dec 20, 2022
pctasks/cli/dev_requirements.txt Show resolved Hide resolved
@@ -0,0 +1,227 @@
#
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure if this file should be committed to git. I think it shouldn't, but would be curious to hear what others think.

For now, I've included it so people can see the output.

@@ -0,0 +1,20 @@
#!/bin/bash
# Usage: ./scripts/generate-requirements [optional paths to additional requirements.txt]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This argument parsing could be improved:

  1. It could maybe take an output file / files for where to write stuff.
  2. It doesn't use any of the fancy argument parsing stuff from the other scripts for things like help, etc.

@TomAugspurger
Copy link
Contributor Author

Test failure was from (some tests) requiring the pctasks CLI within the Task container. Apparently pip install file://path/to/pctasks wasn't getting the console scripts installed properly. With a -e things are seemingly working.

@TomAugspurger TomAugspurger mentioned this pull request Apr 18, 2023
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant