Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP TarCompression #24211

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Conversation

zackattackz
Copy link
Contributor

This is a WIP commit to add tar compression options to containers lib. Will need to be split into several commits across multiple repos.

This is a proposed feature for #23192

I wanted to share this current WIP though to get some feedback, before I went about the process of splitting this into different PRs and making / fixing tests.

The overall idea is to add two new options, TarCompressionFormat / --tar-compression-format and TarCompressionLevel / --tar-compression-level to be used throughout podman's push, save, and scp subcommands. For push and save, the options are passed to specific transport implementations, in this case the oci-archive and docker-archive transports. They both have different implementation details for enabling compression of the resulting tar file.

For the scp subcommand, it is already using save under the hood, so to enable compression we would just pass the new options to the save subcommand. This would only work with remote transfers, but this is fine because I don't think(?) there should be a need for compression on local transfers.

I took care to ensure that everything is backwards compatible, however the only one place that couldn't be backwards compatible would be the change to the ImageEngine interface's Scp function.

This change would be needed to allow for specifying compression options to Scp. But, more broadly, adding an options type here would be great for any future changes to SCP options to allow backwards compatibility.

Other than that, I would think everything else is compatible (could be wrong though).

Another change worth noting is in common/libimage/save.go and common/libimage/copier.go.

In save.go, it was necessary to use the system context AFTER the copy options are set into it. This would normally happen if you used NewCopier().systemContext, but in this case a writer is created before the copier is created, and it doesn't get any of the copy options passed into it. So to solve this I just made new functions in copier.go Runtime.newScWithCopyOpts and ScWithCopyOpts. These methods just abstract away the part of overwriting system context opts with copy opts when NewCopier is called

Also worth noting is the potential confusion between these new options and the existing Compression related options. The existing options were designed explicitly for compression layers in docker dir transport, not with any other kind of compression, so I thought it made more sense and was safer to create new options than to repurpose the exiting ones.

As far as what needs to be done: Of course tests will need to be done. Also, the scp command right now just has a forced compression of gzip, but I need to make the command line options for it (they already are working for save and push though). There probably should be checks added to ensure the options are used correctly. For instance, the --compress flag should probably never be allowed to be used with --tar-compression-format. Finally, a --tar-compress bool flag could be added to just enable gzip format by default.

I appreciate any feedback! I will move forward with trying to separate this into multiple PRs soon.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 8, 2024
Copy link
Contributor

openshift-ci bot commented Oct 8, 2024

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci openshift-ci bot added the do-not-merge/release-note-label-needed Enforce release-note requirement, even if just None label Oct 8, 2024
Copy link
Contributor

openshift-ci bot commented Oct 8, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: zackattackz
Once this PR has been reviewed and has the lgtm label, please assign l0rd for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Luap99
Copy link
Member

Luap99 commented Oct 9, 2024

The overall idea is to add two new options, TarCompressionFormat / --tar-compression-format and TarCompressionLevel / --tar-compression-level to be used throughout podman's push, save, and scp subcommands. For push and save, the options are passed to specific transport implementations, in this case the oci-archive and docker-archive transports. They both have different implementation details for enabling compression of the resulting tar file.

Also worth noting is the potential confusion between these new options and the existing Compression related options. The existing options were designed explicitly for compression layers in docker dir transport, not with any other kind of compression, so I thought it made more sense and was safer to create new options than to repurpose the exiting ones.

That seem rather implementation detail specific where a normal user would not really care in particular when there is already a --compression-format option on the push command so having two similar options is just much more confusing IMO.


Overall you need to start in c/image to design API changes there first that have to get approved by @mtrmac there. I guess overall the question is why is this needed in podman? Anyone (including podman image scp) could just pipe the output into gzip/zstd, etc.... so I don't think these changes would be needed to add compression support for scp.

What I think would be far more interesting in terms of c/image API is to expose a generic writer there containers/common#1275 which would have use cases elsewhere, i.e. the HTTP API. And which such a writer the caller could just pass that into the compression function themselves without having to expose all the compression implementation details into the c/image API.

@zackattackz
Copy link
Contributor Author

Anyone (including podman image scp) could just pipe the output into gzip/zstd, etc.

This is true. I was thinking to show this solution first that tried to reuse the existing logic for compression related stuff, rather than just shell out to another binary. I thought there could be benefits to reusing that logic and also not requiring additional binaries. But I definitely understand if that way would be preferable. In that case, how do you think it should be used? Just a --compress and --compress-command flag for podman image scp? Using --compress would just default to piping to gzip, and --compress-command could be used to specify the exact command + args to use.

I'd be down to try that out if you think it's better.

What I think would be far more interesting in terms of c/image API is to expose a generic writer

Oh I do like that idea far better. I could try to look into that. It does seem like a tall order so can't promise anything, but that would be ideal.

@mtrmac
Copy link
Collaborator

mtrmac commented Oct 9, 2024

First, please https://github.com/containers/podman/blob/main/CONTRIBUTING.md#sign-your-prs ; it’s hard to discuss code with unclear copyright status.

This is a WIP commit to add tar compression options to containers lib. Will need to be split into several commits across multiple repos.

Signed-off-by: Zachary Hanham <[email protected]>
@zackattackz
Copy link
Contributor Author

@mtrmac Apologies, did so now!

@zackattackz
Copy link
Contributor Author

On further thought with shelling out: If I were to, it would probably be best to implement as --encode-command which would encode the data when saved (compress in this case) and a --decode-command to process the data before it is loaded. Would be a little more generic this way, and also would make sense to include an analogue for loading,

@zackattackz
Copy link
Contributor Author

Though an option that would allow running an arbitrary command sounds like it could be a security risk. So will need to consider that.

@zackattackz
Copy link
Contributor Author

I made a new PR at #24234 which is implementing the non-breaking changes to add options types to various scp functions. It takes a lot of code from this one, but is a little different.

@zackattackz zackattackz marked this pull request as draft October 15, 2024 19:41
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 16, 2024
@openshift-merge-robot
Copy link
Collaborator

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/release-note-label-needed Enforce release-note requirement, even if just None do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants