Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow configuring larger max layer size in build time #314

Merged
merged 4 commits into from
Oct 31, 2024

Conversation

2rigor
Copy link
Contributor

@2rigor 2rigor commented Oct 27, 2024

Closes #311

E.g., for 10Gb (10*10^9 bytes) append to go build command:
-ldflags "-w -s -X github.com/anchore/stereoscope/pkg/file.perFileReadLimitStr=10000000000"

@2rigor 2rigor marked this pull request as draft October 27, 2024 14:42
@2rigor 2rigor marked this pull request as ready for review October 27, 2024 15:50
@2rigor
Copy link
Contributor Author

2rigor commented Oct 28, 2024

@wagoodman I don't know how it works. I see you were involved in many PRs. Will you guys automatically have a look at new PR or I should assign it (to whom)? Thanks!

Copy link
Contributor

@kzantow kzantow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @2rigor -- thanks for the contribution. We might like to take a different approach to allow for modifying the per-file read limit. Specifying this at build time doesn't allow configuration at runtime, which is probably the most important spot.

What if instead of making the change as you've implemented it, we just change the perFileReadLimit to a var, and add an exported function to set it; e.g.:

var perFileReadLimit = 2 * GB

func SetPerFileReadLimit(maxBytes int64) {
  // some reasonable validation like maxBytes > 0?
  perFileReadLimit = maxBytes
}

... this allows library users including Syft to configure this value via configuration files at runtime.

@2rigor
Copy link
Contributor Author

2rigor commented Oct 30, 2024

Hey @2rigor -- thanks for the contribution. We might like to take a different approach to allow for modifying the per-file read limit. Specifying this at build time doesn't allow configuration at runtime, which is probably the most important spot.

What if instead of making the change as you've implemented it, we just change the perFileReadLimit to a var, and add an exported function to set it; e.g.:

var perFileReadLimit = 2 * GB

func SetPerFileReadLimit(maxBytes int64) {
  // some reasonable validation like maxBytes > 0?
  perFileReadLimit = maxBytes
}

... this allows library users including Syft to configure this value via configuration files at runtime.

Hi @kzantow.
Thanks for the response and the suggestion. I apologize for the delay
Indeed your suggestion sounds good.
I suggest both leaving the build time option and adding runtime one. I know it's a bit more complicated, but it will allow both configuration ways. And eventually it will be very short.
What do you think?

There is a minor disadvantage of providing an ability to update during runtime - since this variable can be read from multiple goroutines (even if today it is not). I think I saw PR in syft about handling layers in parallel. Practically setting the variable would happen (should happen) before starting the traversal over the image, but having such a function provides an option to change the variable at any time. I still think it's acceptable. Just wanted to share my thoughts.

@2rigor 2rigor force-pushed the larger-layer branch 2 times, most recently from f8d2d5d to 6ee4035 Compare October 30, 2024 19:07
@kzantow
Copy link
Contributor

kzantow commented Oct 30, 2024

but having such a function provides an option to change the variable at any time. I still think it's acceptable. Just wanted to share my thoughts.

Yes, I don't think this is a problem, though -- if the value gets changed, it gets changed for the next reader; I don't think there's a concurrency issue with this. You can set this at initialization time by creating an init() function in your code.

@2rigor
Copy link
Contributor Author

2rigor commented Oct 31, 2024

but having such a function provides an option to change the variable at any time. I still think it's acceptable. Just wanted to share my thoughts.

Yes, I don't think this is a problem, though -- if the value gets changed, it gets changed for the next reader; I don't think there's a concurrency issue with this. You can set this at initialization time by creating an init() function in your code.

Got you, I agree it's better. Implemented as you originally suggested

Copy link
Contributor

@kzantow kzantow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- thanks for the contribution @2rigor !

@kzantow kzantow merged commit 2ce1e52 into anchore:main Oct 31, 2024
7 checks passed
@tomersein
Copy link

how can i set this value via syft? @kzantow

@kzantow
Copy link
Contributor

kzantow commented Nov 10, 2024

@tomersein -- there is no way to configure this in Syft at the moment, but you could add a configuration to one of the options that in a PostLoad() hook, reads a value and sets this using the stereoscope hook that was added.

@tomersein
Copy link

hi, the 1st part I think I understand, maybe under catalogers field (because it is related to scan)
but didn't understand the 2nd part, where is the hook of stereoscope and how can i add it? :) @kzantow

@kzantow
Copy link
Contributor

kzantow commented Nov 11, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Configure max layer/file size
3 participants