Make MAXIMUM_SEED_SIZE_MIB configurable #11177
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
kind: Features | ||
body: Make MAXIMUM_SEED_SIZE_MIB configurable | ||
time: 2023-03-07T13:48:38.792321024Z | ||
custom: | ||
Author: noppaz acurtis-evi | ||
Issue: 7117 7124 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,7 @@ | |
) | ||
from dbt.events.types import InputFileDiffError | ||
from dbt.exceptions import ParsingError | ||
from dbt.flags import get_flags | ||
from dbt.parser.common import schema_file_keys | ||
from dbt.parser.schemas import yaml_from_file | ||
from dbt.parser.search import filesystem_search | ||
|
@@ -123,12 +124,14 @@ | |
|
||
# Special processing for big seed files | ||
def load_seed_source_file(match: FilePath, project_name) -> SourceFile: | ||
if match.seed_too_large(): | ||
# Users can configure the maximum seed size (MiB) that will be hashed for state comparison | ||
maximum_seed_size = get_flags().MAXIMUM_SEED_SIZE_MIB * 1024 * 1024 | ||
# maximum_seed_size = 0 means no limit | ||
if match.file_size() > maximum_seed_size and maximum_seed_size != 0: | ||
# We don't want to calculate a hash of this file. Use the path. | ||
source_file = SourceFile.big_seed(match) | ||
else: | ||
file_contents = load_file_contents(match.absolute_path, strip=True) | ||
checksum = FileHash.from_contents(file_contents) | ||
checksum = FileHash.from_path(match.absolute_path) | ||
I have confirmed that this is not a "breaking" change, insofar as the same seed produces the same checksum before and after this change.

Update: The failing test on Windows seems to indicate that the seed checksum does indeed change as a result of this PR, but only on Windows. The contributor pointed to this code comment as an indication that we actually expect different checksums on Windows versus macOS / Linux. In that test, the checksum for
I think our options are:
I lean toward option (2) for thoroughness.

I wonder whether we want to just do a logic of:
This should provide a seamless upgrade experience and we don't need to have a flag at all. |
||
source_file = SourceFile(path=match, checksum=checksum) | ||
source_file.contents = "" | ||
source_file.parse_file_type = ParseFileType.Seed | ||
|
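The size check introduced in this hunk can be read in isolation as a small predicate. The following is a minimal sketch of that rule only, not the PR's code: `exceeds_seed_limit` and `MIB` are hypothetical names, and the limit is passed in as a plain argument instead of being read from `get_flags()`.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def exceeds_seed_limit(file_size_bytes: int, maximum_seed_size_mib: int) -> bool:
    """Return True when a seed is too large to be hashed (hypothetical helper).

    Mirrors the condition in the diff: a limit of 0 means "no limit",
    so nothing is ever considered too large in that case.
    """
    maximum_seed_size = maximum_seed_size_mib * MIB
    return maximum_seed_size != 0 and file_size_bytes > maximum_seed_size


# A 2 MiB file against a 1 MiB limit is skipped (the path is used instead of a hash).
too_large = exceeds_seed_limit(2 * MIB, 1)
# A file exactly at the limit is still hashed, because the comparison is strict.
at_limit = exceeds_seed_limit(1 * MIB, 1)
# With the limit set to 0, even a very large file is hashed.
no_limit = exceeds_seed_limit(10**12, 0)
```

Note that the comparison is strict, so a seed whose size equals the configured limit is still hashed.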
Non-breaking change in the `dbt/artifacts/` directory, so I will add the `artifact_minor_upgrade` label to this PR.
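The review thread on the checksum line notes that the seed checksum changed only on Windows. As an illustration and not a diagnosis of this PR, the sketch below shows one general way a text-based hash and a byte-based hash of the same file can disagree: line endings. The helpers `hash_from_text` and `hash_from_bytes` are hypothetical stand-ins and are not dbt's `FileHash.from_contents` or `FileHash.from_path`; the assumption that CRLF line endings are involved is mine.

```python
import hashlib
import os
import tempfile


def hash_from_text(path: str) -> str:
    """Hypothetical: hash the decoded, stripped text of a file."""
    # Text mode uses universal newlines, so "\r\n" is read back as "\n".
    with open(path, "r", encoding="utf-8") as fh:
        text = fh.read().strip()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def hash_from_bytes(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hypothetical: hash the raw bytes of a file in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_temp(data: bytes) -> str:
    """Write bytes to a temporary CSV file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    return path


lf_path = write_temp(b"id,name\n1,a")      # LF line endings
crlf_path = write_temp(b"id,name\r\n1,a")  # CRLF line endings

lf_agree = hash_from_text(lf_path) == hash_from_bytes(lf_path)
crlf_agree = hash_from_text(crlf_path) == hash_from_bytes(crlf_path)

os.remove(lf_path)
os.remove(crlf_path)
```

In this sketch the two hashes agree for the LF file and disagree for the CRLF file, because the text-based hash never sees the carriage returns that the byte-based hash includes.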