[4/n] add python api for replacing local file references with source control links #21675
    source_control_branch: str,
    repository_root_absolute_path: Union[Path, str],
) -> Sequence[Union["AssetsDefinition", "SourceAsset", "CacheableAssetsDefinition"]]:
    if "gitlab.com" in source_control_url:
I figure the market share of these two is large enough that the fallback case doesn't matter too much, but it's worth flagging that we can only explicitly build code links for these right now.
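As a rough illustration of the host-specific branching discussed here, a minimal sketch follows. The helper name `build_base_source_control_url`, the URL layouts (`/tree/<branch>` for GitHub, `/-/tree/<branch>` for GitLab), and the choice to raise on unknown hosts are assumptions for illustration, not necessarily what this PR does.

```python
def build_base_source_control_url(source_control_url: str, source_control_branch: str) -> str:
    """Hypothetical helper: return the URL prefix under which repository files can be browsed."""
    url = source_control_url.rstrip("/")
    if "gitlab.com" in url:
        # Assumed GitLab layout: <repo>/-/tree/<branch>/<path>
        return f"{url}/-/tree/{source_control_branch}"
    if "github.com" in url:
        # Assumed GitHub layout: <repo>/tree/<branch>/<path>
        return f"{url}/tree/{source_control_branch}"
    # Assumption: surface unsupported hosts explicitly rather than guessing a format.
    raise ValueError(f"Unsupported source control host: {source_control_url}")
```

Raising on unknown hosts is only one option; a real fallback could equally default to one of the two layouts.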
Moving ts files to another PR would be nice for easier review.
Main question is whether we can infer some of these parameters on behalf of the user.
@experimental
@whitelist_for_serdes
class CodeReferencesMetadataValue(DagsterModel, MetadataValue["CodeReferencesMetadataValue"]):
naming: I think we can drop the `s` and make `CodeReferences` into just `CodeReference`?
"Reference" is one of those nice irregular nouns that can describe a plurality in its singular form (e.g. a "reference" can hold many "references").
Suggested change:
- class CodeReferencesMetadataValue(DagsterModel, MetadataValue["CodeReferencesMetadataValue"]):
+ class CodeReferenceMetadataValue(DagsterModel, MetadataValue["CodeReferencesMetadataValue"]):
Also need to update the docstring.
I think this is showing as new because the stacked PRs were rebased. It's not new in this one; same with the `.ts` files, unfortunately. Should be fixed now at least.
See this PR for a brief naming discussion. Personally I bias towards `References` because it makes clear that the value can hold more than one reference. When I see `reference` in the wild I typically interpret it as singular unless it's used as an adjective (e.g. `reference material`).
        return self


def local_source_path_from_fn(fn: Callable[..., Any]) -> Optional[LocalFileCodeReference]:
nit: use `pathlib` :)
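For context on the `pathlib` nit, here is a minimal sketch of locating a function's source with `inspect` and returning a `Path`. The helper name `local_source_location` and its return shape are hypothetical; the PR's `local_source_path_from_fn` returns a `LocalFileCodeReference`, whose fields are not shown in this thread.

```python
import inspect
from pathlib import Path
from typing import Any, Callable, Optional, Tuple


def local_source_location(fn: Callable[..., Any]) -> Optional[Tuple[Path, int]]:
    """Hypothetical helper: return (absolute file path, first line number) for fn, if available."""
    try:
        source_file = inspect.getsourcefile(fn)
        _, line_number = inspect.getsourcelines(fn)
    except (TypeError, OSError):
        # Builtins and objects without retrievable source end up here.
        return None
    if source_file is None:
        return None
    return Path(source_file).resolve(), line_number
```

Returning `None` instead of raising mirrors the `Optional[...]` return type in the signature under review.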
@experimental
def link_to_source_control(
Could we potentially infer this information on behalf of the user by running a `git` command?
Yes, or just walking the folder structure (since in some cases the user may not install git in the target image, in the cloud context at least). Stacked PR will have a layered utility fn that will have default behavior.
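A minimal sketch of the "walk the folder structure" idea, assuming the repository root is the nearest ancestor directory that contains a `.git` entry. The helper name `find_repository_root` is hypothetical and is not the utility the stacked PR introduces.

```python
from pathlib import Path
from typing import Optional, Union


def find_repository_root(start: Union[Path, str]) -> Optional[Path]:
    """Hypothetical helper: walk upward from start until a directory containing .git is found."""
    current = Path(start).resolve()
    if current.is_file():
        current = current.parent
    for candidate in (current, *current.parents):
        # .git is a directory in a normal clone and a file in worktrees or submodules.
        if (candidate / ".git").exists():
            return candidate
    return None
```

This approach needs no `git` executable, but it only works if the `.git` entry is actually present in the target image.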
The proliferation of these public helpers makes me a little nervous, especially because there's functionality that's specific to individual technologies. E.g. should this be part of dagster-github instead? I don't have a grand unifying philosophy of how they should be named which also makes me a little nervous.
So I think it's worth getting @schrockn's thoughts too if he has bandwidth.
@benpankow thanks for bearing with the agonizing naming etc. process on these public APIs.
@sryza thanks for flagging me on this one. I too share the concern about exposing all of these top-level exports. I didn't realize we were exposing them at the top-level. I'd rather not do that. I was assuming we were building these to get a heartbeat up and running for our own usage, rather than having them be publicly exposed APIs.
    UrlMetadataValue as UrlMetadataValue,
    link_to_source_control as link_to_source_control,
    with_source_code_references as with_source_code_references,
I didn't realize `with_source_code_references` was a top-level export. Has this shipped already?
No, it'll go out with next week's release, so we can definitely roll this back.
Will drop the top-level export for now.
def _convert_local_path_to_source_control_path_single_definition(
    base_source_control_url: str,
    repository_root_absolute_path: str,
    assets_def: Union["AssetsDefinition", "SourceAsset", "CacheableAssetsDefinition"],
) -> Union["AssetsDefinition", "SourceAsset", "CacheableAssetsDefinition"]:
    from dagster._core.definitions.assets import AssetsDefinition

    # SourceAsset doesn't have an op definition to point to - cacheable assets
    # will be supported eventually but are a bit trickier
    if not isinstance(assets_def, AssetsDefinition):
        return assets_def

    metadata_by_key = dict(assets_def.metadata_by_key) or {}
I believe I already tagged @smackesey at some point where I saw similar code, but in my view it would be far preferable to structure this to work on a canonicalized `AssetsDefinition`.
Cool makes sense to me. Thanks for working through this!
lgtm as well!
…control links (dagster-io#21675)

## Summary

Adds the `link_to_source_control` utility fn which converts all local source code reference metadata in the passed assets to references to source control. These local paths are mapped to the path in source control using a user-passed local git root. The path from the git root to the file locally is used as the filepath within source control.

For example:

```python
@asset
def my_asset() -> pd.DataFrame: ...

@asset
def my_other_asset() -> pd.DataFrame: ...

defs = Definitions(
    assets=link_to_source_control(
        with_source_code_references([my_asset, my_other_asset]),
        source_control_url="https://github.com/dagster-io/dagster",
        source_control_branch="master",
        repository_root_absolute_path=file_relative_path(__file__, "../../"),
    )
)
```

A stacked PR will introduce a utility method that will wrap both `link_to_source_control` and `with_source_code_references` and will decide whether to link to source control based on whether the definitions are being loaded in a cloud context.

## Test Plan

Unit tests.
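The path mapping described in the summary can be sketched as follows. This is an illustration under stated assumptions rather than the PR's implementation: the helper name `to_source_control_url`, the `line_number` parameter, and the `#L<n>` line anchor are hypothetical, and the base URL is assumed to already include the branch segment.

```python
from pathlib import Path
from typing import Optional, Union


def to_source_control_url(
    base_source_control_url: str,
    repository_root_absolute_path: Union[Path, str],
    local_file_path: Union[Path, str],
    line_number: Optional[int] = None,
) -> str:
    """Hypothetical helper: map a local file path to a URL inside source control."""
    # The path from the git root to the local file becomes the path within source control.
    # relative_to raises ValueError if the file is not under the repository root.
    relative_path = Path(local_file_path).relative_to(repository_root_absolute_path)
    url = f"{base_source_control_url.rstrip('/')}/{relative_path.as_posix()}"
    if line_number is not None:
        # Assumed line-anchor format.
        url += f"#L{line_number}"
    return url
```

For instance, with a root of `/repo` and a file at `/repo/pkg/assets.py`, the relative path `pkg/assets.py` is appended to the base URL.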