Description
The existing databricks.ManagedTableDataset does not let you specify the location of the stored files, which is crucial in some setups. PR #251 already addresses this, but it seems to be stale.
Context
I develop a number of Kedro projects that are deployed to Databricks. A single dataset that handles both pandas and Spark DataFrames, and can write to (and read from) a DBX database, would be a lifesaver, as long as I could specify the path.
Possible Implementation
In Spark, adding the path option is enough to make a table external. I'm not sure it would be as simple here, though.
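For reference, this is roughly what I mean in plain PySpark. It is only a sketch: save_table and its parameters are names I made up for illustration, and it assumes df is a regular Spark DataFrame. The only Spark calls used are the DataFrameWriter methods format, mode, option and saveAsTable.

```python
from typing import Any, Optional


def save_table(
    df: Any,
    table: str,
    path: Optional[str] = None,
    fmt: str = "delta",
    mode: str = "overwrite",
) -> None:
    """Write a Spark DataFrame as a table (hypothetical helper).

    Without `path` the table is managed by the metastore; with `path`
    Spark keeps the files at that location and the table is external.
    """
    writer = df.write.format(fmt).mode(mode)
    if path is not None:
        # The "path" option is what turns the table into an external one.
        writer = writer.option("path", path)
    writer.saveAsTable(table)
```

Calling save_table(df, "my_db.my_table", path="/mnt/data/my_table") would then register an external table, while omitting path keeps the current managed behaviour. The table name and mount path here are placeholders.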
Possible Alternatives
Adding an argument to ManagedTableDataset is also an option, but then the table would no longer really be managed, which might cause some confusion.
Sure, as soon as I'm able to :)
In the meantime: would you rather create a separate ExternalTableDataset that shares a lot of code with ManagedTableDataset (possibly through inheritance), or just add an option to set the path (as in the current PR) and risk a little confusion among Databricks users?
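To make the inheritance option a bit more concrete, here is a rough sketch. TableSketch and ExternalTableSketch are made-up stand-ins, not the kedro-datasets classes or their API; the point is only that the external variant would override where the files go and reuse everything else.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TableSketch:
    """Hypothetical stand-in for the shared (managed) table logic."""

    table: str
    database: str = "default"

    def location(self) -> Optional[str]:
        # Managed table: the metastore decides where the files live.
        return None

    def writer_options(self) -> Dict[str, str]:
        # Options that would be passed to the Spark writer on save.
        loc = self.location()
        return {} if loc is None else {"path": loc}


@dataclass
class ExternalTableSketch(TableSketch):
    """Hypothetical external variant: same logic, user-supplied path."""

    path: str = ""

    def location(self) -> Optional[str]:
        # Treat an empty path as "not set" in this sketch.
        return self.path or None
```

The alternative from the PR would collapse this into one class with an optional path argument, which is less code but keeps the "Managed" name on a table that may be external.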