Unable to import a bundle without all dependencies included #31443

Open
3 tasks done
withnale opened this issue Dec 13, 2024 · 1 comment
Labels
  • api:dashboard: Related to the REST endpoints of the Dashboard
  • dashboard:import: Related to importing dashboards
  • viz:charts:import: Related to importing charts

Comments

@withnale

Bug description

The example below discusses the use case for /api/v1/assets/import/, but similar logic appears in the Import*Command implementations behind the dataset-, chart-, and dashboard-specific APIs.

When using /api/v1/assets/import/ to import a dashboard, the endpoint forces every resource the dashboard depends on to be included in the import.

If we look at the implementation, it reads something like the code below. Each 'startswith' block maintains a dict
of the resources it has imported, and the subsequent block uses that dict as a lookup (only the datasets block
is shown in full).

        # import databases first
        database_ids: dict[str, int] = {}
        for file_name, config in configs.items():
            if file_name.startswith("databases/"):
                ... 
        # import saved queries
        for file_name, config in configs.items():
            if file_name.startswith("queries/"):
                ...

        # import datasets
        dataset_info: dict[str, dict[str, Any]] = {}
        for file_name, config in configs.items():
            if file_name.startswith("datasets/"):
                config["database_id"] = database_ids[config["database_uuid"]]
                dataset = import_dataset(config, overwrite=True)
                dataset_info[str(dataset.uuid)] = {
                    "datasource_id": dataset.id,
                    "datasource_type": dataset.datasource_type,
                    "datasource_name": dataset.table_name,
                }
            
        # import charts
        charts = []
        chart_ids: dict[str, int] = {}
        for file_name, config in configs.items():
            if file_name.startswith("charts/"):
                dataset_dict = dataset_info[config["dataset_uuid"]]
                ...

        # import dashboards
        for file_name, config in configs.items():
            if file_name.startswith("dashboards/"):
                ...

Unfortunately, this means that any dashboard update must also include its charts, datasets, queries, and databases, because each lookup dict is populated only from the files in the bundle.
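The failure mode can be reproduced without Superset at all. The sketch below is a simplified stand-in for the loops quoted above, not Superset's actual code: the function name import_charts_strict, the integer IDs, and the sample file names are all made up for illustration. It only shows that a lookup dict filled from the bundle raises KeyError as soon as a chart references a dataset UUID that the bundle does not contain.

```python
from typing import Any


def import_charts_strict(configs: dict[str, dict[str, Any]]) -> dict[str, int]:
    """Map each chart UUID to a dataset ID, using only what the bundle contains.

    Hypothetical stand-in for the quoted import loops; IDs are fake counters.
    """
    # First pass: record every dataset found in the bundle.
    dataset_info: dict[str, dict[str, Any]] = {}
    next_id = 1
    for file_name, config in configs.items():
        if file_name.startswith("datasets/"):
            dataset_info[config["uuid"]] = {"datasource_id": next_id}
            next_id += 1

    # Second pass: charts look their dataset up in that dict and nowhere else.
    chart_datasets: dict[str, int] = {}
    for file_name, config in configs.items():
        if file_name.startswith("charts/"):
            # KeyError here if the dataset was not part of the bundle.
            dataset_dict = dataset_info[config["dataset_uuid"]]
            chart_datasets[config["uuid"]] = dataset_dict["datasource_id"]
    return chart_datasets


# A bundle that carries a chart but not the dataset it points at.
partial_bundle = {
    "charts/sales.yaml": {"uuid": "c1", "dataset_uuid": "d1"},
}

try:
    import_charts_strict(partial_bundle)
except KeyError as exc:
    print(f"chart references a dataset missing from the bundle: {exc}")
```

Adding a "datasets/..." entry with uuid "d1" to the bundle makes the same call succeed, which is exactly the constraint described above.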

Whilst I can understand the reasoning for this (it ensures there is no drift between export and import), it does
cause problems when propagating changes across different Superset instances. It also causes problems in multi-tenant
environments where we want one update policy for charts and dashboards and a different one for datasets and databases.

A possible alternative is to introduce a new parameter to the import process, similar to overwrite. For argument's sake, let's call it sparse. When it is set to true and a dependency UUID referenced by an asset cannot be found in the zip bundle, the import would look the dependency up in the running database and use that instead.
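As a rough sketch of what that fallback could look like: the helper below resolves a dependency from the bundle first and, only when sparse is true, from the running instance. Everything here is hypothetical, including the names resolve_dependency, MissingDependencyError, and lookup_existing; in a real implementation lookup_existing would be a query by UUID against Superset's metadata database, which is replaced by a plain dict here so the example runs on its own.

```python
from typing import Any, Callable, Optional


class MissingDependencyError(Exception):
    """Raised when a UUID is found neither in the bundle nor in the instance."""


def resolve_dependency(
    uuid: str,
    imported: dict[str, Any],
    lookup_existing: Callable[[str], Optional[Any]],
    sparse: bool = False,
) -> Any:
    """Return the dependency for `uuid`, preferring what the bundle imported."""
    # Resources imported from the bundle always take precedence.
    if uuid in imported:
        return imported[uuid]
    # Proposed behaviour: fall back to what already exists in the instance.
    if sparse:
        existing = lookup_existing(uuid)
        if existing is not None:
            return existing
    raise MissingDependencyError(f"dependency {uuid} not found")


# Stand-in for the running database: datasets already present, keyed by UUID.
existing_datasets = {"d1": {"datasource_id": 42}}

# The bundle imported no datasets, but sparse mode finds "d1" in the instance.
info = resolve_dependency("d1", {}, existing_datasets.get, sparse=True)
```

With sparse left at its default of False, the same call raises MissingDependencyError, so the current strict behaviour would stay the default and only callers that opt in would get the fallback.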

Screenshots/recordings

No response

Superset version

master / latest-dev

Python version

3.9

Node version

16

Browser

Chrome

Additional context

No response

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

dosubot bot commented Dec 13, 2024

The current implementation of the /api/v1/assets/import/ endpoint in Superset requires all dependencies to be included in the import bundle. This means that when importing a dashboard, you must also include the associated charts, datasets, queries, and databases. This approach ensures consistency and prevents drift between exported and imported resources, but it can be problematic in scenarios like multi-tenant environments or when different update policies are needed for different resource types.

One potential solution to this issue is to introduce a new parameter, such as sparse, to the import process. This parameter would allow the import process to use existing resources from the running database if the required dependencies are not found in the import bundle. This would provide more flexibility in managing resources across different Superset instances and environments.

Currently, there is no built-in mechanism in Superset to handle missing dependencies in this way, but your suggestion could be a valuable enhancement to consider for future development.


@dosubot dosubot bot added api:dashboard Related to the REST endpoints of the Dashboard dashboard:import Related to importing dashboards viz:charts:import Related to importing charts labels Dec 13, 2024