At the IOOS DMAC meeting, there was talk about supporting other data types, such as tabular or forecast collections. Since I've said a few times that this should be possible and then was asked afterwards how to make it happen, here's a sketch of how to implement it for dataframes.
To give the same functionality as we currently have with datasets, we would want to allow plugins to be dataframe providers, and dataframe routers. To do that we need a plugin that both specifies the hooks that other plugins should be able to use, and mounts and serves the routers from those plugins.
The first thing to do is to define the additional dependencies that our dataframe plugins can use.
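```python
# Imports used by the snippets in this post
from typing import Callable, Iterable, Optional, Sequence

import pandas as pd
from fastapi import APIRouter, HTTPException
from pydantic import Field
from xpublish import Dependencies, Plugin, hookimpl, hookspec


class DataFrameDeps(Dependencies):
    dataframe_ids: Callable[..., list[str]] = Field(
        description="Returns a list of all valid dataframe ids"
    )
    dataframe: Callable[[str], pd.DataFrame] = Field(
        description="Returns a dataframe, using the ``/<dataframe_id>`` in the path."
    )
```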
This extended version of xpublish.Dependencies now allows other plugins to depend upon a dataframe, or a list of all dataframe IDs.
Now we create a specification for the new dataframe hooks, which can largely match the specs defined for the dataset plugin methods.
```python
class DataframePluginSpec(Plugin):
    @hookspec
    def get_dataframe_ids(self) -> Iterable[str]:
        """Return an iterable of dataframe IDs that the plugin can provide"""

    @hookspec(firstresult=True)
    def get_dataframe(self, dataframe_id: str) -> Optional[pd.DataFrame]:
        """Return a dataframe requested by dataframe_id. If the plugin does not have the dataframe, return None"""

    @hookspec
    def dataframe_router(self, deps: DataFrameDeps) -> APIRouter:
        """Return an API router that can work with Dataframes"""
```
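If the difference between a plain hookspec and a firstresult=True hookspec isn't obvious, here's a toy sketch of it using only the standard library. This is not how pluggy is implemented (call ordering in particular is simplified), plain dicts stand in for dataframes, and the provider classes plus `call_all` and `call_first` are names I made up for illustration.

```python
from typing import Any, Optional


class StationProvider:
    """Toy provider: plain dicts stand in for pandas DataFrames."""

    def get_dataframe_ids(self) -> list[str]:
        return ["stations"]

    def get_dataframe(self, dataframe_id: str) -> Optional[dict]:
        if dataframe_id == "stations":
            return {"station": ["A01", "B01"]}
        return None  # not ours, let another provider answer


class GliderProvider:
    """Second toy provider, so there is more than one answer to collect."""

    def get_dataframe_ids(self) -> list[str]:
        return ["gliders"]

    def get_dataframe(self, dataframe_id: str) -> Optional[dict]:
        if dataframe_id == "gliders":
            return {"glider": ["unit_123"]}
        return None


def call_all(plugins: list, hook_name: str, **kwargs: Any) -> list:
    """Mimic a plain hookspec: collect every non-None result in a list."""
    results = [getattr(plugin, hook_name)(**kwargs) for plugin in plugins]
    return [result for result in results if result is not None]


def call_first(plugins: list, hook_name: str, **kwargs: Any) -> Any:
    """Mimic hookspec(firstresult=True): stop at the first non-None result."""
    for plugin in plugins:
        result = getattr(plugin, hook_name)(**kwargs)
        if result is not None:
            return result
    return None


plugins = [StationProvider(), GliderProvider()]

# A plain hook gives back one list of IDs per provider, so the caller flattens.
id_lists = call_all(plugins, "get_dataframe_ids")
all_ids = [df_id for ids in id_lists for df_id in ids]

# A firstresult hook gives back a single value, or None if nobody has it.
gliders = call_first(plugins, "get_dataframe", dataframe_id="gliders")
missing = call_first(plugins, "get_dataframe", dataframe_id="nope")
```

So get_dataframe_ids hands the caller a list of iterables to merge, while get_dataframe hands back one dataframe or None, which the caller can turn into a 404.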
Now that we have those, we create our plugin. The first method that needs to be implemented is register_hookspec(), which needs to return the spec we defined above. (In theory we have docs on this, but largely it's an 'I'll get to it later'.) This tells the plugin system what methods other plugins can implement, and allows us to really start extending Xpublish without bringing new things into core.
Then we create an app_router method that both can list all dataframes, and pull in other plugins for both loading dataframes and adding new routes for them.
```python
class DataframePlugin(Plugin):
    ...
    app_router_prefix: str = "/dataframes"
    app_router_tags: Sequence[str] = ["dataframe"]
    ...

    @hookimpl
    def app_router(self, deps: Dependencies):
        router = APIRouter(
            prefix=self.app_router_prefix,
            tags=self.app_router_tags
        )

        def get_dataframe_ids():
            """Return the known dataframe IDs from all dataframe provider plugins"""
            df_ids = []
            for new_ids in deps.plugin_manager().hook.get_dataframe_ids():
                df_ids.extend(new_ids)
            return df_ids

        def get_dataframe(dataframe_id: str):
            """Returns a dataframe from dataframe provider plugins"""
            df = deps.plugin_manager().hook.get_dataframe(dataframe_id=dataframe_id)
            if df is not None:
                return df
            raise HTTPException(
                status_code=404,
                detail=f"Dataframe {dataframe_id} not found."
            )

        @router.get("/")
        def dataframe_ids():
            """Returns known dataframe IDs"""
            return get_dataframe_ids()

        df_deps = DataFrameDeps(
            **deps.model_dump(),
            dataframe=get_dataframe,
            dataframe_ids=get_dataframe_ids
        )

        for new_router in deps.plugin_manager().hook.dataframe_router(deps=df_deps):
            router.include_router(new_router, prefix="/{dataframe_id}")

        return router
```
Within our app_router we start by defining functions to get a dataframe and to get the dataframe IDs. These directly access deps.plugin_manager().hook. This is the same pattern that the core of xpublish uses to provide deps.dataset_ids and deps.dataset.
Then we create a route to return all dataframe IDs.
Next we build our new dataframe dependencies, with the existing deps, and the addition of our two new dependencies.
Then we play with the plugin_manager again, and ask it for all the dataframe_router implementations, passing each one the new dataframe dependencies. Each router we get back is included in the app router we're building, under the /{dataframe_id} prefix.
Now for some example plugins
A dataframe provider.
And a CSV router
Hopefully this gives some folks something to start from to explore adding new data types to Xpublish.