Extend the netcdf API to support programmatic changes to the plugin search path #3024
re: Unidata#2753

As suggested by Ed Hartnett, this PR extends the netcdf.h API to support programmatic control over the search path used to locate plugins.

I tried several different APIs, but finally settled on the following one as the simplest possible. Its disadvantage is that it requires a global lock (not implemented) if used in a threaded environment.

Specifically, note that modifying the plugin paths must be done "atomically". That is, in a multi-threaded environment, the sequence of actions involved in setting up the plugin paths must be performed by a single thread, or in some other way that guarantees that two or more threads are not simultaneously executing the plugin path read/write operations.

As an example, assume there exists a mutex lock called PLUGINLOCK. Then any thread accessing the plugin paths should operate as follows:

````
lock(PLUGINLOCK);
nc_plugin_path_read(...);
<rebuild plugin path>
nc_plugin_path_write(...);
unlock(PLUGINLOCK);
````

The API proposed in this PR looks like this (from netcdf-c/include/netcdf_filter.h).

* ````int nc_plugin_path_read(int formatx, size_t* ndirsp, char** dirs);````

  This function returns the current sequence of directories in the internal plugin path list. Since this function does not modify the plugin path, it can be called at any time.

  The arguments are as follows:
  - _formatx_ specifies which dispatch implementation to read: currently NC_FORMATX_NC_HDF5 or NC_FORMATX_NCZARR.
  - _ndirsp_ returns the number of directories in the internal path list.
  - _dirs_ is memory for storing the sequence of directories in the internal path list.

  In practice, this function needs to be called twice: the first time with _ndirsp_ not NULL and _dirs_ set to NULL to get the size of the path list, and the second time with _dirs_ not NULL to get the actual sequence of paths.
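The two-call pattern described for nc_plugin_path_read can be sketched in C roughly as follows. This is a minimal sketch, not the PR's code: nc_plugin_path_read is replaced here by an in-memory stand-in with the proposed signature so that the example compiles without a netCDF build containing this PR. The directory names, the helpers get_plugin_dirs and free_plugin_dirs, the placeholder constant values, and the rule that the caller frees each returned string are all assumptions that the description does not confirm.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NC_NOERR 0
#define NC_FORMATX_NC_HDF5 2 /* placeholder value for this sketch */

/* Hypothetical contents of the internal path list. */
static const char* const fake_path[] = {"/usr/local/hdf5/lib/plugin", "/opt/filters"};
static const size_t fake_ndirs = 2;

/* Stand-in with the signature proposed in the PR, NOT the library's code.
   Assumption: each returned string is a copy that the caller must free. */
int nc_plugin_path_read(int formatx, size_t* ndirsp, char** dirs)
{
    size_t i;
    (void)formatx;
    if (ndirsp != NULL) *ndirsp = fake_ndirs;
    for (i = 0; dirs != NULL && i < fake_ndirs; i++) {
        size_t len = strlen(fake_path[i]) + 1;
        if ((dirs[i] = malloc(len)) == NULL) return -1;
        memcpy(dirs[i], fake_path[i], len);
    }
    return NC_NOERR;
}

/* Release a vector obtained from get_plugin_dirs(). */
void free_plugin_dirs(size_t ndirs, char** dirs)
{
    size_t i;
    for (i = 0; dirs != NULL && i < ndirs; i++) free(dirs[i]);
    free(dirs);
}

/* Hypothetical helper: query the count, allocate, then fetch the entries. */
int get_plugin_dirs(int formatx, size_t* ndirsp, char*** dirsp)
{
    size_t ndirs = 0;
    char** dirs = NULL;
    int stat;

    assert(ndirsp != NULL && dirsp != NULL);
    stat = nc_plugin_path_read(formatx, &ndirs, NULL);      /* call 1: count only */
    if (stat != NC_NOERR) return stat;
    if (ndirs > 0) {
        dirs = calloc(ndirs, sizeof(char*));                /* zeroed, safe to free */
        if (dirs == NULL) return -1;
        stat = nc_plugin_path_read(formatx, &ndirs, dirs);  /* call 2: the entries */
        if (stat != NC_NOERR) { free_plugin_dirs(ndirs, dirs); return stat; }
    }
    *ndirsp = ndirs;
    *dirsp = dirs;
    return NC_NOERR;
}
```

A caller would invoke get_plugin_dirs(NC_FORMATX_NC_HDF5, &n, &dirs), iterate over the n entries, and then release them with free_plugin_dirs(n, dirs). In a threaded program the two calls should sit inside the same locked region, since the list could otherwise change between the size query and the fetch.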
* ````int nc_plugin_path_write(int formatx, size_t ndirs, char** const dirs);````

  This function empties the current internal path sequence and replaces it with the sequence of directories given in the arguments. A _dirs_ argument of NULL or an _ndirs_ argument of 0 clears the set of plugin paths.

  The arguments are as follows:
  - _formatx_ specifies which dispatch implementation to write: currently NC_FORMATX_NC_HDF5, NC_FORMATX_NCZARR, or 0 (zero).
  - _ndirs_ is the length of the _dirs_ argument.
  - _dirs_ is a vector of directory path strings used to overwrite the current internal path list.

  If the value zero is used for the _formatx_ argument, then the value being written is applied to all implementations: currently NC_FORMATX_NC_HDF5 and NC_FORMATX_NCZARR.

In addition, two other API functions are defined.

````
int nc_plugin_path_initialize(void);
int nc_plugin_path_finalize(void);
````

As a rule, the initialize and finalize functions do not need to be explicitly called by the user because they are called as part of *nc_initialize()/nc_finalize()*.

In addition to the above changes, add a plugin path testcase: unit_tests/run_pluginpaths.sh+tst_pluginpaths.c.

## Misc. Changes
1. Added a version number for the formatx dispatcher.
2. Set up a per-dispatcher global state mechanism.
3. Added some path manipulation utilities to netcdf_aux.h.
4. Fixed the construction of netcdf_json.h as a BUILT_SOURCE.
5. Fixed some minor bugs in netcdf_json.h.
6. Fixed the construction of netcdf_proplist.h as a BUILT_SOURCE.
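Returning to the read and write functions described above, the locked read, rebuild, write sequence could look roughly like the following in C. This is a sketch under stated assumptions rather than the PR's implementation: both API functions are in-memory stand-ins that only mimic the proposed signatures, PLUGINLOCK is modeled as a POSIX mutex (the PR itself implements no lock), and the helper append_plugin_dir, the MAXDIRS limit, the placeholder constant values, and the copy-and-free ownership of strings are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NC_NOERR 0
#define NC_FORMATX_NC_HDF5 2 /* placeholder value for this sketch */
#define MAXDIRS 16           /* arbitrary limit of the stand-in */

/* In-memory stand-in for the internal path list, NOT the library's code. */
static char* g_dirs[MAXDIRS];
static size_t g_ndirs = 0;

static char* dup_string(const char* s)
{
    size_t len = strlen(s) + 1;
    char* copy = malloc(len);
    if (copy != NULL) memcpy(copy, s, len);
    return copy;
}

/* Stand-in for the proposed read; assumed to return copies the caller frees. */
int nc_plugin_path_read(int formatx, size_t* ndirsp, char** dirs)
{
    size_t i;
    (void)formatx;
    if (ndirsp != NULL) *ndirsp = g_ndirs;
    for (i = 0; dirs != NULL && i < g_ndirs; i++)
        if ((dirs[i] = dup_string(g_dirs[i])) == NULL) return -1;
    return NC_NOERR;
}

/* Stand-in for the proposed write; replaces the whole list with copies. */
int nc_plugin_path_write(int formatx, size_t ndirs, char** const dirs)
{
    size_t i;
    (void)formatx;
    if (dirs == NULL) ndirs = 0;          /* NULL or 0 clears the list */
    if (ndirs > MAXDIRS) return -1;
    for (i = 0; i < g_ndirs; i++) free(g_dirs[i]);
    g_ndirs = 0;
    for (i = 0; i < ndirs; i++) {
        if ((g_dirs[i] = dup_string(dirs[i])) == NULL) return -1;
        g_ndirs++;
    }
    return NC_NOERR;
}

/* The PLUGINLOCK of the description, modeled here as a pthread mutex. */
static pthread_mutex_t PLUGINLOCK = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: append newdir as one locked read-rebuild-write step. */
int append_plugin_dir(int formatx, const char* newdir)
{
    size_t i, ndirs = 0;
    char** dirs = NULL;
    int stat;

    assert(newdir != NULL);
    pthread_mutex_lock(&PLUGINLOCK);
    stat = nc_plugin_path_read(formatx, &ndirs, NULL);      /* count only */
    if (stat == NC_NOERR) {
        dirs = calloc(ndirs + 1, sizeof(char*));            /* one extra slot */
        if (dirs == NULL) stat = -1;
    }
    if (stat == NC_NOERR)
        stat = nc_plugin_path_read(formatx, &ndirs, dirs);  /* current entries */
    if (stat == NC_NOERR) {
        dirs[ndirs] = dup_string(newdir);                   /* rebuild the path */
        stat = (dirs[ndirs] == NULL) ? -1
             : nc_plugin_path_write(formatx, ndirs + 1, dirs);
    }
    pthread_mutex_unlock(&PLUGINLOCK);

    for (i = 0; dirs != NULL && i <= ndirs; i++) free(dirs[i]);
    free(dirs);
    return stat;
}
```

Holding the lock across both read calls and the write is the point of the sketch: without it, another thread could change the list between the size query and the final write, and one of the two updates would be lost.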
Dennis Heimbigner seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
Excellent. Some feedback:
I really like your simplification of the API to just two functions. Elegant. This is a good addition to help netCDF users cope with some of the complexities of plugins. It will help not just zstandard users, but also more advanced users who are already using plugins. (I know the European Space Agency is doing this to get JPEG compression of some satellite data.) All these advanced users will also find it much more useful to set the plugin path in code, instead of relying on an environment variable being set correctly.
All reasonable changes.
In thinking about this API, it occurs to me that users should not be forced to figure
I think the capability for different plugin paths is marginal. That is, I don't know if anyone would really need that. (Why would they? So that a different filter with the same number could be used? That doesn't sound like a good idea anyway.) So I think it's fine if you want to have one plugin path for every dispatch layer to use. I also think it's fine to have the user specify the dispatch layer. Since this is a capability only for very advanced users, I assume they could handle it. So either approach seems acceptable to me.