-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DPE-4196] Plugin Management Refactor #435
base: main
Are you sure you want to change the base?
Conversation
…run_cmd on Keystore class
…er-change-before-started-set
def _on_s3_relation_broken(self, event: EventBase) -> None: | ||
"""Processes the non-orchestrator cluster events.""" | ||
self.charm.status.clear(S3RelMissing) | ||
if self.charm.unit.is_leader(): | ||
self.charm.status.clear(S3RelShouldNotExist, app=True) | ||
logger.info("Non-orchestrator cluster, abandon s3 relation event") | ||
return | ||
|
||
def _on_s3_relation_action(self, event: EventBase) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to the base class
self.charm.status.set(BlockedStatus(PluginConfigError)) | ||
# There was an unexpected error, log it and block the unit | ||
logger.error(e) | ||
self.charm.status.set(BlockedStatus(PluginConfigError)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason to separate the exceptions here
raise OpenSearchError() | ||
# Will have a format similar to: | ||
# Version: 2.14.0, Build: tar/.../2024-05-27T21:17:37.476666822Z, JVM: 21.0.2 | ||
output = self.run_bin("opensearch", "--version 2>/dev/null") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avoid running through the API call early in the plugin setup routine
@@ -522,11 +581,12 @@ async def test_knn_training_search(ops_test: OpsTest, deploy_type: str) -> None: | |||
await _wait_for_units(ops_test, deploy_type) | |||
|
|||
# Now use it to compare with the restart | |||
assert await is_each_unit_restarted(ops_test, APP_NAME, ts) | |||
assert not await is_each_unit_restarted(ops_test, APP_NAME, ts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are now making sure the plugin manager did not trigger a restart
HealthColors.GREEN, | ||
HealthColors.IGNORE, | ||
]: | ||
event.defer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This defer brings no real benefit, as another update-status
will forcefully happen.
def _on_s3_relation_action(self, event: EventBase) -> None: | ||
"""No deployment description yet, fail any actions.""" | ||
logger.info("Deployment description not yet available, failing actions.") | ||
event.fail("Failed: deployment description not yet available") | ||
|
||
def _request(self, *args, **kwargs) -> dict[str, Any] | None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Closes #244: remove proxy method
from cryptography import x509 | ||
from cryptography.hazmat.primitives import hashes, serialization | ||
from cryptography.hazmat.primitives.asymmetric import rsa | ||
from cryptography.x509.oid import NameOID | ||
|
||
|
||
def patch_wait_fixed() -> Callable: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speeds up tenacity by replacing the wait for a smaller range one
…s if they are internal
if self.charm.unit.is_leader(): | ||
self.charm.status.clear(BackupSetupFailed, app=True) | ||
self.charm.status.set(BlockedStatus(BackupSetupFailed)) | ||
self.charm.status.set(BlockedStatus(BackupSetupFailed), app=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following conversation on: https://chat.canonical.com/canonical/pl/oqwsfdejufgepettcqzobnw7xh
It is neither guaranteed nor wanted by the entire team to support a: all units in a given status -> app status updated automatically
@@ -34,46 +34,25 @@ class OpenSearchKeystoreError(OpenSearchError): | |||
"""Exception thrown when an opensearch keystore is invalid.""" | |||
|
|||
|
|||
class OpenSearchKeystoreNotReadyYetError(OpenSearchKeystoreError): | |||
"""Exception thrown when the keystore is not ready yet.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error is thrown when we try to reach out to the to reload the keys via API and it fails.
Cherry pick from a PR originally merged on main Sync charm docs from https://discourse.charmhub.io + CA rotation tests break down --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: a-velasco <[email protected]>
Sync charm docs from https://discourse.charmhub.io Co-authored-by: a-velasco <[email protected]>
) Updates the data platform workflow to the latest version: 22.0.0. Breaks down the integration tests between 3.6/beta for nightly run + 3.5 for PR / merges to `2/edge` branch. --------- Co-authored-by: Carl Csaposs <[email protected]>
…python-3-12' into DPE-4196-improve-plugin-manager
The main goal is to: (1) separate API call (/_cluster) between plugin-specific calls and cluster settings; (2) curb the requirements for restart for any plugin changes; (3) get a faster response time using cached entries wherever possible; (4) add new use-case with secret handling within the plugin logic itself; and (5) define models so we can standardize the plugin data exchanged between the objects and via relation.
Dev experience
The idea is to make it easier to add management for separated plugins without impacting the rest of the code. A dev willing to add a new plugin must decide: do we need to manage a relation or just config options on the charm?
If not, then we can add a config-only plugin
If yes, then we will need new a new object to encapsulate the plugin handling: the "DataProvider"
The config-only plugin
These are plugin configured via config options. In this case, it is only needed to add a new OpenSearchPlugin -child class that manages the config options to be added or removed from the cluster.
For example, opensearch-knn receives the config from the charm and returns the options to be set in the opensearch.yml.
The relation-based plugins
These plugins are more elaborate, as they have to process events specific to the given plugin. We also must consider the case of large deployments, where data may come via dedicated relation or the peer-cluster relation.
These plugins should be managed by a separate entity, named the relation manager. Defining a common structure for the relation manager is outside of the scope of this PR.
For example, repository-s3 and OpenSearch backup.
New Plugin Manager Infra
Now, the plugin manager is able to manage plugins that depend on config options, API calls and secrets. Whenever adding a new plugin, we should consider:
opensearch_plugins.py: this plugin should have a representation that is consumable by plugin_manager; it should be composed of all the configurations and keys to be added or removed to the cluster's main configuration
opensearch_plugin_manager.py: add the new plugin to the plugin dict; the manager must be able to instantiate this new plugin
opensearch_{plugin-name}.py: if the plugin is managed by a given relation, this lib will implement the relation manager and interface with OpenSearch's plugin-specific APIs
models.py: add any relation data model to this lib
Using the new plugin data provider
While specific classes takes care of the plugin's API calls (e.g. /_snapshot API for the backup plugin is done by OpenSearchBackup class), the data provider facilitates the exchange of relation data between the specific class and the plugin_manager itself. This way, the plugin manager can apply any cluster-wide configurations that are needed for that plugin.
We need a class to do deal with relation specifics as some plugins may expect different relations depending on their deployment description, e.g. OpenSearchBackupPlugin. The OpenSearchPluginDataProvider encapsulates that logic away from the main plugin classes.
Secret Management
Each plugin that handles the specific secrets must implement the secret management logic in its operation. The goal is to avoid filling the opensearch_secrets.py methods with ifs for each plugin case and separating / isolating each plugin code.
Remove unneeded restarts and add caching
We ensure that any configuration changes that come from plugin management are applied via API before being persisted on config files. If the API responds with a 200 status, then we should only write the new value to the configuration and finish without a need for restart.
In case the service is down and API is not available, we can assume we will eventually start the service back up. In this case, it suffices to write the config entries to the files and leave to the next start to pick them up.
This task is going to be divided into 3x parts:
Addresses low ranging fruits where we reduce the number of restarts and add caching support
Three main actions: (i) Merge {add,delete}_plugin together and its equivalents in OpenSearchPluginConfig class; (ii) we receive one big dictionary where a key: None means we want to simply delete that entry; and (iii) the main OpenSearchKeystore must observe secret changes and update its values accordingly
Returns unit tests: this is going to be commented out whilst Parts 1 and 2 happen, given this part of the code was covered with extensive testing
The current implementation of plugin_manager.run waits for the cluster to be started before processing its config changed. We relax this demand and open to the option where the cluster is not yet ready, so we can modify the configuration without issuing a restart request.
#252 is closed with OpenSearchPluginRelationsHandler interface. It allows plugins to define how they will handle its relation(s). opensearch_backup module extends this interface and defines a checker to process either small or large deployments details.
Other relevant changes:
Closes #252, #280, #244