Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding provision to invoke stop replication from other plugins #1391

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

aggarwalShivani
Copy link

@aggarwalShivani aggarwalShivani commented Jun 5, 2024

Description

This is related to the feature unfollow-action #726 and is the 2nd PR in continuation of the proposed solution mentioned in common-utils-PR. Detailed information is explained there.

Change description:

  1. Removing the classes StopIndexReplicationAction and StopIndexReplicationRequest from the ccr project and import StopIndexReplicationRequest and ReplicationActions from common-utils instead.
  2. Creating a new transport action, TransportUnfollowIndexReplicationAction - of type HandledTransportAction. This reads the request of type ActionRequest, transforms it to type StopIndexReplicationRequest, and invokes the TransportStopIndexReplicationAction.execute() on it.
  3. Adding a new ActionHandler to invoke TransportUnfollowIndexReplicationAction (for inter-plugin communication) , in addition to the one for TransportStopIndexReplicationAction (for REST API).

Issues Resolved

Related Issues
unfollow-action #726

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ankitkala
Copy link
Member

Also tagging @saikaranam-amazon for review

@bowenlan-amzn
Copy link
Member

Here's the background of making these changes to enable calling action from another plugin
opensearch-project/notifications#223

} catch (e: Exception) {
log.error("Stop replication failed for index[${transformedRequest.indexName}] with error ${e.stackTraceToString()}")
listener?.onFailure(e)
throw e
Copy link
Member

@bowenlan-amzn bowenlan-amzn Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably no need to rethrow the exception here, as the exception is sent back to the caller to handle?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. I had corrected that change in my local, but missed pushing it. My bad!

@ankitkala
Copy link
Member

I've some concerns around moving the classes out of cross cluster replication package.
Ideally either of these 2 options should solve the problem:
Option 1: Move the ISM action into ccr plugin. From what i understand, ISM actions can be implemented as an plugin. CCR can take a dependency on ISM and implement the stop action in this repo.
Option 2: ISM plugin can take a dependency on CCR and reuse the classes. (We do same for security plugin.)

@bowenlan-amzn
Copy link
Member

@Ankit Kala commented on Jul 15, 2024, 7:06 AM PDT:

I've some concerns around moving the classes out of cross cluster replication package.
Ideally either of these 2 options should solve the problem:
Option 1: Move the ISM action into ccr plugin. From what i understand, ISM actions can be implemented as an plugin. CCR can take a dependency on ISM and implement the stop action in this repo.
Option 2: ISM plugin can take a dependency on CCR and reuse the classes. (We do same for security plugin.)

We didn't try Option 1 and it looks promising, and we may not need a proxy transport class then since ISM base doesn't have to know anything about CCR.

@aggarwalShivani some quick notes for trying it out
ISM has this interface, https://github.com/opensearch-project/index-management/blob/dbd2bc25f5e338819237af5a6b35c6c1e35d7171/spi/src/main/kotlin/org.opensearch.indexmanagement.spi/IndexManagementExtension.kt#L16.
You can implement that in ReplicationPlugin, and provide the getISMActionParsers
In build.gradle, take a compileOnly dependency on opensearch-index-management-spi. Probably need extendedPlugins = ['opensearch-index-management'] in opensearchplugin {} block
And a resources/META-INF/services/org.opensearch.indexmanagement.spi.IndexManagementExtension file that have org.opensearch.replication.ReplicationPlugin in it for the SPI mechanism

@bowenlan-amzn
Copy link
Member

bowenlan-amzn commented Jul 18, 2024

@aggarwalShivani

Have a PR to publish ISM spi so it can be consumed from central repo
opensearch-project/index-management#1207

And here's the commit of extend ISM in CCR
3853901

I tested using your script and can see everything works as expected it actually cannot work out of box, seeing lots of jar hell issue.


@ankitkala I tried the method of using ISM spi, but seems still have the same problem

If we extend ReplicationPlugin with ISM spi, that means CCR can only be installed after ISM installed (a hard dependency introduced), and there are a lot of jar hells, probably not what we want here.

If we create a new plugin in CCR, like CCR-ISM-extension plugin, it probably will also suffer the same problem that one plugin cannot use the TransportAction from another plugin directly. As plugin has their own classloaders, same class become different after being loaded by different classloaders.

So I think it's better to stick with the current approach now.
For your concern

I've some concerns around moving the classes out of cross cluster replication package.

It seems to be a 2 way door, moving classes around probably won't introduce any issues between versions. But let me know any concerns.

@aggarwalShivani
Copy link
Author

aggarwalShivani commented Jul 19, 2024

Hi @ankitkala ,
Just in case you didn't get a notification for latest reponse by @bowenlan-amzn (as I didn't too)....

We see issues with Option 1 ->

If we extend ReplicationPlugin with ISM spi, that means CCR can only be installed after ISM installed (a hard dependency introduced), and there are a lot of jar hells, probably not what we want here.
...
If we create a new plugin in CCR, like CCR-ISM-extension plugin, it probably will also suffer the same problem that one plugin cannot use the TransportAction from another plugin directly. As plugin has their own classloaders, same class become different after being loaded by different classloaders.

Even with option 2, it would reverse the problem on ism. And also, if we reuse classes from one plugin in another, we would see class-loading issue i.e. plugins have their own classloaders, same class become different after being loaded by different classloaders. (explained in opensearch-project/notifications#223 as well)

For your concern

I've some concerns around moving the classes out of cross cluster replication package.

To add in addition to Bowen's comment, in our proposed approach, we've ensured that the current REST API users of CCR would not get impacted functionality-wise by any change. '
And other plugins like notifications, alerting also have some of their classes in common-utils..
Pls let us know your thoughts on this.

@skumarp7 skumarp7 deleted the commonutils-migr branch July 23, 2024 16:17
@skumarp7 skumarp7 restored the commonutils-migr branch July 23, 2024 16:26
@ankitkala
Copy link
Member

ankitkala commented Jul 24, 2024

@ankitkala I tried the method of using ISM spi, but seems still have the same problem

If we extend ReplicationPlugin with ISM spi, that means CCR can only be installed after ISM installed (a hard dependency introduced), and there are a lot of jar hells, probably not what we want here.

I didn't get this part. All the SPI classes are getting loaded in Opensearch process already, we should just be having it as compile time dependency right?

If we create a new plugin in CCR, like CCR-ISM-extension plugin, it probably will also suffer the same problem that one plugin cannot use the TransportAction from another plugin directly. As plugin has their own classloaders, same class become different after being loaded by different classloaders.

I went through the issue. I still have questions around why we'd want to load same class into JVM twice through different classloaders.

So I think it's better to stick with the current approach now. For your concern

I've some concerns around moving the classes out of cross cluster replication package.

It seems to be a 2 way door, moving classes around probably won't introduce any issues between versions. But let me know any concerns.

I definitely understand the issues with the integration and if we really need to move classes out to common utils, we'll do that. But before we do that, i just want to ensure that we've exhausted all options here :-)

@aggarwalShivani
Copy link
Author

Hi @ankitkala , Thanks for your feedback..

I went through the issue. I still have questions around why we'd want to load same class into JVM twice through different classloaders.

I'll try to answer this from my understanding and @bowenlan-amzn could add-on/correct me as needed :)

In the notifications example as well as in the case of this PR, the flow is - ism plugin needs to invoke actions from other plugins, for ex. SendNotificationAction or TransportStopIndexReplicationAction. These are transport actions in their respective plugins, which accept request of fixed type like StopIndexReplicationRequest.

To invoke these transport action, ism would need to first construct request object of the required type (ex. StopIndexReplicationRequest), invoke the action from the other plugin and finally expect the response of type returned by that action i.e. the request class object is created in ism, but it has to be used in ccr code.

OpenSearch loads each plugin by different class loader separately, which means the same class StopIndexReplicationRequest would have to be recreated/reloaded into ccr's classloader - which causes issues. Same case with response classes too.

@ankitkala
Copy link
Member

have to be recreated/reloaded into ccr's classloader

Hey @aggarwalShivani, what i don't understand is why we need to load the ccr classes at runtime. Compile time dependency makes sense, but at runtime, these classes are already loaded into the opensearch jvm(even if its through a different classloader).

@bowenlan-amzn
Copy link
Member

bowenlan-amzn commented Aug 5, 2024

@Ankit Kala commented on Jul 29, 2024, 8:23 PM PDT:

have to be recreated/reloaded into ccr's classloader

Hey @aggarwalShivani, what i don't understand is why we need to load the ccr classes at runtime. Compile time dependency makes sense, but at runtime, these classes are already loaded into the opensearch jvm(even if its through a different classloader).

Valid question.
What you are suggesting is to take the stop replication request from a compile dependency of ISM. And we should have that request class at runtime if CCR plugin is also installed in OpenSearch.

@ankitkala can your team help our contributor here to enable this workflow you are thinking? I'm not sure if CCR is publishing any jar for now.

Note, based on how class loaded in OpenSearch, we will need the CCR jar load from opensearch core I guess, otherwise it probably not recognizable from other plugin as it's loaded by CCR plugin classloader?

@bowenlan-amzn
Copy link
Member

bowenlan-amzn commented Aug 5, 2024

@Ankit Kala commented on Jul 24, 2024, 8:45 AM PDT:

@ankitkala I tried the method of using ISM spi, but seems still have the same problem

If we extend ReplicationPlugin with ISM spi, that means CCR can only be installed after ISM installed (a hard dependency introduced), and there are a lot of jar hells, probably not what we want here.

I didn't get this part. All the SPI classes are getting loaded in Opensearch process already, we should just be having it as compile time dependency right?

If we define CCR as extended plugin of ISM (refer here), when we install the plugin, it hit this check of jarhell
https://github.com/opensearch-project/OpenSearch/blob/05fb780a3aec3bfb46c0cb76c65e340b11d42def/server/src/main/java/org/opensearch/plugins/PluginsService.java#L679

It requires that the extension plugin CCR to not have any duplicate dependencies as its extended plugin ISM.

@aggarwalShivani
Copy link
Author

@ankitkala can your team help our contributor here to enable this workflow you are thinking? I'm not sure if CCR is publishing any jar for now.

Note, based on how class loaded in OpenSearch, we will need the CCR jar load from opensearch core I guess, otherwise it probably not recognizable from other plugin as it's loaded by CCR plugin classloader?

Hi @ankitkala,
Pls let us know if you have any feedback/views on this 🙂

@ankitkala
Copy link
Member

@ankitkala can your team help our contributor here to enable this workflow you are thinking? I'm not sure if CCR is publishing any jar for now.
Note, based on how class loaded in OpenSearch, we will need the CCR jar load from opensearch core I guess, otherwise it probably not recognizable from other plugin as it's loaded by CCR plugin classloader?

Hi @ankitkala, Pls let us know if you have any feedback/views on this 🙂

Hi, unfortunately we don't have bandwidth to pick this up.
In spirit of moving forward, let's go ahead with the proposed changes.

Copy link
Member

@ankitkala ankitkala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Few minor comments.
Also, please ensure that we get a clean build by merging the common utils changes followed by rerunning the tests in this PR again.

@@ -68,7 +70,7 @@ class TransportStopIndexReplicationAction @Inject constructor(transportService:
IndexNameExpressionResolver,
val client: Client,
val replicationMetadataManager: ReplicationMetadataManager) :
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment here explaining that the stop replication action class in moved to common utils and link this PR here?

* While TransportStopIndexReplicationAction is used directly by the _stop replication REST API,
* this action is used for inter-plugin communication by ism plugin to unfollow i.e. stop replication.
*/
class TransportUnfollowIndexReplicationAction @Inject constructor (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's rename is to something like TransportIsmStopIndexReplicationAction. Basically remove unfollow and add ISM in action name to distinct it from other supported actions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thanks. Suggestions were given in the ISM PR to name this class as InternalTransportStopIndexReplicationAction and the action name as "indices:internal/plugins/replication/index/stop", so the name is differentiated to be something internally invoked from plugins. We hadn't added the name ism, so it remains generic to be used by something else too in future.
I hope these names shall be okay?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TransportInternalStopIndexReplicationAction should be fine

@aggarwalShivani
Copy link
Author

Hi @ankitkala , @bowenlan-amzn
Apologies for the delayed update here. I was assigned different work responsibilities for a few months and couldn't proceed with this feature 😟

I'm back now and trying to revive this change request 🤞 There have been multiple changes in the three projects in this while, and I will have to validate my changes on the latest.

One issue I'm facing is ->
Common-utils and ISM projects have now updated to the base-line jdk to 21 from 11.
But CCR project is still on jdk11 (open issue).
After importing common-utils (built on jdk21) in ccr project (jdk 11), I'm getting this compile-time errors.
Cannot inline bytecode built with JVM bytecode version 65 into bytecode that is being built with JVM target 11. Please specify proper '-jvm-target' option

This basically happens when there is a mismatch of jdk in project and its dependency.
Pls suggest on how we can solve this. Also, can you help with when jdk21 update is planned on ccr?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants