Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change successfulSearchShardIndices to Set<Index> #16110

Merged
merged 3 commits into from
Oct 8, 2024

Conversation

dzane17
Copy link
Contributor

@dzane17 dzane17 commented Sep 27, 2024

Description

Change successfulSearchShardIndices to Set<Index>

Passing the entire Index object allows greater flexibility over index String names. successfulSearchShardIndices is used in the Query Insights plugin to determine query field types.

Related Issues

RFC opensearch-project/query-insights#69
PR opensearch-project/query-insights#130

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dzane17 dzane17 added the backport 2.x Backport to 2.x branch label Sep 27, 2024
Copy link
Contributor

✅ Gradle check result for 9a5dac6: SUCCESS

Copy link

codecov bot commented Sep 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.03%. Comparing base (a81b868) to head (058fb78).
Report is 7 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16110      +/-   ##
============================================
+ Coverage     71.94%   72.03%   +0.08%     
- Complexity    64612    64729     +117     
============================================
  Files          5298     5301       +3     
  Lines        301952   302355     +403     
  Branches      43627    43681      +54     
============================================
+ Hits         217247   217805     +558     
+ Misses        66884    66704     -180     
- Partials      17821    17846      +25     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

✅ Gradle check result for 3ce3c96: SUCCESS

@reta
Copy link
Collaborator

reta commented Sep 28, 2024

@dzane17 could you please provide more context behind this change: why do you need to expose the IndicesService to plugins? thank you

@reta
Copy link
Collaborator

reta commented Oct 2, 2024

@reta - How else will the QueryInsightsListener get notified of the search events, if the QueryInsightsListener is not returned as part of pluginComponents anymore.

@jainankitk that is exactly the issue that @dzane17 brought up and we may need to fix: the instances of classes returned by getGuiceServiceClasses() could implement SearchRequestOperationsListener as well but this fact is completely ignored at the moment.

@dzane17
Copy link
Contributor Author

dzane17 commented Oct 2, 2024

@reta @jainankitk I originally intended to solve this by moving the creation of SearchRequestOperationsCompositeListenerFactory to later in node.java so that we can capture listeners from both pluginComponents and pluginLifecycleComponents.

            final SearchRequestOperationsCompositeListenerFactory searchRequestOperationsCompositeListenerFactory =
                new SearchRequestOperationsCompositeListenerFactory(
                    Stream.concat(
                        Stream.of(
                            searchRequestStats,
                            searchRequestSlowLog,
                            searchTaskRequestOperationsListener,
                            queryGroupRequestOperationListener
                        ),
                        Stream.concat(
                            pluginComponents.stream()
                                .filter(SearchRequestOperationsListener.class::isInstance)
                                .map(SearchRequestOperationsListener.class::cast),
                            pluginLifecycleComponents.stream()
                                .filter(SearchRequestOperationsListener.class::isInstance)
                                .map(SearchRequestOperationsListener.class::cast)
                        )
                    ).toArray(SearchRequestOperationsListener[]::new)
                );

However, pluginLifecycleComponents & getGuiceServiceClasses() are called after searchRequestOperationsCompositeListenerFactory (and all other injectables) are already bound to the instance here.

We face a paradox where getGuiceServiceClasses() relies on dependency injection, but we are now attempting to return an object to be bound (QueryInsightsListener) from getGuiceServiceClasses().

@reta
Copy link
Collaborator

reta commented Oct 2, 2024

We face a paradox where getGuiceServiceClasses() relies on dependency injection, but we are now attempting to return an object to be bound (QueryInsightsListener) from getGuiceServiceClasses().

Oh I see, thanks @dzane17 , that's unfortunate, let me think it through a bit

@cwperks
Copy link
Member

cwperks commented Oct 2, 2024

Okay, we will need to add pluginLifecycleComponents to the SearchRequestOperationsCompositeListenerFactory list.

@jainankitk why do you need that? The QueryInsightsListener could implement the LifecycleComponent and that should be sufficient

@reta - How else will the QueryInsightsListener get notified of the search events, if the QueryInsightsListener is not returned as part of pluginComponents anymore. Below change was added to include the QueryInsightsListener for search events:

pluginComponents.stream()
                            .filter(p -> p instanceof SearchRequestOperationsListener)
                            .map(p -> (SearchRequestOperationsListener) p)

@jainankitk @dzane17 Idk if this would be helpful as well, but Security creates a SearchOperationListener within onIndexModule: https://github.com/opensearch-project/security/blob/830b341453908332e4004d519ac2e0b099dca1f6/src/main/java/org/opensearch/security/OpenSearchSecurityPlugin.java#L754-L837

The use-case for Security is Fls/Dls/FieldMasking feature which cluster administrators use to hide sensitive data stored in the clusters to users. Security must intercept search operations to apply the rules.

See also my previous comment here

@reta
Copy link
Collaborator

reta commented Oct 2, 2024

The use-case for Security is Fls/Dls/FieldMasking feature which cluster administrators use to hide sensitive data stored in the clusters to users. Security must intercept search operations to apply the rules.

Thanks @cwperks , I think onIndexModule is quite a viable alternative as well

@jainankitk
Copy link
Collaborator

The use-case for Security is Fls/Dls/FieldMasking feature which cluster administrators use to hide sensitive data stored in the clusters to users. Security must intercept search operations to apply the rules.

Thanks @cwperks , I think onIndexModule is quite a viable alternative as well

@cwperks - Please correct me if I am wrong, but onIndexModule can only be leveraged for listening shard level search events, not request level. We need the search request listener for what we are trying to achieve here! cc: @reta

@cwperks
Copy link
Member

cwperks commented Oct 3, 2024

@cwperks - Please correct me if I am wrong, but onIndexModule can only be leveraged for listening shard level search events, not request level. We need the search request listener for what we are trying to achieve here! cc: @reta

Correct, its shard-level. To my knowledge, the only request-level extension point is the handlerWrapper, but that is reserved for the security plugin to perform auth.

My knowledge of how the internals of Search works is limited, but my understanding is that a SearchRequest lands on a coordinator node that then uses the routing table to figure out where to route the internal shard search requests to. It then sends out shard search requests to various nodes and collects the results before returning the results to the client.

Is it possible to distinguish the request that lands on the coordinator node vs. the shard requests?

iirc on the transport layer its the difference between indices:data/read/search vs indices:data/read/search[s]

Edit: All transport actions can be intercepted with ActionPlugin.getActionFilters() extension point, but there are challenges resolving a generic ActionRequest -> Indices being acted upon.

@reta
Copy link
Collaborator

reta commented Oct 3, 2024

@cwperks - Please correct me if I am wrong, but onIndexModule can only be leveraged for listening shard level search events, not request level. We need the search request listener for what we are trying to achieve here! cc: @reta

Got it, thanks @jainankitk , so here is the way we could make it work, which is a bit cumbersome but solves the dependency problem, the QueryInsightsService in injected as the service class:

    @Override
    public Collection<Class<? extends LifecycleComponent>> getGuiceServiceClasses() {
        return List.of(QueryInsightsService.class);
    }

Since QueryInsightsListener is created before QueryInsightsService (and it is injectable), we could inject QueryInsightsListener right there and initialize it with the indicesService or QueryInsightsService (fe using setter) since none of this services are needed during QueryInsightsListener initialization:

@Inject
    public QueryInsightsService(
        final ClusterSettings clusterSettings,
        final ThreadPool threadPool,
        final Client client,
        final MetricsRegistry metricsRegistry,
        final NamedXContentRegistry namedXContentRegistry,
        final IndicesService indicesService,
        final QueryInsightsListener listener
    ) {
  }

@dzane17 dzane17 changed the title Adding MapperServiceProvider for plugins Change successfulSearchShardIndices to Set<Index> Oct 7, 2024
CHANGELOG.md Outdated Show resolved Hide resolved
@dzane17
Copy link
Contributor Author

dzane17 commented Oct 7, 2024

Thanks all for the input. Since indicesService does not contain mappings for ALL indices in the cluster, we have no choice but to retrieve mappings from cluster state. This involves some clunky map parsing so we will introduce a small cache in Query-Insights to limit the frequency of this field->fieldType lookup. ClusterService is already present in the plugin createComponents method so I have reduced this PR to remaining structural changes needed in core.

Signed-off-by: Ankit Jain <[email protected]>
Copy link
Contributor

github-actions bot commented Oct 8, 2024

❌ Gradle check result for a489774: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Oct 8, 2024

❌ Gradle check result for 058fb78: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@jainankitk
Copy link
Collaborator

jainankitk commented Oct 8, 2024

Seems due to caching changes:

[Test Result](https://build.ci.opensearch.org/job/gradle-check/49029/testReport/) (1 failure / ±0)

    [org.opensearch.indices.IndicesRequestCacheCleanupIT.testDynamicStalenessThresholdUpdate](https://build.ci.opensearch.org/job/gradle-check/49029/testReport/junit/org.opensearch.indices/IndicesRequestCacheCleanupIT/testDynamicStalenessThresholdUpdate/)

@sgup432 - Can you take a look at this?

Copy link
Contributor

github-actions bot commented Oct 8, 2024

✅ Gradle check result for 058fb78: SUCCESS

@jainankitk jainankitk merged commit 5279d21 into opensearch-project:main Oct 8, 2024
33 of 34 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-16110-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5279d21f8c85e78a91159af6e17e52789355f2a4
# Push it to GitHub
git push --set-upstream origin backport/backport-16110-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-16110-to-2.x.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport-failed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants