[RFC] Search User Behavior Logging and Data Reuse for Relevance #4619
Comments
I would split this into three parts:
Re "Track that original query through all steps of querying the index: user typed 1) query -> 2) rewritten query -> 3) results from OpenSearch -> 4) reranked results outside of OpenSearch -> 5) actions taken by the end users (query again, abandon search, some other high value action)."
I think I get what you mean, and it helps to refine the ideas. I agree that recording intermediate stages, along with logging done outside OpenSearch, is not the highest priority for the immediate release. There is a trade-off in how we log debugging information that we should explore further. One option is to make sure the query as it ran is recoverable, which means tracking the right versions of plugins, analyzers, indices, OpenSearch itself, the Query DSL, rerankers, etc. Once external rerankers are involved, there are more variables that we don't control. The other option is to create a scalable system for optionally logging everything, so we know what happens at every step. That gives more information, but is certainly harder to scale. Looking for more feedback and options here as well.
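To make the five stages quoted above concrete, here is a minimal sketch of a single trace record that follows one query from what the user typed to what they did next. Every name in it (`SearchTrace`, the field names, `record_action`, `abandoned`) is hypothetical and not part of any OpenSearch API. Treating "no recorded action" as abandonment is also an assumption made only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import uuid


@dataclass
class SearchTrace:
    """One user query followed through every stage (all names hypothetical)."""

    user_query: str                                   # 1) what the user typed
    query_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    rewritten_query: Optional[str] = None             # 2) query after rewriting
    engine_results: List[str] = field(default_factory=list)    # 3) doc IDs from OpenSearch
    reranked_results: List[str] = field(default_factory=list)  # 4) doc IDs after external reranking
    user_actions: List[Dict[str, str]] = field(default_factory=list)  # 5) what the user did

    def record_action(self, action: str, doc_id: Optional[str] = None) -> None:
        """Attach a user action to this query via the shared query_id."""
        event = {"query_id": self.query_id, "action": action}
        if doc_id is not None:
            event["doc_id"] = doc_id
        self.user_actions.append(event)

    def abandoned(self) -> bool:
        """Assumption for this sketch: a query with no actions counts as abandoned."""
        return not self.user_actions


trace = SearchTrace(user_query="red runing shoes")
trace.rewritten_query = "red running shoes"
trace.engine_results = ["d1", "d2", "d3"]
trace.reranked_results = ["d2", "d1", "d3"]
trace.record_action("click", "d2")
```

Carrying one `query_id` through all stages is what lets results reranked outside OpenSearch, and later user actions, be joined back to the original query.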
For now, we're focusing on collecting the data for relevance tuning. In our first stage, we're looking at end-to-end behavior (user query to user actions). In our second stage, we'll be looking at the search pipeline. In both cases, we'll be collecting data for high-level performance statistics (end-to-end latency, latency per pipeline stage), but a future enhancement could collect shard-level data.
To elaborate on "high value action (HVA - like a purchase, stream, download, or whatever the builder defines)", here are some actions/events that a user might want to track either within a session or beyond it:
The user should be able to define their own event types as well. Search systems exist in many domains, with different object types (product, document, person, company, ...) and different actions (buy, read, contact, apply for job, ...). Should we try to unify actions, e.g., "add as friend" = "purchase"? Should "add to cart from SRP" and "add to cart from detail page" be different event types, or the same event type distinguished by page type (and if so, where is the page type recorded)? Should we try to align with others' definitions of actions, e.g., Google Analytics recommended events (only some of which are relevant to search)? Is there an industry standard or convention we should be following? Can events carry additional information like "dollar value of action"? That's mostly a generic analytics issue, but even for search analytics, user behavior may differ between high-priced and low-priced items. Do we need to provide explicit support for multi-dimensional events (action=buy, pagetype=detail) or a hierarchy of events (buy is a supercategory of buy-on-srp and buy-on-detail-page)? Or should we leave this to the user?
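The two modelling options raised above (multi-dimensional events versus a hierarchy of event types) can be compared side by side in a small sketch. Nothing here is a proposed schema: `UserEvent`, `EVENT_PARENTS`, and `is_a` are hypothetical names, and the example values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Option B (hierarchy): hypothetical mapping of child event type -> parent event type.
EVENT_PARENTS: Dict[str, str] = {
    "buy-on-srp": "buy",
    "buy-on-detail-page": "buy",
}


@dataclass
class UserEvent:
    """Option A (multi-dimensional): one action plus free-form dimensions."""

    action: str                     # e.g. "buy", "read", "contact"
    object_type: str                # e.g. "product", "document", "person"
    dimensions: Dict[str, str] = field(default_factory=dict)  # e.g. {"pagetype": "detail"}
    value: Optional[float] = None   # optional "dollar value of action"


def is_a(event_type: str, ancestor: str) -> bool:
    """Walk up the hierarchy to check whether event_type falls under ancestor."""
    current: Optional[str] = event_type
    while current is not None:
        if current == ancestor:
            return True
        current = EVENT_PARENTS.get(current)
    return False


# Option A: the page type is a dimension on a single "buy" event type.
flat = UserEvent(action="buy", object_type="product",
                 dimensions={"pagetype": "detail"}, value=19.99)

# Option B: the page type is baked into the event type name.
rolls_up = is_a("buy-on-detail-page", "buy")
```

With dimensions, adding a new page type needs no schema change, while the hierarchy makes roll-ups such as "all buys" explicit. Which of these fits better is exactly the open question in this comment.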
May be somewhat related to #72
Great proposal! I believe those are very valid user stories, and adding the support described in the RFC will definitely improve visibility and the analytics experience. The proposal also overlaps with several query insights features we are building now. At a high level, in query insights we want to build a generic framework for data collection, processing, and exporting, add support for query-level recommendations, and provide query insights dashboards to give users better visibility into search performance.
We are trying to cover those use cases with the Top N Queries feature! In 2.12 we are releasing the latency-based top queries feature, and we will add more dimensions (like CPU and JVM usage) in future releases. Also, "top queries with zero results, top abandoned queries" are great use cases we can consider building into the feature as well :).
Good point! I believe finding "similar" queries and clustering them will be super useful. It would be valuable information for the user. Furthermore, with a robust query clustering method we can build query cost estimation. That will facilitate a number of other features, like query rewrite, query sandboxing, and tiered caching, since knowing how expensive a query will be is an important metric for those features.
These components are actually built in the query insights framework; it would be great if we can reuse some of them.
Agreed! We should be careful about this and thoroughly evaluate factors like feature availability, recommendation SLA, and cost when determining which component to choose for a given feature.
Related to #12084
Let's use this RFC as a point of historical reference for #12084
What/Why
What are you proposing?
Currently, there is no way for users of OpenSearch to get a full picture of how search is being used without building their own logging and metrics collection system. This is a request for comments to the community to discuss needs for a standardized logging schema & collection mechanism. We want to work with the community to understand where we can make the most impactful improvements to help the most users in understanding how search is used in their applications and how they can tune results most effectively.
We believe that application builders using OpenSearch for e-commerce, product, and document based search have a common set of needs in how they collect and expose data for analytics and reuse. Regarding analytics, we believe builders, business users, and relevance engineers want to see metrics out of the box for any search application like top queries, top queries resulting in a high value action (HVA - like a purchase, stream, download, or whatever the builder defines), top queries with zero results, top abandoned queries, as well as more advanced analytics like similar queries in the long tail that may be helped by synonyms, query rewrites/expansion or other relevance tuning techniques. This same data can also be re-used to feed manual judgement and automated learning to improve relevance in the index.
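As a rough illustration of the out-of-the-box metrics listed above, the sketch below derives four "top queries" views from a handful of log rows. The row layout (`query`, `hits`, `actions`), the `HVA` set, and the `top_queries` helper are all assumptions made for this example, not a proposed schema. In particular, "abandoned" is taken here to mean "returned results but no action followed".

```python
from collections import Counter
from typing import Any, Callable, Dict, Iterable, List, Tuple

LogRow = Dict[str, Any]  # hypothetical row: {"query": str, "hits": int, "actions": [str]}

# Whatever the builder defines as a high value action; these names are examples.
HVA = {"purchase", "stream", "download"}


def top_queries(rows: Iterable[LogRow],
                predicate: Callable[[LogRow], bool],
                n: int = 10) -> List[Tuple[str, int]]:
    """Count query strings among rows matching predicate, most frequent first."""
    counts = Counter(row["query"] for row in rows if predicate(row))
    return counts.most_common(n)


rows: List[LogRow] = [
    {"query": "laptop", "hits": 42, "actions": ["click", "purchase"]},
    {"query": "laptop", "hits": 42, "actions": []},
    {"query": "lapptop", "hits": 0, "actions": []},
    {"query": "phone", "hits": 17, "actions": ["click"]},
]

top_all = top_queries(rows, lambda r: True)
top_zero_results = top_queries(rows, lambda r: r["hits"] == 0)
top_abandoned = top_queries(rows, lambda r: r["hits"] > 0 and not r["actions"])
top_hva = top_queries(rows, lambda r: bool(HVA & set(r["actions"])))
```

Each view is the same count with a different filter, which suggests that a single standardized log record could serve all of these reports.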
What users have asked for this feature?
Highlight any research, proposals, requests, or anecdotes that signal this is the right thing to build. Include links to GitHub Issues, Forums, Stack Overflow, Twitter, etc.
What problems are you trying to solve?
Template: When <a situation arises>, a <type of user> wants to <do something>, so they can <expected outcome>. (Example: When searching by postal code, a buyer wants to be required to enter a valid code so they don’t waste time searching for a clearly invalid postal code.)
What is the developer experience going to be?
Does this have a REST API? If so, please describe the API and any impact it may have on existing APIs. In a brief summary (not a spec), highlight what new REST APIs or changes to REST APIs are planned, as well as any other API, CLI, or configuration changes that are planned as part of this feature.
Are there any security considerations?
Describe if the feature has any security considerations or impact. What is the security model of the new APIs? Features should be integrated into the OpenSearch security suite and so if they are not, we should highlight the reasons here.
Are there any breaking changes to the API?
If this feature will require breaking changes to any APIs, outline what those are and why they are needed. What is the path to minimizing impact? (For example, add a new API and deprecate the old one.)
What is the user experience going to be?
Describe the feature requirements and/or user stories. You may include low-fidelity sketches, wireframes, API stubs, or other examples of how a user would use the feature via CLI, OpenSearch Dashboards, REST API, etc. Using a bulleted list or simple diagrams to outline features is okay. If this is net new functionality, call this out as well.
Are there breaking changes to the User Experience?
Will this change the existing user experience? Will this be a breaking change from a user flow or user experience perspective?
Why should it be built? Any reason not to?
Describe the value that this feature will bring to the OpenSearch community, as well as what impact it has if it isn't built, or new risks if it is. Highlight opportunities for additional research.
What will it take to execute?
Describe what it will take to build this feature. Are there any assumptions you may be making that could limit scope or add limitations? Are there performance, cost, or technical constraints that may impact the user experience? Does this feature depend on other feature work? What additional risks are there?
Any remaining open questions?
What are known enhancements to this feature? Any enhancements that may be out of scope but that we will want to track long term? List any other open questions that may need to be answered before proceeding with an implementation.
Questions for the Community
Review & Validate this Proposal for tracking data through OpenSearch: opensearch-project/search-processor#12