Refactor auth filter to use temporary search pipeline #218
Hybrid queries cannot be nested inside other queries (such as `bool` queries). Instead, we should use a search pipeline to add a `request_processors[].filter_query` processor to the query. Our `dc-v2-work-pipeline` already does this, limiting the search to published results with a visibility of public or institution. This PR updates the search preprocessor to respect the search pipeline (if provided), or to add one on the fly if not. This is safe to do in this context because:

This PR also strips out any `search_pipeline` stanza included in the query DSL to prevent users from circumventing the built-in restrictions.

In the future, we may want to create multiple saved search pipelines to handle each of our four use cases (anonymous/institution search, superuser search, reading room search, and hybrid search) and simply update the `search_pipeline` query param on the way to OpenSearch. But this is our best option for now.

To test, run the API using `sam local start-api` and then use Postman to post some search requests.

Set the `READING_ROOM_IPS` server environment variable to match your own IP to make sure you get private, published works back.

Remember that you may have to narrow your search or increase the `size` in the search DSL to make sure you're not just excluding private/unpublished works due to a result size cutoff.
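For reviewers who want a concrete picture of the preprocessing described in this PR, here is a minimal sketch in Python. It is an illustration only, not the code in this PR: the function name `prepare_search`, the `AUTH_FILTER` constant, and the `published` and `visibility` field names and values are all assumptions made for the example. It relies on OpenSearch accepting a temporary search pipeline as a `search_pipeline` object in the request body, and a saved pipeline by name through the `search_pipeline` query parameter.

```python
import copy
from typing import Any, Optional

# Hypothetical filter: published works with public or institution visibility.
# The field names and values are assumptions, not taken from the real index.
AUTH_FILTER: dict[str, Any] = {
    "bool": {
        "must": [
            {"term": {"published": True}},
            {"terms": {"visibility": ["Public", "Institution"]}},
        ]
    }
}


def prepare_search(
    body: dict[str, Any],
    pipeline_name: Optional[str] = None,
    auth_filter: dict[str, Any] = AUTH_FILTER,
) -> tuple[dict[str, Any], dict[str, str]]:
    """Return (request_body, query_params) for a search request.

    Any user-supplied `search_pipeline` stanza is dropped first, so callers
    cannot swap in their own pipeline. If a saved pipeline name is given, it
    is passed as a query parameter; otherwise a temporary pipeline carrying
    the auth filter is attached to the body.
    """
    prepared = copy.deepcopy(body)  # leave the caller's DSL untouched
    prepared.pop("search_pipeline", None)

    params: dict[str, str] = {}
    if pipeline_name:
        params["search_pipeline"] = pipeline_name
    else:
        prepared["search_pipeline"] = {
            "request_processors": [
                {"filter_query": {"query": copy.deepcopy(auth_filter)}}
            ]
        }
    return prepared, params


# Example: a hybrid query whose user-supplied pipeline stanza gets replaced.
user_dsl = {
    "size": 50,
    "query": {
        "hybrid": {
            "queries": [
                {"match": {"title": "maps"}},
                {"match": {"description": "maps"}},
            ]
        }
    },
    "search_pipeline": {"request_processors": []},  # attempted bypass
}
request_body, query_params = prepare_search(user_dsl)
```

Because the filter is applied by the pipeline rather than by wrapping the query in a `bool`, the hybrid query stays at the top level of the DSL, which is the constraint that motivated this change.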