fix: Change databuilder search data extractors to publish name in user document. #2274

Merged: 4 commits, Oct 30, 2024

Contributor

@glipR glipR commented Oct 23, 2024

Description

Changes the search data databuilder extractors to extract a name field, for publishing to elasticsearch.

Currently, the user document schema in elasticsearch expects a name keyword field, which is the primary field for searching users. Both the Databuilder publish step and the search service expect this field.

Motivation and Context

The change is required because the elasticsearch query currently does not use the name field to produce better matches for search results, and instead is likely using only the first name, last name, and key of the user. This is fine when searching by first or last name alone, but the results are poor when searching by both together.

This is actually fixed in the example query (PR) for posting from neo4j to elasticsearch, but this query is never used.
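
To illustrate the idea, here is a minimal sketch of deriving a combined name value for a user document. The function name `build_user_document` and the field names `key`, `first_name`, and `last_name` are assumptions for illustration; this is not the actual databuilder extractor code or the real document schema.

```python
from typing import Dict, Optional


def build_user_document(key: str,
                        first_name: Optional[str],
                        last_name: Optional[str]) -> Dict[str, str]:
    """Build a hypothetical user search document with a combined name."""
    # Join whichever name parts are present so that a query for
    # "first last" can match one field instead of two separate ones.
    parts = [p.strip() for p in (first_name, last_name) if p and p.strip()]
    name = " ".join(parts)
    return {
        "key": key,
        "first_name": first_name or "",
        "last_name": last_name or "",
        # Falling back to the key when no name parts exist is an assumption.
        "name": name or key,
    }


doc = build_user_document("jdoe@example.com", "Jane", "Doe")
print(doc["name"])
```

With a single name field in the document, a full-name search has one value to match against rather than relying on separate first and last name fields.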

How Has This Been Tested?

This has only been tested with the neo4j_search_data_extractor, and not the other extractors. Testing was done by:

  1. Loading from databuilder via the search_data_extractor script
  2. Navigating to the index on localhost:9200 and inspecting document values
  3. Attempting a search in the frontend and comparing results
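
Step 2 could also be checked programmatically. The sketch below assumes the standard elasticsearch `_search` response shape (`hits.hits[]._source`). The helper name `docs_missing_name` and the sample response are hypothetical, and the real index name is not given here.

```python
from typing import Any, Dict, List


def docs_missing_name(search_response: Dict[str, Any]) -> List[str]:
    """Return the _id of every hit whose _source has no non-empty name."""
    missing = []
    for hit in search_response.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        if not source.get("name"):
            missing.append(hit.get("_id", "<unknown>"))
    return missing


# Hand-written sample standing in for a response fetched from localhost:9200.
sample = {
    "hits": {
        "hits": [
            {"_id": "a", "_source": {"name": "Jane Doe", "first_name": "Jane"}},
            {"_id": "b", "_source": {"first_name": "John", "last_name": "Roe"}},
        ]
    }
}
print(docs_missing_name(sample))
```

An empty result after loading would suggest every inspected user document carries a name value.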

Documentation

No change in documentation

CheckList

  • PR title addresses the issue accurately and concisely

Believe this fix has no need for:

  • N/A: Updates Documentation and Docstrings
  • N/A: Adds tests
  • N/A: Adds instrumentation (logs, or UI events)

@glipR glipR requested a review from a team as a code owner October 23, 2024 00:17
@boring-cyborg boring-cyborg bot added the area:databuilder From databuilder folder label Oct 23, 2024

boring-cyborg bot commented Oct 23, 2024

Congratulations on your first Pull Request and welcome to the Amundsen community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/amundsen-io/amundsen/blob/main/CONTRIBUTING.md)

@glipR glipR force-pushed the bug-user-es-document branch from e4b986b to e7ccc06 Compare October 23, 2024 00:19
@glipR glipR force-pushed the bug-user-es-document branch from e7ccc06 to b134aac Compare October 23, 2024 00:21
@kristenarmes
Contributor

@glipR thanks for your contribution! I triggered the CI tests and see a couple of minor test failures. If you can resolve them, I'd be happy to approve.

Signed-off-by: Jackson Goerner <[email protected]>
@glipR glipR force-pushed the bug-user-es-document branch from b188e3a to 8de4326 Compare October 25, 2024 02:48
@glipR
Contributor Author

glipR commented Oct 25, 2024

Thanks @kristenarmes ! I've edited the two failing tests to reflect the new return values / expected arguments.

Apologies, I'm having trouble running the tests locally due to some version conflicts, so this may need a retrigger to confirm that everything is fixed 🙏

@glipR
Contributor Author

glipR commented Oct 30, 2024

Didn't notice they'd been rerun, apologies for the delay :)

Looks like everything is passing now, thanks again for triggering these.

Contributor

@kristenarmes kristenarmes left a comment

LGTM, thanks!

@kristenarmes
Contributor

I was about to merge, but then realized: did you want to bump the databuilder version? Here's the version spot

Signed-off-by: Jackson Goerner <[email protected]>
@glipR
Contributor Author

glipR commented Oct 30, 2024

Sure thing - have bumped the patch version.

@kristenarmes kristenarmes merged commit 32b2ab3 into amundsen-io:main Oct 30, 2024
10 checks passed

boring-cyborg bot commented Oct 30, 2024

Awesome work, congrats on your first merged pull request!
