Add test for arSolrExistsQuery #1854

Closed
wants to merge 68 commits into from

@sbreker (Member) commented Aug 2, 2024

No description provided.

anvit and others added 30 commits July 26, 2024 10:45
Set up docker-file which starts up AtoM on port 63001, solr with 3
instances on ports 8981-8983, and a simple php page on port 9001.
WIP
Skeleton for CLI task and plugin added.
CLI task creates new solr collection, but doesn't populate the index
yet.
Converted elastic search information object model and all dependent
classes to solr equivalent.

WIP: arSolrPluginUtil still has a couple of Elastica references for
search specific functions that need to be adapted/converted for solr.
QubitSearch has not been converted, so no actual documents are added to
Solr index yet. Changes will need to be made to either search itself or
all of the update/populate methods inside models to add the documents to
the solr index.
Added solr specific getInstance, enable, and disable methods.
WIP: No documents are added to the solr index by the addDocument method
in arSolrPlugin.
WIP: Data is being indexed using solr's REST API, but it is not
searchable since copy fields have not been declared, and the schema is
dynamic
fields to be created:
- nested (fields: actorRelations, dates)
- object (fields: alternativeIdentifiers, digitalObject, findingAid,
  i18n)
Fixed the multiValued parameter in the api request to convert binary to
string accurately. Added code to skip adding fields to copy based
on the include_in_all mapping value. Also added a function parameter for
the stored api request parameter since the copy field does not need to
be stored.
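
A minimal sketch of the schema requests this commit describes, assuming they go through Solr's Schema API (`add-field` and `add-copy-field` commands). The helper names, the `all` destination field, and the default for `include_in_all` are hypothetical and not taken from arSolrPlugin. Passing real booleans to `json_encode` keeps `multiValued` and `stored` from being serialised as `"1"` or an empty string.

```php
<?php

// Hypothetical helper: JSON body for a Schema API "add-field" command.
function buildAddFieldPayload(string $name, string $type, bool $multiValued, bool $stored = true): string
{
    return json_encode([
        'add-field' => [
            'name' => $name,
            'type' => $type,
            // Real booleans serialise as JSON true/false.
            'multiValued' => $multiValued,
            'stored' => $stored,
        ],
    ]);
}

// Hypothetical helper: JSON body for a Schema API "add-copy-field" command.
// The destination name "all" is an assumption.
function buildCopyFieldPayload(string $source, string $dest = 'all'): string
{
    return json_encode([
        'add-copy-field' => ['source' => $source, 'dest' => $dest],
    ]);
}

// Assumption: a field is copied unless its mapping sets include_in_all to false.
function shouldCopyToAll(array $mapping): bool
{
    return ($mapping['include_in_all'] ?? true) !== false;
}

// The copy destination does not need to be stored, so stored is passed as false.
echo buildAddFieldPayload('all', 'text_general', true, false), PHP_EOL;
echo buildCopyFieldPayload('title'), PHP_EOL;
```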
Adding a WIP solr CLI search tool to search solr. Deleted the
experimental solr html search page since it is no longer needed.
Added a helper method for all the Solr HTTP requests and cleaned up the
plugin code making the requests.
Updated solr CLI search task to use API requests instead of using
SolrClient. Using edismax queries instead of regular solr queries to be
able to apply boost to fields. Accepting command line field inputs is
WIP.
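
To illustrate why edismax helps with boosting: the parser accepts a `qf` parameter listing fields with `^boost` suffixes. The sketch below only builds the request parameters; the function name and the field names are hypothetical, and the actual task may assemble its request differently.

```php
<?php

// Hypothetical helper: builds edismax request parameters with per-field boosts.
function buildEdismaxParams(string $queryText, array $fieldBoosts): array
{
    $qf = [];
    foreach ($fieldBoosts as $field => $boost) {
        // "field^boost" is the standard Solr boost syntax.
        $qf[] = sprintf('%s^%s', $field, $boost);
    }

    return [
        'defType' => 'edismax',
        'q' => $queryText,
        'qf' => implode(' ', $qf),
    ];
}

// Field names here are assumptions for illustration only.
$params = buildEdismaxParams('fonds', ['i18n.en.title' => 10, 'identifier' => 5]);
echo http_build_query($params), PHP_EOL;
```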
Fixed CS-Fixer warnings
Added a class to handle Solr queries and their parameters
arSolrSearchTask now accepts fields as a command line parameter. Also
fixed minor bugs with the search task and arSolrQuery, and removed the
10 row default from the solr search handler.
Moved solr query related classes to their own folder, added solr query
classes
WIP: All solr query classes are just placeholder skeletons extending
from arSolrAbstractQuery. Also, arSolrPluginQuery needs to adapt addAggs
to set up all the appropriate aggregation parameters
Added Range Query support for Solr, and also moved methods for adding
params to the Abstract Query class. Also cleaned up indentation.

Note about TODO:
Elastic Search uses a slightly different format than Solr for
Range Queries so all calls to create a Range Query will have to be
updated to use the Solr syntax instead of the Elasticsearch one.
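
The syntax difference in the note above can be sketched as follows. Elasticsearch expresses bounds as `gte`/`gt`/`lte`/`lt` keys, while Solr uses `field:[lower TO upper]` with square brackets for inclusive bounds, curly braces for exclusive ones, and `*` for an open end. The function name and the field name are hypothetical; this is not the arSolrRangeQuery implementation.

```php
<?php

// Hypothetical converter from Elasticsearch-style range arguments to Solr range syntax.
function rangeToSolr(string $field, array $args): string
{
    if ('' === $field) {
        throw new InvalidArgumentException('A field name is required.');
    }

    // Defaults describe an unbounded, inclusive range: [* TO *].
    $lower = '*';
    $upper = '*';
    $open = '[';
    $close = ']';

    if (array_key_exists('gte', $args)) {
        $lower = $args['gte'];
    } elseif (array_key_exists('gt', $args)) {
        $lower = $args['gt'];
        $open = '{'; // exclusive lower bound
    }

    if (array_key_exists('lte', $args)) {
        $upper = $args['lte'];
    } elseif (array_key_exists('lt', $args)) {
        $upper = $args['lt'];
        $close = '}'; // exclusive upper bound
    }

    return sprintf('%s:%s%s TO %s%s', $field, $open, $lower, $upper, $close);
}

echo rangeToSolr('year', ['gte' => 1900, 'lt' => 2000]), PHP_EOL;
```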
anvit and others added 22 commits July 26, 2024 10:45
Added classes for ResultSet and QubitSolrSearchPager. ResultSet needs to
be adapted to be able to ingest response correctly, and Qubit Types need
to be filtered out to be able to display information correctly.
Added a getDocument function to arSolrResultSet that formats the
documents from the result set into a nested associative array to match
the structure expected in the templates.
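
One way such a reshaping could work, assuming the indexed field names are flat and dot-separated (for example `i18n.en.title`). That naming scheme and the function name are assumptions; the real getDocument may rely on a different convention.

```php
<?php

// Hypothetical helper: turns dot-separated keys into a nested associative array.
function nestDocument(array $flat): array
{
    $nested = [];

    foreach ($flat as $key => $value) {
        $ref = &$nested;

        // Walk (and create) one array level per key segment.
        foreach (explode('.', (string) $key) as $part) {
            if (!isset($ref[$part]) || !is_array($ref[$part])) {
                $ref[$part] = [];
            }
            $ref = &$ref[$part];
        }

        $ref = $value;
        unset($ref); // break the reference before the next key
    }

    return $nested;
}

print_r(nestDocument(['slug' => 'example-fonds', 'i18n.en.title' => 'Example fonds']));
```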
Add a new arSolrResult class for use with some templates that interface
directly with the result instead of interfacing with documents returned
by the result set.
Updated arSolrMatchAllQuery to skip fields entirely and use lucene
parser to fetch results. Also updated arSolrSearchTask to add support
for MatchAllQuery.
Update the docker config to remove references to the old solr folder.
Also remove the solr php package as it isn't needed (or available)
anymore.
Fix search results skipping documents due to offset field in the
incorrect location.
Updated arSolrBoolQuery to function with arSolrQuery, added support
for should, and added rudimentary support for it in arSolrSearchTask.
Add lucene based existence query for arSolrExistsQuery
Update arSolrRangeQuery to be compatible with Elastica like Range
Queries
The addSubProperties function calls in arSolrPlugin were setting stored
to false which was causing several fields to not be stored while
indexing.

TODO
Fixed this change, but multivalued fields were still broken
with the generated query setting all fields as single valued, and
indexed results still missing some info. Hard coded all fields as
multivalued for now to get working search results, but this will need to
be updated.
Updated makeHttpRequest to use curl for better error handling. Also
updated autocomplete fields to be multivalued for cases where they hold
both a referenceCode and titles, and performed some code cleanups in
arSolrPlugin.
Update solr to only use multivalue fields where they are needed.
TODO:

arSolrPlugin includes a large list of fields that are being used to test
field names for being multivalued. This should be updated to work via
yml.

arSolrMapping also has a change that sets all partial foreign type IDs
to be multivalued, which needs to be updated to apply that to only the
required id fields.
Adds arSolrTermQuery for Term Queries using Solr. Also updated
arSolrMatchQuery to be an extension of arSolrTermQuery, and updated
arSolrQuery to provide more fuzzy results.
{
return [
'New arSolrExistsQuery with blank field' => ['field' => ''],
'New arSolrExistsQuery with null field' => ['field' => null],
Member Author

@anvit and @melaniekung: thinking about this - should '' or null trigger a validation error in the constructor? What will happen if a query is executed against Solr with these values?

Contributor

hi steve, they both return parser errors

Adds PHPUnit tests for the arSolrExistsQuery class. Refactors
arSolrExistsQuery to throw an exception when query generation
preconditions are not met, before the query statement is generated.
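
A rough sketch of the behaviour discussed in the review thread: validate the field before building the query so that an empty or null field fails early instead of reaching Solr as a parser error. The class name, method names, and exception type are hypothetical; the real arSolrExistsQuery may differ. The `field:*` form is standard Lucene syntax for matching documents that have any value in the field.

```php
<?php

// Hypothetical stand-in for an exists query class, for illustration only.
class ExistsQuerySketch
{
    private $field;

    public function __construct($field = null)
    {
        $this->field = $field;
    }

    public function setField($field): self
    {
        $this->field = $field;

        return $this;
    }

    public function generateQueryString(): string
    {
        // Precondition: refuse to build a query for a null or blank field.
        if (null === $this->field || '' === trim((string) $this->field)) {
            throw new InvalidArgumentException('Field is not set.');
        }

        return sprintf('%s:*', $this->field);
    }
}

echo (new ExistsQuerySketch('title'))->generateQueryString(), PHP_EOL;
```

Throwing at query generation time rather than in the constructor keeps `new ExistsQuerySketch()` followed by `setField()` usable; whether the constructor should also validate is the open question raised in the review comment.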
@sbreker sbreker force-pushed the dev/solr-plugin-wip-with-new-test branch from 185cc8d to 491c4d5 Compare August 2, 2024 20:59
@anvit anvit force-pushed the dev/solr-plugin-wip branch 2 times, most recently from d987262 to 28b10ea Compare August 2, 2024 22:22
@sbreker sbreker closed this Aug 2, 2024
@sbreker sbreker deleted the dev/solr-plugin-wip-with-new-test branch August 2, 2024 23:01
3 participants