Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is there a way to set refresh to wait_for? #121

Open
bjhijmans opened this issue Oct 12, 2022 · 6 comments
Open

Is there a way to set refresh to wait_for? #121

bjhijmans opened this issue Oct 12, 2022 · 6 comments

Comments

@bjhijmans
Copy link

I'm in the process of switching to this package, but there is an elasticsearch feature I've been using, but I can't find a way to set it using this package.

https://www.elastic.co/guide/en/elasticsearch/reference/7.17/docs-refresh.html

I would like to be able to set refresh to wait_for. We use it in tests to see if our filters and score boosts yield the expected results from elasticsearch. Is there a way to set the refresh parameter?

I think this PR might be related, but the author closed it. #89

@bjhijmans
Copy link
Author

I ended up copying ElasticDocumentAdapter and adding 'refresh' => 'wait_for' when appropriate there and injecting it into the service container from the AppServiceProvider and that fixes my issue completely.

However, it feels a bit hacky and subject to problems caused by code changes in the package. I think it would be helpful if there was a supported way to add these sorts of parameters.

@Jeroen-G
Copy link
Owner

I'm curious for your changes, how does your adapter now look like?

@bjhijmans
Copy link
Author

bjhijmans commented Oct 18, 2022

Here's the whole thing. I added a config key for the one setting I wanted to add in the myapp namespace.

Obviously this isn't modular at all, but I think you get the idea.

namespace App\Libraries\ElasticSearch;

use Elasticsearch\Client;
use JeroenG\Explorer\Application\DocumentAdapterInterface;
use JeroenG\Explorer\Application\Operations\Bulk\BulkOperationInterface;
use JeroenG\Explorer\Application\Results;
use JeroenG\Explorer\Application\SearchCommandInterface;
use JeroenG\Explorer\Infrastructure\Elastic\ElasticClientFactory;
use JeroenG\Explorer\Infrastructure\Elastic\Finder;

/**
 * This is a copy of JeroenG\Explorer\Infrastructure\Elastic\ElasticDocumentAdapter modified to allow us to set
 * refresh=wait_for in elasticsearch
 */
final class ElasticDocumentAdapter implements DocumentAdapterInterface
{
    private Client $client;

    public function __construct(ElasticClientFactory $clientFactory)
    {
        $this->client = $clientFactory->client();
    }

    public function bulk(BulkOperationInterface $command): callable|array
    {
        return $this->client->bulk($this->withSettings([
            'body' => $command->build(),
        ]));
    }

    public function update(string $index, $id, array $data): callable|array
    {
        return $this->client->index($this->withSettings([
            'index' => $index,
            'id' => $id,
            'body' => $data,
        ]));
    }

    public function delete(string $index, $id): void
    {
        $this->client->delete($this->withSettings([
            'index' => $index,
            'id' => $id
        ]));
    }

    public function search(SearchCommandInterface $command): Results
    {
        return (new Finder($this->client, $command))->find();
    }

    protected function withSettings(array $array): array
    {
        if (config('myapp.elastic_wait_for')) {
            $array['refresh'] = 'wait_for';
        }

        return $array;
    }
}

@Jeroen-G
Copy link
Owner

Is it possible to set those settings with the Elasticsearch\ClientBuilder class that is constructed in the service provider?

@bjhijmans
Copy link
Author

I don't see how. Here's the code of one of the methods in the Client:

    /**
     * Allows to perform multiple index/update/delete operations in a single request.
     *
     * $params['index']                  = (string) Default index for items which don't provide one
     * $params['type']                   = DEPRECATED (string) Default document type for items which don't provide one
     * $params['wait_for_active_shards'] = (string) Sets the number of shard copies that must be active before proceeding with the bulk operation. Defaults to 1, meaning the primary shard only. Set to `all` for all shard copies, otherwise set to any non-negative value less than or equal to the total number of copies for the shard (number of replicas + 1)
     * $params['refresh']                = (enum) If `true` then refresh the affected shards to make this operation visible to search, if `wait_for` then wait for a refresh to make this operation visible to search, if `false` (the default) then do nothing with refreshes. (Options = true,false,wait_for)
     * $params['routing']                = (string) Specific routing value
     * $params['timeout']                = (time) Explicit operation timeout
     * $params['_source']                = (list) True or false to return the _source field or not, or default list of fields to return, can be overridden on each sub-request
     * $params['_source_excludes']       = (list) Default list of fields to exclude from the returned _source field, can be overridden on each sub-request
     * $params['_source_includes']       = (list) Default list of fields to extract and return from the _source field, can be overridden on each sub-request
     * $params['pipeline']               = (string) The pipeline id to preprocess incoming documents with
     * $params['require_alias']          = (boolean) Sets require_alias for all incoming documents. Defaults to unset (false)
     * $params['body']                   = (array) The operation definition and data (action-data pairs), separated by newlines (Required)
     *
     * @param array $params Associative array of parameters
     * @return array
     * @see https://www.elastic.co/guide/en/elasticsearch/reference/master/docs-bulk.html
     */
    public function bulk(array $params = [])
    {
        $index = $this->extractArgument($params, 'index');
        $type = $this->extractArgument($params, 'type');
        $body = $this->extractArgument($params, 'body');

        $endpointBuilder = $this->endpoints;
        $endpoint = $endpointBuilder('Bulk');
        $endpoint->setParams($params);
        $endpoint->setIndex($index);
        $endpoint->setType($type);
        $endpoint->setBody($body);

        return $this->performRequest($endpoint);
    }

It only uses params that were passed to the method, in our case from the Adapter. And even if you overwrite $endpointBuilder somehow to set parameters, it would be immediately overwritten by the $endpoint->setParams($params); on the next line. I don't see any way around it.

@Jeroen-G
Copy link
Owner

I understand. Thanks for helping out so far. I'll have to think of a nice way to solve this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants