Bad scrolling performance #881

Open
hotrush opened this issue Jun 14, 2024 · 1 comment


hotrush commented Jun 14, 2024

Hello. We recently upgraded our cluster from v7 to v8 and had to migrate one of our services from the olivere/elastic package to this official client.

Since then we have seen a serious performance degradation: the service takes 6-8 times longer to respond than it did with the previous client. Reverting to olivere/elastic removes the degradation, which confirms that the client change is the cause.

Our code is pretty simple: we scroll through search results in a goroutine and push the data to a channel. See the code below.

This is how the client is created:

c, err := elasticsearch.NewTypedClient(elasticsearch.Config{
	Addresses: []string{makeEsURL()},
	Username:  cfg.EsUser,
	Password:  cfg.EsPass,
})

This is how we build the initial query:

esResults := make(chan esResult, 1)
...
scroll := es.Search().
	Index(r.GetIndex()).
	Raw(strings.NewReader(r.GetQuery())).
	Size(cfg.EsScrollSize).
	Scroll("15s")

err := scrollEsQuery(ctx, scroll, esResults)

This is how it is processed:

func scrollEsQuery(ctx context.Context, scroll *search.Search, esResults chan<- esResult) error {
	var scrollID string
	res, err := scroll.Do(ctx)
	if err != nil {
		return err
	}
	if len(res.Hits.Hits) == 0 {
		return nil
	}

	esResults <- newEsResult(res.Hits)

	if res.ScrollId_ != nil {
		scrollID = *res.ScrollId_
	}

	defer func() {
		if scrollID != "" {
			es.ClearScroll().ScrollId(scrollID).Do(gCtx)
		}
	}()

	for {
		res, err := es.Scroll().ScrollId(scrollID).Scroll("15s").Do(ctx)
		if err != nil {
			return err // something went wrong
		}
		if res.ScrollId_ != nil {
			scrollID = *res.ScrollId_
		}
		if len(res.Hits.Hits) == 0 {
			return nil
		}

		select {
		case <-ctx.Done():
			return ctx.Err()
		case esResults <- newEsResult(res.Hits):
		}
	}
}

The code is pretty simple, and I can't understand what could cause such a big performance difference. Am I doing anything wrong?

@Anaethelion
Contributor

Hi @hotrush

I've tested your snippet and one thing stands out: you are passing a raw request and setting the size at the same time. Size is meant to be set in the body, but that fails because Raw takes precedence. As a result, the size is not propagated to the body of the request, and you are effectively returning the full match of your request on the first call.

Can you check this assumption? If it holds, I'll work on a way of preventing it.
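
One possible way to check this is to put the size into the raw body itself instead of relying on Size(). The sketch below is only an illustration, not part of the client: withSize is a hypothetical helper, and it assumes that the string returned by r.GetQuery() is a JSON object (a complete search request body). It uses only the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withSize is a hypothetical helper (not part of go-elasticsearch).
// It returns rawQuery with a top-level "size" field set to size,
// overwriting any existing "size". It assumes rawQuery is a JSON object.
func withSize(rawQuery string, size int) (string, error) {
	// Keep every other field untouched by decoding into raw messages.
	var body map[string]json.RawMessage
	if err := json.Unmarshal([]byte(rawQuery), &body); err != nil {
		return "", fmt.Errorf("parse raw query: %w", err)
	}
	if body == nil {
		// A JSON null decodes without error but leaves the map nil.
		return "", fmt.Errorf("raw query must be a JSON object")
	}

	sizeJSON, err := json.Marshal(size)
	if err != nil {
		return "", fmt.Errorf("encode size: %w", err)
	}
	body["size"] = sizeJSON

	out, err := json.Marshal(body)
	if err != nil {
		return "", fmt.Errorf("encode body: %w", err)
	}
	return string(out), nil
}

func main() {
	q, err := withSize(`{"query":{"match_all":{}}}`, 500)
	if err != nil {
		panic(err)
	}
	fmt.Println(q) // prints {"query":{"match_all":{}},"size":500}
}
```

With a helper like this, the search from the snippet above could be built with Raw(strings.NewReader(q)) and without the Size(cfg.EsScrollSize) call, so the page size travels in the body that Raw actually sends. Comparing len(res.Hits.Hits) on the first response before and after the change should show whether the size was being dropped.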
