Search contexts (scrolls, pits) are not cleared from node if an index is relocated to another node and then deleted #116313
Labels
>bug
:Search Foundations/Search
Team:Search Foundations
Elasticsearch Version
8.x, 9.x
Installed Plugins
No response
Java Version
bundled
OS Version
darwin
Problem Description
Stale data could be left on nodes after an index is relocated to different nodes, if scrolls/pits were opened on the source node.
Note that the data will remain on the source node even if the index is deleted.
This is because when we clear the contexts here we ignore the NO_LONGER_ASSIGNED reason - but this is a feature: we should NOT be freeing the contexts for the NO_LONGER_ASSIGNED reason, because if we were to remove the contexts in that case, the existing opened PITs would not work anymore (searches on those existing PITs would receive SearchPhaseExecutionException: all shards failed).
The solution here, perhaps something that warrants more discussion, is to transfer the existing contexts to the new nodes where the index relocates. (One could also argue that a possible solution is to enhance the delete index API to free up all contexts in the cluster - perhaps call something like https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/action/search/ClearScrollController.java#L150.)
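As a rough client-side approximation of the second idea above (freeing contexts when an index is deleted), the requests could be ordered as in the sketch below. It assumes the caller still knows the PIT ids it opened; the helper name cleanup_then_delete is hypothetical and not part of Elasticsearch. It only builds the request list from the close PIT, clear scroll and delete index REST endpoints and sends nothing. Note that DELETE /_search/scroll/_all clears every scroll in the cluster, not only those of the deleted index, and this issue does not confirm that these calls reach contexts left behind on the source node after a relocation.

```python
import json
from typing import List, Optional, Tuple

# (HTTP method, path, JSON body or None)
Request = Tuple[str, str, Optional[str]]


def cleanup_then_delete(index: str, pit_ids: List[str],
                        clear_all_scrolls: bool = False) -> List[Request]:
    """Hypothetical helper: order the REST calls so that known search
    contexts are released before the index itself is deleted."""
    requests: List[Request] = []
    # Close each PIT the caller still holds an id for.
    for pit_id in pit_ids:
        requests.append(("DELETE", "/_pit", json.dumps({"id": pit_id})))
    # Optionally clear every scroll context in the cluster (affects ALL indices).
    if clear_all_scrolls:
        requests.append(("DELETE", "/_search/scroll/_all", None))
    # Finally delete the index.
    requests.append(("DELETE", f"/{index}", None))
    return requests


if __name__ == "__main__":
    for method, path, body in cleanup_then_delete("my-index", ["abc"], True):
        print(method, path, body or "")
```

The PIT and scroll cleanup is kept ahead of the index deletion so that nothing still references the index when it goes away; whether that ordering matters for this bug is an assumption.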
The current workaround to free up the space on the source nodes (the original nodes that hosted the index and where scrolls are still open, even after the index was deleted) is to restart the nodes, as we clear the contexts on node stop. WARNING - this will clear ALL the contexts on the node that restarts, so existing open PITs will start failing.
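Before restarting anything, it may help to see which nodes still hold open search contexts. The sketch below is a minimal example that assumes a response from GET /_nodes/stats/indices/search, where each node reports open_contexts under indices.search. The helper name and the sample payload are hypothetical and are not output captured from this bug.

```python
from typing import Any, Dict


def nodes_with_open_contexts(stats: Dict[str, Any]) -> Dict[str, int]:
    """Map node name -> number of open search contexts, skipping nodes with none."""
    result: Dict[str, int] = {}
    for node_id, node in stats.get("nodes", {}).items():
        name = node.get("name", node_id)
        count = node.get("indices", {}).get("search", {}).get("open_contexts", 0)
        if count > 0:
            result[name] = count
    return result


# Hypothetical, trimmed-down node stats response used only for illustration.
sample_stats = {
    "nodes": {
        "id-1": {"name": "node-1", "indices": {"search": {"open_contexts": 2}}},
        "id-2": {"name": "node-2", "indices": {"search": {"open_contexts": 0}}},
    }
}

if __name__ == "__main__":
    print(nodes_with_open_contexts(sample_stats))
```

Restarting only the nodes this reports would limit how many unrelated PITs and scrolls are lost to the workaround.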
Steps to Reproduce
Run 2 Elasticsearch nodes, node-1 and node-2
30 minutes after deletion, check logs for node-1 and notice
You can also check the data folder for node-1 and notice the index files are still present there. Something along the lines of
Logs (if relevant)
No response