Bug for _index_documents_actions when the batch is too large #38987
Comments
Thanks for the feedback, we'll investigate asap.
I am not quite clear. It seems like you proposed a change, but I don't see how it makes a difference.
Hi @HuskyDanny. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario, please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.
Hi @xiangyan99, the following logic for handling kwargs should give you a better idea of why this causes a problem. It tries to get error_map from kwargs rather than from a direct parameter of the function, which causes the KeyError.
Could you elaborate further? You wrote: "The following logic is trying to get the error_map from the kwargs instead of the direct parameter from the function causing the keyError problem." Do you mean the code you proposed? Why is there a difference?
Describe the bug
The exception handling triggers a KeyError for error_map.
Function: def _index_documents_actions(self, actions: List[IndexAction], **kwargs: Any) -> List[IndexingResult]:
The root cause is that the index() function expects to pop error_map from kwargs, so error_map has to be included in kwargs.
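To make the reported failure mode concrete, here is a minimal sketch. `fake_index` is a hypothetical stand-in, not the SDK's actual index() implementation. It only assumes, as described above, that the callee pops error_map from kwargs without a default. `ValueError` stands in for RequestEntityTooLargeError so the snippet runs without the SDK installed.

```python
from typing import Any, Dict


def fake_index(batch: list, **kwargs: Any) -> Dict[int, type]:
    """Hypothetical stand-in for index(): pops error_map from kwargs with no default."""
    # dict.pop without a default raises KeyError when the key is absent.
    error_map = kwargs.pop("error_map")
    return error_map


def call_without_error_map() -> str:
    """Call the stand-in without error_map in kwargs and report what happens."""
    try:
        fake_index(batch=[1, 2, 3], headers={})
    except KeyError as exc:
        return f"KeyError: {exc}"
    return "ok"


def call_with_error_map() -> Dict[int, type]:
    """Call the stand-in with error_map placed in kwargs, as in the proposed fix."""
    kwargs: Dict[str, Any] = {"headers": {}}
    # ValueError is a placeholder for RequestEntityTooLargeError (assumption).
    kwargs["error_map"] = {413: ValueError}
    return fake_index(batch=[1, 2, 3], **kwargs)


if __name__ == "__main__":
    print(call_without_error_map())
    print(call_with_error_map())
```

Whether the real index() behaves this way depends on the installed SDK version, so treat this only as an illustration of the mechanism being reported.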
To Reproduce
Steps to reproduce the behavior:
```
read json file
```
Expected behavior
The batch should be split in half, and each half should be inserted into the index.
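The expected split-and-retry behavior can be sketched in isolation as follows. This is a toy model under stated assumptions, not the SDK code: `BatchTooLarge` is a hypothetical stand-in for RequestEntityTooLargeError, and `make_fake_send` builds a fake service call that rejects any batch larger than `max_items`.

```python
from typing import Callable, List


class BatchTooLarge(Exception):
    """Hypothetical stand-in for RequestEntityTooLargeError."""


def make_fake_send(max_items: int) -> Callable[[List[str]], List[str]]:
    """Build a fake service call that rejects batches larger than max_items."""

    def send(batch: List[str]) -> List[str]:
        if len(batch) > max_items:
            raise BatchTooLarge(f"batch of {len(batch)} exceeds {max_items}")
        return [f"ok:{item}" for item in batch]

    return send


def index_with_split(actions: List[str], send: Callable[[List[str]], List[str]]) -> List[str]:
    """Send the batch; if it is too large, split it in half and retry each half."""
    try:
        return send(actions)
    except BatchTooLarge:
        # A single item cannot be split further, so let the error propagate.
        if len(actions) == 1:
            raise
        pos = round(len(actions) / 2)
        first = index_with_split(actions[:pos], send)
        second = index_with_split(actions[pos:], send)
        return first + second


if __name__ == "__main__":
    send = make_fake_send(max_items=2)
    print(index_with_split(["a", "b", "c", "d", "e"], send))
```

Results come back in the original order because the first half is always processed before the second.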
Screenshots
Additional context
I fixed the handling by putting error_map in kwargs:
```python
def _index_documents_actions(self, actions: List[IndexAction], **kwargs: Any) -> List[IndexingResult]:
    error_map = {413: RequestEntityTooLargeError}
    kwargs["headers"] = self._merge_client_headers(kwargs.get("headers"))
    # Put error_map into kwargs so it is forwarded to index() and to the recursive calls.
    kwargs["error_map"] = error_map
    batch = IndexBatch(actions=actions)
    try:
        batch_response = self._client.documents.index(batch=batch, **kwargs)
        return cast(List[IndexingResult], batch_response.results)
    except RequestEntityTooLargeError:
        # A single action cannot be split any further.
        if len(actions) == 1:
            raise
        # Split the batch in half and index each half separately.
        pos = round(len(actions) / 2)
        batch_response_first_half = self._index_documents_actions(
            actions=actions[:pos], **kwargs
        )
        if batch_response_first_half:
            result_first_half = batch_response_first_half
        else:
            result_first_half = []
        batch_response_second_half = self._index_documents_actions(
            actions=actions[pos:], **kwargs
        )
        if batch_response_second_half:
            result_second_half = batch_response_second_half
        else:
            result_second_half = []
        result_first_half.extend(result_second_half)
        return result_first_half
```