Describe the bug
A race condition occurs when indexing tasks in Elasticsearch (ES): the number of index requests sent by the Conductor server does not match the number received on the ES side.
Steps To Reproduce
Steps to reproduce the behavior:
Run multiple tasks in parallel.
Set the ES index log level to debug.
Check the requests logged in ES (using ES7; ES6 may behave the same).
Some tasks remain IN_PROGRESS in ES rather than COMPLETED after the workflow is COMPLETED.
The status in the persistence layer, on the other hand, is correct (we use Postgres for persistence).
indexBatchSize is left at its default of 1 and asyncIndexingEnabled at its default of false.
Expected behavior
All tasks should be in a terminal status in ES, such as COMPLETED or FAILED, rather than IN_PROGRESS.
Device/browser
OS: Ubuntu
Browser N/A
Version 3.14
Additional context
With debug logging enabled, the following log line is printed correctly in our environment. We see 3 index requests per task, logged in ElasticSearchRestDAOV7.java -> indexTask, and the average time cost is less than 30 ms:
Time taken {} for indexing task:{} in workflow: {}
On the ES side, the number of records received is randomly less than 3.
There seems to be a race condition between the functions indexObject and indexBulkRequest:
```java
private void indexObject(
        final String index, final String docType, final String docId, final Object doc) {
    byte[] docBytes;
    try {
        docBytes = objectMapper.writeValueAsBytes(doc);
    } catch (JsonProcessingException e) {
        logger.error("Failed to convert {} '{}' to byte string", docType, docId);
        return;
    }
    IndexRequest request = new IndexRequest(index);
    request.id(docId).source(docBytes, XContentType.JSON);
    if (bulkRequests.get(docType) == null) {
        bulkRequests.put(
                docType, new BulkRequests(System.currentTimeMillis(), new BulkRequest()));
    }
    bulkRequests.get(docType).getBulkRequest().add(request);
    if (bulkRequests.get(docType).getBulkRequest().numberOfActions() >= this.indexBatchSize) {
        indexBulkRequest(docType);
    }
}

private synchronized void indexBulkRequest(String docType) {
    if (bulkRequests.get(docType).getBulkRequest() != null
            && bulkRequests.get(docType).getBulkRequest().numberOfActions() > 0) {
        synchronized (bulkRequests.get(docType).getBulkRequest()) {
            indexWithRetry(
                    bulkRequests.get(docType).getBulkRequest().get(),
                    "Bulk Indexing " + docType,
                    docType);
            bulkRequests.put(
                    docType, new BulkRequests(System.currentTimeMillis(), new BulkRequest()));
        }
    }
}
```
indexObject takes no lock when it adds a request to the bulkRequest.
indexBulkRequest does take a lock when it sends the bulkRequest and replaces the local bulkRequest.
Consider two threads executing in this order: T1 sends the bulkRequest -> T2 adds a request to the same bulkRequest -> T2 may wait on the synchronized indexBulkRequest -> T1 replaces the local bulkRequest with a new, empty one -> T2 enters indexBulkRequest and fails the numberOfActions() > 0 check, so nothing is sent and T2's request is lost. Or, even if T3 has added a new request to the empty bulkRequest so that the check passes ...
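One possible way to close this window is to make "add to the batch" and "detach the full batch" a single step under the same lock, then send the detached batch outside the lock. The class below is only a minimal sketch of that idea, not Conductor's actual code: `SafeBatchIndexer`, its `sender` callback (standing in for indexWithRetry) and the use of plain strings instead of IndexRequest objects are all assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical stand-in for the DAO's batching logic; names are illustrative only. */
class SafeBatchIndexer {
    private final Object lock = new Object();
    // Pending documents per docType; only read or written while holding `lock`.
    private final Map<String, List<String>> pending = new HashMap<>();
    private final int indexBatchSize;
    // Stands in for indexWithRetry(...): receives the docType and the detached batch.
    private final BiConsumer<String, List<String>> sender;

    SafeBatchIndexer(int indexBatchSize, BiConsumer<String, List<String>> sender) {
        this.indexBatchSize = indexBatchSize;
        this.sender = sender;
    }

    void indexObject(String docType, String doc) {
        List<String> toSend = null;
        synchronized (lock) {
            List<String> batch = pending.computeIfAbsent(docType, k -> new ArrayList<>());
            batch.add(doc);
            if (batch.size() >= indexBatchSize) {
                // Detach the full batch while still holding the lock, so no other
                // thread can append to a batch that is about to be sent and dropped.
                toSend = pending.remove(docType);
            }
        }
        if (toSend != null) {
            // The (slow) send happens outside the lock on a list that no other
            // thread can reach any more.
            sender.accept(docType, toSend);
        }
    }
}
```

Because each document is either still in `pending` or inside exactly one detached batch, a request added by T2 cannot end up in a bulkRequest that T1 has already sent and is about to replace.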
Thanks