NOTE: This was discovered against MySQL, where loading data with an offset gets slower as the offset grows. This may not be an issue with other databases.
When indexing a domain class with more than 1 million records, indexing performance degrades and eventually fails. This is caused by the way data is loaded in ElasticSearchService.doBulkRequest(). The line:
List<Class<?>> results = domainClass.listOrderById([offset: offset, max: max, readOnly: true, sort: 'id', order: "asc"])
does a poor job of loading data as the offset increases.
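To make the slowdown concrete, here is a rough cost model. It assumes the database has to read and discard every skipped row before it returns a page, which matches what we saw on MySQL. The batch size of 500 and the row counts are illustrative assumptions, not measurements from the plugin.

```groovy
// Rough cost model (an assumption for illustration, not a measurement):
// a page loaded with an offset reads offset + pageSize rows,
// while a page loaded by a list of ids reads only pageSize rows.
long totalRows = 1_000_000L
long max = 500L   // hypothetical bulk size

long batches = (totalRows + max - 1).intdiv(max)

long offsetRowsRead = 0L
long idBatchRowsRead = 0L
for (long i = 0; i < batches; i++) {
    long offset = i * max
    long batchSize = Math.min(max, totalRows - offset)
    offsetRowsRead += offset + batchSize   // skipped rows are still read
    idBatchRowsRead += batchSize           // only the requested rows are read
}

println "batches:            ${batches}"
println "offset paging read: ${offsetRowsRead} rows"
println "id batches read:    ${idBatchRowsRead} rows"
```

Under this model, the cost of each offset page grows linearly with its position, so the total work grows roughly with the square of the table size. The id-based total ignores the one up-front query needed to fetch the ids.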
Locally, we fixed this by making the following changes (corporate policy prevents me from submitting an actual pull request to this project, but I can suggest the fix... bureaucracy!)
// The loop
idResults?.collate(max)?.eachWithIndex { subList, i ->
    // Other stuff here, then load the actual domains to index like this
    def results = domainClass.createCriteria().list {
        'in'('id', subList)
    }
    // everything else
}
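The snippet above does not show where idResults comes from. The sketch below assumes the ids are loaded up front in ascending order, then batched with collate. It uses an in-memory map as a stand-in for the domain table so it runs without Grails. The table contents, the bulk size of 4, and the indexedBatches list are all hypothetical.

```groovy
// In-memory stand-in for a table keyed by id (hypothetical data, not GORM).
Map<Long, String> table = (1L..10L).collectEntries { [(it): "record-${it}".toString()] }

// Step 1: fetch only the ids, in ascending order. In a Grails app this would
// presumably be a query on the domain class; here it is just the key set.
List<Long> idResults = table.keySet().sort()

int max = 4   // hypothetical bulk size
List<List<String>> indexedBatches = []

// Step 2: split the ids into batches of at most `max` and load each batch by id,
// mirroring the 'in'('id', subList) criteria from the snippet above.
idResults.collate(max).eachWithIndex { List<Long> subList, int i ->
    List<String> results = subList.collect { table[it] }
    indexedBatches << results
    println "batch ${i}: ids ${subList.first()}..${subList.last()} (${results.size()} records)"
}
```

Each batch is looked up by primary key, so its cost should not depend on how far into the table it sits. The trade-off is that all ids are held in memory for the duration of the run.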
Task List
Steps to reproduce provided
[N/A] Stacktrace (if present) provided
[N/A] Example that reproduces the problem uploaded to Github
Full description of the issue provided (see below)
Steps to Reproduce
Create 1 million or more domain objects to be indexed
Start application to index (or trigger an index after startup)
Observe that each subsequent iteration of the bulk loop is slower than the last
Expected Behaviour
Indexing should continue at a consistent pace regardless of the number of records
Actual Behaviour
Indexing slows down linearly: each iteration takes longer than the last until data connections eventually start timing out
Environment Information
Operating System: RHEL, MacOS Mojave
GORM Version: 7.0.2.RELEASE
Grails Version (if using Grails): 4.0.3
JDK Version:
java -version
openjdk version "1.8.0_192"
OpenJDK Runtime Environment (Zulu 8.33.0.1-macosx) (build 1.8.0_192-b01)
OpenJDK 64-Bit Server VM (Zulu 8.33.0.1-macosx) (build 25.192-b01, mixed mode)
Example Application
N/A