For the new updates version of the pipeline, some trade-offs with memory usage have had to be made as part of the performance improvements (which have been substantial). Various critical data is loaded out of Elasticsearch into memory on startup (see caching.py) and saved back at the end; this gives roughly an order-of-magnitude speed improvement. In future, for larger datasets like UK PSC, some optimisation will be needed to keep the size of this in-memory data within acceptable limits. There is significant scope for this: for instance, various strings that are being stored take only a limited set of values and could be represented as integers.
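As a rough illustration of the string-to-integer idea, here is a minimal sketch. The `StringCodec` class and the sample values are hypothetical and are not taken from caching.py; it assumes the field in question has a small number of distinct values.

```python
from array import array


class StringCodec:
    """Maps each distinct string to a small integer code and back.

    Hypothetical helper for illustration; not part of caching.py.
    """

    def __init__(self):
        self._code_by_value = {}   # string -> integer code
        self._value_by_code = []   # integer code -> string

    def encode(self, value):
        code = self._code_by_value.get(value)
        if code is None:
            # First time this string is seen: assign the next free code.
            code = len(self._value_by_code)
            self._code_by_value[value] = code
            self._value_by_code.append(value)
        return code

    def decode(self, code):
        return self._value_by_code[code]


# Made-up sample values standing in for a repeated string field.
raw_values = ["shareholding", "voting-rights", "shareholding", "shareholding"]

codec = StringCodec()
# Each distinct string is stored once in the codec; the cache itself
# only holds compact unsigned integers.
codes = array("I", (codec.encode(v) for v in raw_values))

decoded = [codec.decode(c) for c in codes]
```

Whether this saves enough memory depends on how many distinct values each field really has and how the cache is written back to Elasticsearch, so it would need measuring against the real data.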