FM2-642: Improve performance of loading locations by tag #547

ibacher · 2024-08-08T21:07:33Z

Description of what I changed

As we know, the FHIR Location end point is quite a bit slower than it ought to be. Based on the profiling of individual requests, I thought most of this overhead was just the overhead of translating to FHIR. However, our new performance test suite convinced me otherwise since it shows a median response time of 5 seconds for loading 50 locations (the minimum response time of 132ms is closer to the overhead I expected).

As a result I did some extensive profiling of the locations search request which pointed to two relatively easy-to-fix hotspots:

For Locations (and a few other resources) we support mapping attribute types to FHIR Telecom elements (phone numbers, URLs, email addresses, etc.). The implementation of this would hit the database on every request to load a global property, which was consuming an excessive amount of time, especially because this GP is unlikely to actually be used in many real-world scenarios.
For any pieces of OpenMRS Metadata, we provide the ability to localize the name of the metadata using the same technique supported by the UIFramework and REST module. While not as drastic as the above, this also consumes an excessive amount of time, especially as a Location resource can involve multiple pieces of metadata, especially if there is any location hierarchy.

For the first of these I made the following optimizations:

The FhirGlobalPropertyService implementation that reads from the database from every request has been swapped to an implementation backed by a custom GlobalPropertyListener that heavily caches any GP look-ups. This means that anywhere we were hitting the database to check GP values are now completely handled in-memory, drastically reducing the time to lookup.
In the base where no contact point attributes are defined, there should now be at most one DB look-up to confirm this instead of the previous at least two.

For the second of these, directly invoking getMessage(String, Object[], Locale) is about 50% faster than invoking getMessage(), which seems to be due to Spring overhead. While individually this doesn't mean much, across millions of invocations of the getMetadataTranslation() method, it shaves a substantial portion of time.

Between these two optimizations, I reduced the time to make 1 million search requests for locations by tag from 25 minutes to about 4 minutes, so this hopefully represents a serious improvement in perceptible performance.

Additionally, while doing this, I noticed that it was actually common (in the various tests) to wind up with multiple copies of FhirContext objects. FhirContext objects are extremely expensive to construct, so I've switched things up so we should only end up with at-most two FhirContexts (one for R4, one for DSTU3), which should speed up the time it takes for tests to run and may have a small impact on runtime performance in the real world too.

Outstanding work:

I still think there's a role for some aggressive per-request caching; I'm working on an API for this, especially to be applied to the code that handles concepts, since this also appears to be very slow (hence, I think, the relatively poor performance of getting Observations, Conditions, and Allergies). This work will need to be a little more careful, but I'm hoping we produce similar, if not better gains, across those domains, making the FHIR API much snappier for bulk results.
These changes need to be run in a real environment and put through their paces a bit more before they are ready-to-go.

Issue I worked on

see https://openmrs.atlassian.net/browse/FM2-642

Checklist: I completed these to help reviewers :)

My IDE is configured to follow the code style of this project.

No? Unsure? -> configure your IDE, format the code and add the changes with git add . && git commit --amend
I have added tests to cover my changes. (If you refactored
existing code that was well tested you do not have to add tests)

No? -> write tests and add them to this commit git add . && git commit --amend
I ran mvn clean package right before creating this pull request and
added all formatting changes to my commit.

No? -> execute above command
All new and existing tests passed.

No? -> figure out why and add the fix to your commit. It is your responsibility to make sure your code works.
My pull request is based on the latest changes of the master branch.

No? Unsure? -> execute command git pull --rebase upstream master

ibacher · 2024-08-08T21:30:36Z

cc: @jayasanka-sack @dkayiwa

ojwanganto · 2024-08-08T21:39:58Z

Thanks, @ibacher. Any work on performance is highly appreciated. Thanks for the clear description of the implementation approach

ibacher · 2024-08-13T17:09:11Z

I have no idea why the tests are failing here... they pass locally in every way I can figure out how to run them.

moshonk · 2024-09-08T22:22:21Z

@ibacher Thanks for this PR and the detailed description. Here's some feedback from my local tests.

The above changes have introduced a significantly high initial loading time for the module in my instance whenever it's run alongside the event module (tried 2.8.0 to 2.10.0). My 8GB M1 Mac laptop did not succeed in loading the module at all. See the error below:

INFO - ServiceContext.startRefreshingContext(821) |2024-09-08T19:00:29,387| Refreshing Context
INFO - EhCacheManagerFactoryBean.afterPropertiesSet(134) |2024-09-08T19:00:39,727| Initializing EhCache CacheManager
WARN - IdgenTask.run(32) |2024-09-08T19:11:09,803| Not running scheduled task. DaemonToken = null; enabled = false
WARN - BrokerService.checkSystemUsageLimits(1886) |2024-09-08T19:11:19,059| Store limit is 102400 mb, whilst the data directory: ../activemq-data/localhost/KahaDB only has 39771 mb of usable space
ERROR - BrokerService.checkSystemUsageLimits(1925) |2024-09-08T19:11:19,059| Temporary Store limit is 51200 mb, whilst the temporary data directory: ../activemq-data/localhost/tmp_storage only has 39771 mb of usable space

Reverting to the previous commit restores the initial(normal) loading time.

This reverts commit ca8bf98.

FM2-642: Improve performance of loading locations by tag

a845ea3

ibacher marked this pull request as draft August 8, 2024 21:30

ibacher added 2 commits August 9, 2024 08:23

Formatting changes

791664a

Attempt to fix failing build

e33bef4

ibacher marked this pull request as ready for review August 9, 2024 15:18

Attempt to fix tests

09c3199

ibacher merged commit ca8bf98 into master Sep 5, 2024
1 of 2 checks passed

ibacher deleted the FM2-642 branch September 5, 2024 10:56

ibacher added a commit that referenced this pull request Sep 17, 2024

Revert "FM2-642: Improve performance of loading locations by tag (#547)"

6842295

This reverts commit ca8bf98.

ibacher added a commit that referenced this pull request Oct 8, 2024

FM2-642: Improve performance of loading locations by tag (#547)

2a5d9f6

ibacher mentioned this pull request Oct 8, 2024

FM2-642: Improve performance of loading locations by tag (#547) #552

Merged

5 tasks

ibacher added a commit that referenced this pull request Oct 8, 2024

FM2-642: Improve performance of loading locations by tag (#547) (#552)

983b724

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FM2-642: Improve performance of loading locations by tag #547

FM2-642: Improve performance of loading locations by tag #547

ibacher commented Aug 8, 2024 •

edited

Loading

ibacher commented Aug 8, 2024

ojwanganto commented Aug 8, 2024

ibacher commented Aug 13, 2024

moshonk commented Sep 8, 2024

FM2-642: Improve performance of loading locations by tag #547

FM2-642: Improve performance of loading locations by tag #547

Conversation

ibacher commented Aug 8, 2024 • edited Loading

Description of what I changed

Issue I worked on

Checklist: I completed these to help reviewers :)

ibacher commented Aug 8, 2024

ojwanganto commented Aug 8, 2024

ibacher commented Aug 13, 2024

moshonk commented Sep 8, 2024

ibacher commented Aug 8, 2024 •

edited

Loading