DM-47948: Preload dataset type cache for Butler server #1125
Codecov Report
Attention: Patch coverage is
✅ All tests successful. No failed tests found.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1125      +/-   ##
==========================================
+ Coverage   89.46%   89.49%   +0.02%
==========================================
  Files         366      366
  Lines       48781    48801      +20
  Branches     5908     5900       -8
==========================================
+ Hits        43644    43676      +32
+ Misses       3721     3718       -3
+ Partials     1416     1407       -9
af520b8 to 90f060b
DatasetTypeCache is only used in a single file with a specific set of types, so it no longer needs to be generic. Upcoming changes to make it thread-safe and cloneable will be easier to reason about with concrete types.
In preparation for sharing DatasetTypeCache between threads, make its inner DynamicTables values immutable. The mutable portion has moved to a separate cache inside DatasetTypeCache. As a side effect, this reduces the number of times we go to the DB to check for the existence of tag and calib tables.
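A minimal sketch of the split this commit describes, using hypothetical names (`TableNames`, `TableLookupCache`, `_load_table`) rather than the actual daf_butler classes. The per-dataset-type value is a frozen dataclass that is safe to share between threads, while the objects that require a database check live in a separate lock-protected cache, so each table is looked up only once.

```python
from __future__ import annotations

import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class TableNames:
    """Immutable description of the tables for one set of dimensions."""

    tags_name: str
    calibs_name: str | None = None


class TableLookupCache:
    """Mutable, lock-protected cache kept apart from the immutable values."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._tables: dict[str, object] = {}
        self.db_lookups = 0  # Counts simulated trips to the database.

    def get_or_load(self, name: str) -> object:
        with self._lock:
            if name not in self._tables:
                self.db_lookups += 1
                self._tables[name] = self._load_table(name)
            return self._tables[name]

    def _load_table(self, name: str) -> object:
        # Hypothetical stand-in for checking that the table exists in the DB.
        return {"name": name}


names = TableNames(tags_name="dataset_tags_0001")
cache = TableLookupCache()
first = cache.get_or_load(names.tags_name)
second = cache.get_or_load(names.tags_name)  # Served from memory.
```

Because `TableNames` never changes after construction, copies of the outer cache can share the same instances without further locking; only `TableLookupCache` needs a lock.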
This cache is used only by a single manager, and has never participated in the caching enable/disable logic associated with CachingContext. Getting it out of CachingContext encapsulates its creation and use within a single manager class, simplifying upcoming changes. This also removes some unused branches from the code.
Pre-fetch dataset types the first time a repository is accessed in Butler server, to avoid the need to re-fetch them in most later operations.
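The preload idea can be sketched roughly as follows. This is an illustration under assumptions, not the server code: `PreloadedDatasetTypeCache` and `fake_fetch_all` are hypothetical, and dataset types are reduced to name/dimension strings.

```python
from __future__ import annotations

import threading
from collections.abc import Callable, Iterable


class PreloadedDatasetTypeCache:
    """Fetch every dataset type once, on first use, then answer from memory."""

    def __init__(self, fetch_all: Callable[[], Iterable[tuple[str, str]]]) -> None:
        self._fetch_all = fetch_all
        self._lock = threading.Lock()
        self._by_name: dict[str, str] | None = None

    def get(self, name: str) -> str | None:
        with self._lock:
            if self._by_name is None:
                # First access: one bulk query instead of one query per lookup.
                self._by_name = dict(self._fetch_all())
            return self._by_name.get(name)


queries: list[str] = []


def fake_fetch_all() -> list[tuple[str, str]]:
    # Hypothetical stand-in for a single "fetch all dataset types" query.
    queries.append("fetch all dataset types")
    return [("raw", "instrument,exposure,detector"), ("bias", "instrument,detector")]


cache = PreloadedDatasetTypeCache(fake_fetch_all)
raw_dims = cache.get("raw")
bias_dims = cache.get("bias")
missing = cache.get("no_such_type")
```

In this sketch a name that is absent after the preload simply returns `None`; a real server would still need a fallback query for types registered after the preload.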
90f060b to d15e0bc
Trigger dataset type preload the first time a connection is made to the Butler server in each unit test, to better match the conditions that will exist in the actual server.
Fix a bug where loading a dataset type registered externally after the Butler had loaded a "full" dataset type cache would cause an assertion failure "Dataset type cache population is incomplete" due to only filling in one of the two caches.
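The class of bug fixed here can be illustrated with a toy two-map cache. Everything below is hypothetical (`TwoLevelCache`, `_table_name`, and the shape of the stored values are assumptions); it only shows why an entry added after a "full" load has to be written to both maps at once.

```python
from __future__ import annotations


def _table_name(dims: frozenset[str]) -> str:
    # Hypothetical naming scheme for the per-dimensions tables.
    return "tables_for_" + "_".join(sorted(dims))


class TwoLevelCache:
    """Toy cache with a by-name map and a by-dimensions map that must agree."""

    def __init__(self) -> None:
        self._by_name: dict[str, frozenset[str]] = {}
        self._by_dimensions: dict[frozenset[str], str] = {}
        self.full = False

    def set_full(self, entries: dict[str, frozenset[str]]) -> None:
        self._by_name = dict(entries)
        self._by_dimensions = {dims: _table_name(dims) for dims in entries.values()}
        self.full = True

    def add(self, name: str, dims: frozenset[str]) -> None:
        # Fill both maps together; updating only one of them would trip the
        # consistency assertion in get() once the cache is marked full.
        self._by_name[name] = dims
        self._by_dimensions.setdefault(dims, _table_name(dims))

    def get(self, name: str) -> tuple[frozenset[str], str] | None:
        dims = self._by_name.get(name)
        if dims is None:
            return None
        tables = self._by_dimensions.get(dims)
        assert tables is not None, "Dataset type cache population is incomplete"
        return dims, tables


cache = TwoLevelCache()
cache.set_full({"raw": frozenset({"instrument", "exposure"})})
# A dataset type registered externally after the full load.
cache.add("late_type", frozenset({"instrument", "visit"}))
result = cache.get("late_type")
```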
3c649f7 to 096c8cf
Copy some notes from the comments on DM-42317, and update them for the changes to dataset type caching.
Looks great, a couple of minor suggestions.
python/lsst/daf/butler/registry/datasets/byDimensions/_dataset_type_cache.py
@@ -449,10 +451,17 @@ def makeCalibTableSpec(
return tableSpec


DynamicTablesCache: TypeAlias = ThreadSafeCache[str, sqlalchemy.Table]
Hmm, the name sounds like it is a cache for DynamicTables instances, maybe just call it TableCache or similar?
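For readers unfamiliar with the pattern behind the alias, here is a rough sketch of a keyed thread-safe cache and an alias named for what it stores. `SimpleThreadSafeCache` is a hypothetical stand-in, not the daf_butler `ThreadSafeCache`, and the values are plain strings instead of `sqlalchemy.Table` objects to keep the example dependency-free.

```python
from __future__ import annotations

import threading
from typing import Generic, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class SimpleThreadSafeCache(Generic[K, V]):
    """Dictionary guarded by a lock; existing entries are never overwritten."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: dict[K, V] = {}

    def get(self, key: K) -> V | None:
        with self._lock:
            return self._data.get(key)

    def set_or_get(self, key: K, value: V) -> V:
        """Store ``value`` unless ``key`` is present; return the stored value."""
        with self._lock:
            return self._data.setdefault(key, value)


# Alias named after its values (tables), following the reviewer's suggestion.
TableCache = SimpleThreadSafeCache[str, str]

cache = TableCache()
winner = cache.set_or_get("dataset_tags_0001", "table-A")
loser = cache.set_or_get("dataset_tags_0001", "table-B")  # Keeps "table-A".
```

Returning the already-stored value from `set_or_get` means two threads racing to insert the same key both end up holding the same object.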
@@ -438,22 +427,15 @@ def getDatasetRef(self, id: DatasetId) -> DatasetRef | None:
run = row[self._run_key_column]
record = self._record_from_row(row)
dynamic_tables: DynamicTables | None = None
This declaration is not needed any more?
Co-authored-by: Andy Salnikov <[email protected]>
We now preload the dataset type cache the first time a repository is accessed in Butler server. This is an optimization to avoid needing to go to the database every time we need the definition of a dataset type.
In preparation for this change, DatasetTypeCache was tweaked to make it non-generic, and to make all inner cached values immutable.
Checklist
doc/changes
configs/old_dimensions