-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-47770: Fix threadsafety of sqlalchemy MetaData access #1132
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
✅ All tests successful. No failed tests found.
Additional details and impacted files@@ Coverage Diff @@
## main #1132 +/- ##
=======================================
Coverage 89.50% 89.50%
=======================================
Files 366 366
Lines 48998 49011 +13
Branches 5943 5941 -2
=======================================
+ Hits 43855 43867 +12
Misses 3729 3729
- Partials 1414 1415 +1 ☔ View full report in Codecov by Sentry. |
e3b4297
to
56f4854
Compare
0180196
to
4736b46
Compare
Wrap all accesses to sqlalchemy.MetaData with a lock to avoid concurrency issues. sqlalchemy.MetaData is documented to be threadsafe for reads, but not with concurrent modifications. We add tables dynamically at runtime, and the MetaData object is shared by all Database instances sharing the same connection pool. Prior to adding the lock, Butler server database calls that added table definitions dynamically were sometimes failing with InvalidRequestError exceptions complaining about inconsistency of table definitions.
Fix an issue where a test for equality of filenames in the system temp directory were failing. On MacOS, /private/tmp and /tmp refer to the same directory.
4736b46
to
4c0370c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general, but I'm not sure that name mangling is done consistently.
@@ -164,13 +165,9 @@ def addTable(self, name: str, spec: ddl.TableSpec) -> sqlalchemy.schema.Table: | |||
to be declared in any order even in the presence of foreign key | |||
relationships. | |||
""" | |||
name = self._db._mangleTableName(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused, do you want DatabaseMetadata
to handle table name mangling internally? I do not think it is done consistently now, you call add_table
with the original name, but get_table
is called with mangled name?
@@ -1192,7 +1189,7 @@ def getExistingTable(self, name: str, spec: ddl.TableSpec) -> sqlalchemy.schema. | |||
""" | |||
assert self._metadata is not None, "Static tables must be declared before dynamic tables." | |||
name = self._mangleTableName(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this one needs to be dropped too?
@@ -1206,10 +1203,7 @@ def getExistingTable(self, name: str, spec: ddl.TableSpec) -> sqlalchemy.schema. | |||
) | |||
if name in inspector.get_table_names(schema=self.namespace): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, but this won't work if name is not mangled?
Wrap all accesses to sqlalchemy.MetaData with a lock to avoid concurrency issues.
sqlalchemy.MetaData is documented to be threadsafe for reads, but not with concurrent modifications. We add tables dynamically at runtime, and the MetaData object is shared by all Database instances sharing the same connection pool.
Prior to adding the lock, Butler server database calls that added table definitions dynamically were sometimes failing with InvalidRequestError exceptions complaining about inconsistency of table definitions.
Checklist
doc/changes
configs/old_dimensions