@wavemoran brought up a good point today during a quick sprint demo of the project.
What would happen if we swapped the loader and processor server information in the configuration between the processing stage and the loading stage (equivalent to the dump, processor, and loader all having identical server, database, and credentials set)? In that case we would be taking a dump file from the same server we are loading into (same database, schemas, all the things). This would cause issues if the dump and load servers were both production servers 😱. We do, however, make sure to NEVER run a DROP DATABASE in any execution path of the load command. Instead, we do an atomic rename of the primary database we are loading into by adding a timestamp to its name. Once the primary database carries the timestamped name, we can rename the anonymized database to the primary name.
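To make the rename-instead-of-drop idea concrete, here is a minimal sketch that only builds the two rename statements. It assumes PostgreSQL-style `ALTER DATABASE ... RENAME TO` syntax and a `<name>_<timestamp>` naming scheme; the function names and the timestamp format are hypothetical and not taken from the project.

```python
from datetime import datetime, timezone
from typing import List, Optional


def quote_ident(name: str) -> str:
    # Double-quote an identifier, escaping embedded double quotes.
    return '"' + name.replace('"', '""') + '"'


def rename_swap_statements(
    primary: str, anonymized: str, now: Optional[datetime] = None
) -> List[str]:
    # Hypothetical helper: archive the primary database under a timestamped
    # name, then give the anonymized database the primary name.
    # No DROP DATABASE is ever generated.
    now = now or datetime.now(timezone.utc)
    archived = f"{primary}_{now.strftime('%Y%m%d%H%M%S')}"
    return [
        f"ALTER DATABASE {quote_ident(primary)} RENAME TO {quote_ident(archived)};",
        f"ALTER DATABASE {quote_ident(anonymized)} RENAME TO {quote_ident(primary)};",
    ]
```

Because the old primary survives under the timestamped name, a mistaken load can be undone by renaming it back.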
We can fix this by having the pre-processor record metadata such as hostname, port, and database. The load command can then read this data and verify that the server we are loading into is not the same server we dumped from.
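One possible shape for this check, as a rough sketch: the metadata is written as a single comment line at the top of the processed file and compared against the load target. The `-- dump-source:` marker, the JSON encoding, and all names below are assumptions for illustration, not the project's actual format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

# Hypothetical marker for the metadata line at the top of a processed file.
HEADER_PREFIX = "-- dump-source: "


@dataclass(frozen=True)
class DumpSource:
    hostname: str
    port: int
    database: str


def write_header(source: DumpSource) -> str:
    # Serialize the dump source as one SQL comment line.
    return HEADER_PREFIX + json.dumps(asdict(source), sort_keys=True) + "\n"


def read_header(first_line: str) -> Optional[DumpSource]:
    # Return the recorded dump source, or None if the line is not a header.
    if not first_line.startswith(HEADER_PREFIX):
        return None
    return DumpSource(**json.loads(first_line[len(HEADER_PREFIX):]))


def is_same_target(source: DumpSource, target: DumpSource) -> bool:
    # Naive comparison: case-insensitive hostname, exact port and database.
    return (source.hostname.lower(), source.port, source.database) == (
        target.hostname.lower(),
        target.port,
        target.database,
    )
```

Note that a plain string comparison like this would not catch DNS aliases or an IP address that points at the same server, so it is a guard against configuration mistakes rather than a guarantee.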
We should also add the ability to --disable this check, just in case someone has a use case where dumping from and loading into the same resource is valid. By default we should not allow this behavior.
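A safe-by-default opt-out could look roughly like the sketch below. The flag name `--disable-source-check` and both function names are hypothetical; the comment only says the check should be possible to disable and should be on by default.

```python
import argparse


def build_load_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for the load command; the check is on by default.
    parser = argparse.ArgumentParser(prog="load")
    parser.add_argument(
        "--disable-source-check",
        action="store_true",
        help="allow loading into the same server the dump was taken from",
    )
    return parser


def should_abort_load(same_server: bool, check_disabled: bool) -> bool:
    # Abort only when source and target match and the user has not opted out.
    return same_server and not check_disabled
```

With no flag given, `should_abort_load(True, args.disable_source_check)` is true, so a same-server load is refused unless the user explicitly opts out.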
junkert changed the title from "Add checksumming to dump and processed files" to "Add dump server metadata to top of processed files" on Mar 12, 2019