I'd like to both anonymize my data and reduce the overall size of the database. Is there a mechanism that lets me specify a maximum number of records for a particular table and delete any records prior to that maximum set?
Unfortunately there is no way to do this currently, but it would be rather easy to modify the generator to handle row counts when processing the dump file. I'll see if I can find the time in the next couple of weeks to add this feature. I know we will need this eventually here at SmithRx as well.
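To make the idea of handling row counts while processing the dump file concrete, here is a minimal sketch. It is not the generator's actual code: it assumes a plain-text `pg_dump` file in which each table's data sits between a `COPY ... FROM stdin;` header and a `\.` terminator, and the function name `limit_rows` and the `limits` mapping are hypothetical. It keeps only the last N rows of each listed table, in whatever order the dump stores them.

```python
import re
from collections import deque

# Header that a plain-text pg_dump writes before a table's data (assumed format).
COPY_HEADER = re.compile(r"^COPY (\S+) .*FROM stdin;$")


def limit_rows(lines, limits):
    """Yield dump lines, keeping only the last limits[table] rows per listed table.

    `limits` is a hypothetical mapping such as {"public.users": 1000}.
    Tables that are not listed pass through unchanged.
    """
    table = None  # name of the table whose COPY block we are inside, if any
    kept = None   # bounded buffer holding rows of a limited table
    for line in lines:
        text = line.rstrip("\n")
        if table is None:
            match = COPY_HEADER.match(text)
            if match:
                table = match.group(1)
                if table in limits:
                    kept = deque(maxlen=limits[table])
            yield line
        elif text == "\\.":
            # End-of-data marker: emit the surviving rows, then the marker.
            if kept is not None:
                yield from kept
            table, kept = None, None
            yield line
        elif kept is not None:
            kept.append(line)  # older rows fall off once the buffer is full
        else:
            yield line
```

Because the rows are trimmed table by table, nothing here checks whether a dropped row is still referenced from another table.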
On another note, you can also reduce the size of your database by using the --exclude-table and --exclude-table-data options, which let you exclude full tables (do not include DDL) or just the table data (keep DDL, but ignore data in the table) from the dump process.
For example, we have a table that is denormalized when first added to our database. This table is very large and sparse until we process it and normalize the records. We choose to ignore this table's data during the dump process since we only care about the normalized data when testing.
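As an illustration of the two options, the sketch below assembles a `pg_dump` command line, since `pg_dump` itself accepts `--exclude-table` and `--exclude-table-data` with the semantics described above. Whether this tool forwards the flags unchanged is an assumption, and the helper name `build_dump_command` and the table names are hypothetical.

```python
def build_dump_command(dbname, exclude_tables=(), exclude_table_data=()):
    """Return an argv list for pg_dump with the requested exclusions.

    exclude_tables:     tables dropped entirely (no DDL, no data).
    exclude_table_data: tables whose DDL is kept but whose rows are skipped.
    """
    cmd = ["pg_dump", f"--dbname={dbname}"]
    cmd += [f"--exclude-table={name}" for name in exclude_tables]
    cmd += [f"--exclude-table-data={name}" for name in exclude_table_data]
    return cmd


# Hypothetical names: skip a scratch table entirely, keep only the schema
# of a large denormalized import table.
command = build_dump_command(
    "app",
    exclude_tables=["scratch"],
    exclude_table_data=["raw_import"],
)
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where `pg_dump` is installed.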
@mateodelnorte I'm looking into implementing this soon. Does the solution described above work for your use case?
If we limit tables by size, we will not be able to keep foreign keys consistent between tables. If you do not care about foreign keys remaining valid, the solution above should work.
Ensuring foreign key consistency with size limiting would mean rewriting a large portion of the generator, which may take a lot of time and would probably require a new product version release.
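The foreign key problem is easy to reproduce on a toy schema. The sketch below uses an in-memory SQLite database purely for illustration (the tool itself targets a real dump, and the `customers`/`orders` tables and the function name are hypothetical): it trims the parent table to its newest N rows and reports which child rows are left pointing at deleted parents.

```python
import sqlite3


def orphaned_rows_after_trim(max_parent_rows):
    """Trim `customers` to its newest rows and return ids of orphaned orders."""
    conn = sqlite3.connect(":memory:")
    # Leave enforcement off so the trim behaves like a naive per-table limit.
    conn.execute("PRAGMA foreign_keys = OFF")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO customers (id) VALUES (?)", [(i,) for i in range(1, 6)]
    )
    conn.executemany(
        "INSERT INTO orders (id, customer_id) VALUES (?, ?)",
        [(i, i) for i in range(1, 6)],
    )
    # Keep only the newest `max_parent_rows` customers (highest ids).
    conn.execute(
        "DELETE FROM customers WHERE id NOT IN "
        "(SELECT id FROM customers ORDER BY id DESC LIMIT ?)",
        (max_parent_rows,),
    )
    # Orders whose customer no longer exists are the broken references.
    orphans = conn.execute(
        "SELECT o.id FROM orders o "
        "LEFT JOIN customers c ON c.id = o.customer_id "
        "WHERE c.id IS NULL ORDER BY o.id"
    ).fetchall()
    conn.close()
    return [row[0] for row in orphans]
```

Keeping references valid would require trimming child tables based on which parent rows survive, which is where the extra complexity comes from.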
Thanks @junkert. I actually created a simple db-trim tool to do the same as is suggested above. Would be happy to use it as a part of this tool and not have to maintain mine. Overall, something that checks referential integrity would also be great. But, I'm sure you're thinking of that as well and recognize the increased complexity.
Thanks for this tool. It looks pretty great.