Race condition between receiving TABLE_MAP and querying Schema info #12

Open
banks opened this issue Apr 28, 2015 · 0 comments

banks commented Apr 28, 2015

Hi

This may or may not be regarded as a valid issue for you, but I've tried building a very similar solution to this in the past and this is one issue I found that you don't seem to have addressed.

Per the README, each time you see a TABLE_MAP event indicating a new table (for this binlog), you look up schema data from the master you are replicating from in order to decode the keys, types, etc. for further modifications.

This only works if your replication is always up to date. Specifically, the current master's schema must be identical to the schema at the time the binlog event you are processing was written.

That assumption might hold most of the time, but it makes the solution fragile: any outage or lag that has to be caught up significantly increases the risk that the schema you query from the master is not the schema the current event was actually written under. It also means you can't easily bootstrap from older logs, or replay events after finding a bug from a couple of days ago, even if you still have the logs around (assuming any schema changed in the meantime).
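To make the failure mode concrete, here is a minimal sketch. It assumes row events carry values by position without column names, which is why the schema lookup is needed in the first place. The table, the column names, and the `decode_row` helper are all made up for illustration and are not taken from this project.

```python
# Hypothetical illustration: all names and data here are invented for this sketch.

def decode_row(values, columns):
    """Pair positional row values with column names; fail loudly on a count mismatch."""
    if len(values) != len(columns):
        raise ValueError(
            f"row has {len(values)} values but schema has {len(columns)} columns"
        )
    return dict(zip(columns, values))


# Schema the row was written under, versus what the master reports after a
# later "DROP COLUMN email" followed by "ADD COLUMN phone".
schema_at_write_time = ["id", "email"]
schema_on_master_now = ["id", "phone"]

row_values = [42, "a@example.com"]

correct = decode_row(row_values, schema_at_write_time)
stale = decode_row(row_values, schema_on_master_now)  # no error, wrong column name
```

A column-count mismatch can at least be detected. The case above cannot, because the counts still match, so the email value is silently labelled as `phone`.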

My theoretical solution (it never got past a toy, sadly) was to run a local MySQL instance colocated with the log parser and replay any CREATE, ALTER, DROP, etc. queries on it. We don't replicate the actual data, but we do have an identical version of the table structure as it was at the time of the current binlog event. Any metadata about structure can then be queried from that instance and will be correct with respect to the current position in the log.
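A rough sketch of that idea follows. To keep it self-contained it uses Python's built-in `sqlite3` as a stand-in for the colocated MySQL instance, which is an assumption of this example only (real MySQL DDL would not all parse in SQLite). The `ShadowSchema` class and its method names are hypothetical.

```python
import sqlite3

# Statement prefixes treated as schema changes; everything else is skipped.
DDL_PREFIXES = ("CREATE", "ALTER", "DROP")


class ShadowSchema:
    """Structure-only shadow database (SQLite standing in for a local MySQL)."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")

    def apply_query(self, sql):
        """Replay a query event only if it is DDL; return True if it was applied."""
        if sql.lstrip().upper().startswith(DDL_PREFIXES):
            self.db.execute(sql)
            return True
        return False

    def columns(self, table):
        """Return (name, declared type) pairs for the table as of the replay position."""
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        rows = self.db.execute(f"PRAGMA table_info({table})").fetchall()
        return [(r[1], r[2]) for r in rows]


shadow = ShadowSchema()
shadow.apply_query("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
data_applied = shadow.apply_query("INSERT INTO users VALUES (1, 'a@example.com')")
before = shadow.columns("users")  # schema as seen by events before the ALTER

shadow.apply_query("ALTER TABLE users ADD COLUMN name TEXT")
after = shadow.columns("users")   # schema as seen by events after the ALTER
```

Because the shadow only ever advances in binlog order, looking up columns on a TABLE_MAP event reflects the schema at that point in the log, however far behind the master the parser is.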

Mostly I'm posting this to share, but partly because I'm interested to know if this has been an issue for you in practice.

I assume given the effort put in here that you are using this for some "real life" production system so I wondered if you have a plan to mitigate that issue?

A related one is how to handle MySQL failover correctly - if you have any insights on that I'd be very interested to hear.
