FileSource Copy Truncate #18

Open
matthayter opened this issue Jun 23, 2016 · 1 comment

Comments

@matthayter

_Copied from the groupon internal project before OSS occurred_

The copy operation invalidates the file handle held by Tailer. When the file is truncated, the Tailer sees that the file is newer and attempts to read from the old file using the now-invalid handle.
If the new file were smaller than the read position in the old file, this would seem to work, except that any additional writes to the old file in the last period would be lost, as per MAI-187. However, even once MAI-187 is fixed, copy truncate remains a problem in Tailer specifically: the old file handle is invalidated on copy, which prevents reading the additional writes from the last period.
I'm not sure what the fix is. Perhaps we should not support copy truncate, in which case we should make that very clear. However, some further thought may reveal a solution.
WARNING: Although the Java client is not using copy truncate, the Ruby and Node clients may already be doing so.
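
For illustration, the size heuristic mentioned above (the new file being smaller than the previous read position) could look roughly like the sketch below. The class and method names are hypothetical and are not part of the project's Tailer; the sketch only shows the comparison itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of the project's Tailer API. */
final class TruncationCheck {
    private TruncationCheck() {}

    /**
     * Returns true when the file is currently shorter than the offset
     * the reader has already consumed, which suggests it was truncated.
     */
    static boolean looksTruncated(final Path file, final long readPosition) throws IOException {
        return Files.size(file) < readPosition;
    }
}
```

This check is only a heuristic. If the truncated file grows past the old read position before the next check, the truncation goes unnoticed, and it does nothing to recover writes that landed between the last read and the copy.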

Ville comments:
The new implementation of StatefulTailer is not able to address this issue. It remains captured in the FileSourceTest and is now also documented as part of the StatefulTailerTest. It seems unlikely that we will support copy-truncate moving forward. There are possible strategies, such as using a file system watcher, although this may still require an understanding of the rotation scheme.

Here are two possible starting points:

  • Performing an inode comparison
  • Using a file system watcher library

Although these may still require knowledge of the rotation scheme being used.
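
As a rough starting point for the inode comparison, the JDK exposes a file identity through `BasicFileAttributes.fileKey()`, which on typical Unix file systems is derived from the device ID and inode. The sketch below is an assumption about how such a check might be written; `FileIdentity`, `keyOf`, and `wasReplaced` are hypothetical names and not part of the existing Tailer or FileSource.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper for illustration; not part of the project's code. */
final class FileIdentity {
    private FileIdentity() {}

    /** Returns the platform file key for the path, or null if the platform has none. */
    static Object keyOf(final Path file) throws IOException {
        return Files.readAttributes(file, BasicFileAttributes.class).fileKey();
    }

    /**
     * Returns true when the path now refers to a different underlying file
     * than the one identified by previousKey. Returns false when either key
     * is unavailable, because no conclusion can be drawn in that case.
     */
    static boolean wasReplaced(final Object previousKey, final Path file) throws IOException {
        final Object currentKey = keyOf(file);
        if (previousKey == null || currentKey == null) {
            return false;
        }
        return !previousKey.equals(currentKey);
    }
}
```

Note that copy-truncate leaves the live file's inode unchanged, so an unchanged key combined with a smaller file size points at truncation in place, while a changed key points at rename-style rotation. Telling these apart is a form of the rotation-scheme knowledge mentioned above.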
@vjkoskela
Member

One of @BrandonArp's coworkers had some ideas on how to refactor the tailer and file source. These changes may make this issue easier to address.

However, the MAD project is actively moving away from file-based input sources, so it's unlikely that this will see much traction. If you, @matthayter (or anyone else coming across this), have a use case for this, you may need to take this work on.

Happy to review and provide feedback on plans and pull requests.
