investigate and fix if needed glue classifier header row issue #106

Open
thecaffiend opened this issue Sep 10, 2024 · 0 comments

thecaffiend commented Sep 10, 2024

UPDATE 2024.09.11: Do not work on this issue yet. We are awaiting disposition of cape-ph/etl-gphl-cre-alert#7, as the problem appears to be in the ETL and not in the Glue crawler association.

ORIGINAL ISSUE TEXT BELOW
In a few demo instances, the header row of some cleaned CSV files (specifically those coming from gphl-cre alerts) has been treated as data in the Athena queries. This would imply that our crawler classifier is either not being used or not functioning correctly. The issue could also be in the data files or potentially somewhere else.
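One way to rule the data files in or out is to check locally whether a cleaned CSV contains its header row more than once, since a repeated header would show up as data in queries even if the classifier skips the first line. The following is a minimal sketch of such a check, not part of the existing ETL or infra code; the function name `find_repeated_header_rows` and the sample column names are hypothetical.

```python
import csv
import io


def find_repeated_header_rows(csv_text: str) -> list[int]:
    """Return 1-based row numbers (after row 1) that repeat the first row.

    Assumption: the first row of the file is the intended header. Cells are
    compared after trimming whitespace and lowercasing.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header = [cell.strip().lower() for cell in rows[0]]
    return [
        row_number
        for row_number, row in enumerate(rows[1:], start=2)
        if [cell.strip().lower() for cell in row] == header
    ]


# Hypothetical sample data; the real files are the ones from the last HAI demo.
sample = "id,name\n1,a\nid,name\n2,b\n"
repeated = find_repeated_header_rows(sample)
```

An empty result for the demo files would suggest that the files themselves are not duplicating the header, which would point back toward the classifier or some other stage.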

We recently tried to reproduce this with a data file that @thecaffiend uses regularly, and it did not occur. This investigation needs to use the data files from the last HAI demo (Peter has them).

If the investigation requires a change in the infra code, use this issue. If the problem is found elsewhere, we'll take appropriate action and close this issue with a comment.

2 participants