ETL does not process certain files correctly #7

thecaffiend opened this issue Sep 11, 2024 · 0 comments
Contributor

Not entirely sure what's going on, as the two files look very similar (they are Word documents, so appearance means little), except for extra formatting in the "Dummy Data" file (e.g. extra newlines in the column headers and the alert date).

The file GPHL CRE Alert Example.docx is processed as expected and yields the expected clean CSV file. The file CRE Alert Dummy Data.docx yields a clean CSV whose first 2 rows appear to contain header data (the two rows are not identical, so it is not simply a duplicated row). As a result, downstream data catalog queries treat the header data as though it were CRE Alert data.
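One possible explanation, not confirmed, is that the extra newlines cause the header to be split across two table rows. The sketch below is not the project's ETL code. It only illustrates one way such rows might be detected and skipped, assuming the table has already been extracted as a list of rows of cell strings. The column labels in `EXPECTED_HEADERS` and all function names are hypothetical, since the real CRE Alert headers are not listed in this issue.

```python
import re

# Hypothetical column labels; the real CRE Alert headers are not given in this issue.
EXPECTED_HEADERS = ["Patient ID", "Organism", "Alert Date"]


def normalize_cell(text):
    """Collapse newlines and repeated whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def is_header_like(row, expected=EXPECTED_HEADERS):
    """Return True if every non-empty cell is a whole or partial expected header label."""
    labels = [h.lower() for h in expected]
    cells = [normalize_cell(c).lower() for c in row]
    cells = [c for c in cells if c]
    return bool(cells) and all(any(c in label for label in labels) for c in cells)


def drop_leading_header_rows(rows):
    """Skip header-like rows at the top, then return the normalized data rows."""
    start = 0
    while start < len(rows) and is_header_like(rows[start]):
        start += 1
    return [[normalize_cell(c) for c in row] for row in rows[start:]]


# Made-up rows imitating a header split across two rows by extra newlines.
sample_rows = [
    ["Patient\nID", "Organism", "Alert\n\nDate"],
    ["Patient", "", "Date"],
    ["123", "K. pneumoniae", "2024-09-01"],
]
clean_rows = drop_leading_header_rows(sample_rows)
```

If the actual cause turns out to be something else (for example merged cells or a nested table in the Word document), a check like this would only mask the symptom, so comparing the raw extracted rows from both files is probably the first thing to try.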

CRE Alert Dummy Data GPHL.docx
GPHL CRE Alert Example.docx
