Issue with some categorical variables from UKB main dataset #7

Open
olivierlabayle opened this issue May 8, 2023 · 0 comments
Labels
bug Something isn't working

olivierlabayle commented May 8, 2023

Some categorical variables are correctly written as strings but later interpreted as integers by the CSV.jl parser. For now, the workaround is to declare each individual category as an output phenotype while this issue is being processed on the CSV.jl side (JuliaData/CSV.jl#1086).

Example: ethnicity has values 1001, 2001, etc. These are converted to integers, which does not make sense for category codes.

If this is not dealt with on the CSV.jl side, the only option is to forward the column types as a header. Alternatively, Arrow.jl may deal with it better and we could rely entirely on it in the pipeline.
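
To illustrate the behaviour and one possible way of forwarding column types, here is a minimal sketch. The inline data and the `eid` column are made up for illustration, and passing the types through the `types` keyword of `CSV.File` is only one option, not necessarily how the pipeline would end up doing it.

```julia
using CSV

# Hypothetical extract: coded categories stored as digit-only strings.
data = """
eid,ethnicity
1,1001
2,2001
3,1001
"""

# With default type detection, the codes are parsed as integers.
inferred = CSV.File(IOBuffer(data))
println(eltype(inferred.ethnicity))

# Declaring the column type up front keeps the codes as strings.
typed = CSV.File(IOBuffer(data); types=Dict(:ethnicity => String))
println(eltype(typed.ethnicity))
```

The `types` dictionary would have to be built from whatever metadata records which fields are categorical, which is what "forwarding the column types" amounts to here.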

olivierlabayle added the bug label on May 8, 2023