known data.table fread limitation #1

Open
Miachol opened this issue Dec 3, 2017 · 0 comments
Miachol commented Dec 3, 2017

In data.table's fread function, the skip parameter cannot take a value greater than 2500000000. If the database file has more than 2500000000 lines, you need to split the raw database file before building the SQLite database.

For example:

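# Shell: split the raw file into chunks of at most 2499999999 lines each
/usr/bin/split -l 2499999999 hg19_eigen.txt hg19_eigen.txt_split
# R: if the first 2499999999 lines have already been written to the SQLite file, you can start from "ab"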
for( i in c("aa", "ab", "ac", "ad")) {
  system(sprintf("mv hg19_eigen.txt_split%s hg19_eigen.txt", i))
  new.colnames <- c("#Chr", "Start", "End", "Ref", "Alt", "Eigen")
  annovarR::sqlite.auto.build('eigen', database.dir = './', append = TRUE, new.colnames = new.colnames)
  system(sprintf("mv hg19_eigen.txt hg19_eigen.txt_split%s", i))
}
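
To see which suffixes the loop above needs to iterate over, it can help to run split on a tiny file first. The following is only a small-scale sketch: the file name demo.txt, the 10-line input, and the chunk size of 3 are placeholders chosen for illustration and do not come from the real hg19_eigen.txt workflow.

```shell
#!/bin/sh
# Sketch: show how `split -l` names its output chunks (aa, ab, ac, ...).
# demo.txt and the chunk size 3 are illustrative assumptions, not real data.
set -eu

workdir=$(mktemp -d)

# 10-line stand-in for the large database file
seq 1 10 > "$workdir/demo.txt"

# Same call shape as in the issue: split -l <lines> <input> <output prefix>
split -l 3 "$workdir/demo.txt" "$workdir/demo.txt_split"

# Count the chunks that were produced
chunks=$(ls "$workdir" | grep -c '_split')
echo "chunks: $chunks"

# Print each chunk name with its line count
for f in "$workdir"/demo.txt_split*; do
  printf '%s %s\n' "$(basename "$f")" "$(wc -l < "$f" | tr -d ' ')"
done
```

With 10 lines and a chunk size of 3, this produces four chunks (aa, ab, ac, ad), the last one holding the single remaining line. In general the number of chunks is the line count divided by the chunk size, rounded up, so the suffix vector in the R loop should list exactly that many suffixes.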
@Miachol Miachol added the bug label Dec 3, 2017