known data.table fread limitation #1

Miachol · 2017-12-03T12:05:54Z

In data.table's fread function, the skip parameter cannot accept values greater than 2500000000. If the database file has more than 2500000000 lines, you need to split the raw database file into smaller chunks first.

For example:

/usr/bin/split -l 2499999999 hg19_eigen.txt hg19_eigen.txt_split

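# If the first 2499999999 lines have already been written to the SQLite file, you can start from "ab"
new.colnames <- c("#Chr", "Start", "End", "Ref", "Alt", "Eigen")
for (i in c("aa", "ab", "ac", "ad")) {
  # Temporarily give the current chunk the file name that sqlite.auto.build expects
  system(sprintf("mv hg19_eigen.txt_split%s hg19_eigen.txt", i))
  # Append this chunk to the existing SQLite database
  annovarR::sqlite.auto.build('eigen', database.dir = './', append = TRUE, new.colnames = new.colnames)
  # Restore the chunk's original name before moving on to the next one
  system(sprintf("mv hg19_eigen.txt hg19_eigen.txt_split%s", i))
}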
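
To see how split names its output chunks (the "aa", "ab", "ac" suffixes used in the loop above), here is a minimal scaled-down sketch. The file name demo.txt, the 10-line input, and the chunk size of 4 are made up for illustration; only the `split -l` behaviour carries over to the real hg19_eigen.txt case.

```shell
# Scaled-down illustration: demo.txt and the chunk size of 4 are
# hypothetical stand-ins for hg19_eigen.txt and 2499999999.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 10-line input file.
seq 1 10 > demo.txt

# Split into chunks of at most 4 lines; split appends the suffixes
# aa, ab, ac, ... to the given prefix.
split -l 4 demo.txt demo.txt_split

# List the chunks and count the lines in each one.
for chunk in demo.txt_split*; do
  echo "$chunk: $(wc -l < "$chunk" | tr -d ' ') lines"
done
```

The last chunk holds whatever remains after the full-size chunks, so concatenating the chunks in suffix order reproduces the original file.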

Miachol added the bug label Dec 3, 2017
