Csvcat

csvcat is a very fast CSV file compiler (with column filtering) written in Go. Using concurrency, csvcat can concatenate and filter a huge number of files without losing too much in terms of memory and processing time.

On a dummy dataset generated by generate_set.py (100 files of 100000 lines each, around 1.4 GB), filtering 5 columns out of 10 takes csvcat the following times in concurrent and non-concurrent modes (on an i7-7820HQ (8)):

$ ./csvcat --columns "B,A,E,C,F" --delimiter "," --directory "csvset" --concurrency=true
Number of files found: 100
============ Total 3.933102857s ===================
$ ./csvcat --columns "B,A,E,C,F" --delimiter "," --directory "csvset" --concurrency=false
Number of files found: 100
============ Total 10.588019261s ===================
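
The difference between the two runs comes from handling several files at once. The sketch below shows one common way to do that in Go: a counting semaphore bounds how many goroutines run at a time, and results are stored by index so the output keeps the input order. It is only an illustration, not csvcat's actual code; `processAll`, its `limit` parameter, and the `work` function are hypothetical names, and `work` stands in for reading and filtering one file.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// processAll applies fn to every input using at most limit goroutines at a
// time and returns the results in input order. If any call fails, the error
// of the earliest failing input is returned once all goroutines are done.
// (Hypothetical helper, not taken from csvcat.)
func processAll(inputs []string, limit int, fn func(string) (string, error)) ([]string, error) {
	if limit < 1 {
		limit = 1
	}
	results := make([]string, len(inputs))
	errs := make([]error, len(inputs))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup

	for i, in := range inputs {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit goroutines are running
		go func(i int, in string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot
			// Each goroutine writes only to its own index, so no lock is needed.
			results[i], errs[i] = fn(in)
		}(i, in)
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}

func main() {
	// Stand-in for "read one CSV file and filter its columns".
	work := func(name string) (string, error) {
		return strings.ToUpper(name), nil
	}
	out, err := processAll([]string{"a.csv", "b.csv", "c.csv"}, 2, work)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(strings.Join(out, " "))
}
```

Storing results by index keeps the concatenated output deterministic even though the goroutines finish in an arbitrary order.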

Usage of `csvcat`

Usage of ./csvcat:
  -batch int
    	Batch size (default 30)
  -c	Set to false to ignore checking extension (default true)
  -columns string
    	Columns to be selected
  -concurrency
    	Set flag to disable concurrency (default true)
  -delimiter string
    	Csv delimiter of files (default ",")
  -directory string
    	Directory containing the files to be compilled (default ".")
  -output string
    	Output filename (default "output.csv")
  -v	Set to true to have verbose output
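
The usage text above has the layout produced by Go's standard `flag` package. As a rough sketch, assuming that package is used, most of these flags could be declared as follows. The `options` struct and `parseArgs` function are hypothetical, and `-c` and `-v` are left out for brevity.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// options is a hypothetical container for the flags shown in the usage text.
type options struct {
	batch       int
	columns     []string
	concurrency bool
	delimiter   string
	directory   string
	output      string
}

// parseArgs parses args (without the program name) into options.
func parseArgs(args []string) (options, error) {
	var o options
	var cols string
	fs := flag.NewFlagSet("csvcat", flag.ContinueOnError)
	fs.IntVar(&o.batch, "batch", 30, "Batch size")
	fs.StringVar(&cols, "columns", "", "Columns to be selected")
	fs.BoolVar(&o.concurrency, "concurrency", true, "Set to false to disable concurrency")
	fs.StringVar(&o.delimiter, "delimiter", ",", "Csv delimiter of files")
	fs.StringVar(&o.directory, "directory", ".", "Directory containing the files")
	fs.StringVar(&o.output, "output", "output.csv", "Output filename")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if cols != "" {
		o.columns = strings.Split(cols, ",") // "A,B,C" -> [A B C]
	}
	return o, nil
}

func main() {
	o, err := parseArgs(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```

With the `flag` package, a boolean flag has to be written as `-concurrency=false`. In `-concurrency false`, the flag is set to true and `false` is treated as the first positional argument, which also stops flag parsing for everything after it.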

Here's an example of how you might run csvcat with its flags:

./csvcat --batch 20 --columns "A,B,C" --delimiter "," --directory files

csvcat expects every CSV file to have a header on its first line in which all the columns are labeled, so that it can select the correct columns. If a CSV file is not correctly formatted (some lines have more or fewer columns than the header), it will try to add an empty column in the correct location.
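
To illustrate header-based selection, here is a minimal sketch using Go's `encoding/csv`. It maps the requested column names to header positions and copies those fields in the requested order. It is not csvcat's implementation: `headerIndex`, `pick`, and `filterCSV` are hypothetical names, and the handling of short rows is a simplifying assumption (a missing field is written as an empty value), which may differ from what csvcat does.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"os"
	"strings"
)

// headerIndex maps each wanted column name to its position in the header.
// It returns an error if a wanted column is not present.
func headerIndex(header, wanted []string) ([]int, error) {
	pos := make(map[string]int, len(header))
	for i, name := range header {
		pos[strings.TrimSpace(name)] = i
	}
	idx := make([]int, len(wanted))
	for i, name := range wanted {
		p, ok := pos[name]
		if !ok {
			return nil, fmt.Errorf("column %q not found in header", name)
		}
		idx[i] = p
	}
	return idx, nil
}

// pick returns the fields of record at the given positions, in order.
// Positions past the end of a short record yield empty strings.
func pick(record []string, idx []int) []string {
	out := make([]string, len(idx))
	for i, p := range idx {
		if p < len(record) {
			out[i] = record[p]
		}
	}
	return out
}

// filterCSV copies the wanted columns from r to w, header first.
func filterCSV(r io.Reader, w io.Writer, wanted []string, delim rune) error {
	cr := csv.NewReader(r)
	cr.Comma = delim
	cr.FieldsPerRecord = -1 // accept rows with too few or too many fields
	cw := csv.NewWriter(w)
	cw.Comma = delim

	header, err := cr.Read()
	if err != nil {
		return err
	}
	idx, err := headerIndex(header, wanted)
	if err != nil {
		return err
	}
	if err := cw.Write(wanted); err != nil {
		return err
	}
	for {
		rec, err := cr.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return err
		}
		if err := cw.Write(pick(rec, idx)); err != nil {
			return err
		}
	}
	cw.Flush()
	return cw.Error()
}

func main() {
	in := "A,B,C\n1,2,3\n4,5\n" // the last row is missing column C
	if err := filterCSV(strings.NewReader(in), os.Stdout, []string{"C", "A"}, ','); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Setting `FieldsPerRecord` to a negative value turns off the reader's check that every row has the same number of fields, which is what lets the sketch tolerate malformed rows instead of failing on them.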

Building `csvcat`

To build csvcat you need to run:

go build .
# or
go install .
