Allow readCSV to read a folder of CSVs #826

Closed
Jolanrensen opened this issue Aug 20, 2024 · 1 comment

Comments

@Jolanrensen
Collaborator

Jolanrensen commented Aug 20, 2024

Reading an entire folder of CSV files in one call is common practice in Python, Excel, and other data-wrangling environments.

Currently, DataFrame.readCSV("path/to/directory") returns an odd DataFrame with a single column that lists the filenames in the directory, with the first filename used as the column name (see #508).

What should happen is something like:

Path("path/to/dir").listDirectoryEntries("*.csv").map {
    DataFrame.readCSV(it.toFile())
}.concat()

This should work in the Gradle/KSP/compiler plugin too.
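
For illustration, the snippet above could be wrapped in a small helper. This is only a sketch: `readCsvFolder` and its `glob` parameter are hypothetical names, not part of the Kotlin DataFrame API, and it assumes a library version in which `DataFrame.readCSV(File)` is available. Sorting the entries and rejecting an empty match are design assumptions added here, not behavior proposed in the issue.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.concat
import org.jetbrains.kotlinx.dataframe.io.readCSV
import java.nio.file.Path
import kotlin.io.path.isDirectory
import kotlin.io.path.listDirectoryEntries

// Hypothetical helper (not a library function): reads every file in `dir`
// that matches `glob` and concatenates the results into one DataFrame.
fun readCsvFolder(dir: Path, glob: String = "*.csv"): AnyFrame {
    require(dir.isDirectory()) { "Not a directory: $dir" }

    // Sort so the row order does not depend on the file system's listing order.
    val files = dir.listDirectoryEntries(glob).sorted()
    require(files.isNotEmpty()) { "No files matching '$glob' in $dir" }

    return files
        .map { DataFrame.readCSV(it.toFile()) } // one DataFrame per file
        .concat()                               // stack them row-wise
}
```

It would be called as `readCsvFolder(Path("path/to/dir"))`, with `Path` imported from `kotlin.io.path`.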

@Jolanrensen Jolanrensen added the bug, enhancement, and csv labels Aug 20, 2024
@Jolanrensen Jolanrensen added this to the Backlog milestone Aug 20, 2024
@Jolanrensen Jolanrensen self-assigned this Aug 20, 2024
@Jolanrensen Jolanrensen mentioned this issue Aug 20, 2024
24 tasks
@Jolanrensen
Collaborator Author

I must have misremembered; reading multiple CSVs also takes multiple steps in, say, pandas: https://saturncloud.io/blog/how-to-import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe/

@Jolanrensen Jolanrensen closed this as not planned Oct 4, 2024