This repository provides easy access to three types of data:
- Tiny datasets provided within this repository.
- Open source datasets hosted on AWS S3.
- Programs to create datasets on demand.
The datasets are used for learning and experimenting, and to support blog posts and tech talks about CrateDB.
You will find datasets in CSV, JSON and Parquet formats within different subdirectories. Descriptions about them can be found in corresponding README files.
Crate.io hosts several public datasets in the AWS S3 bucket crate.sampledata
,
that you can easily query when experimenting with CrateDB.
https://s3.amazonaws.com/crate.sampledata/
Datasets used in courses at the CrateDB Academy can be found in the academy
subdirectory.
Datasets used in developer relations talks, workshops and other projects can be found in the devrel
subdirectory.
Before adding files to this repository, please make sure to install Git LFS beforehand. On macOS, it works like this:
brew install git-lfs
git lfs install
No special commands are needed other than setting it up once per machine and
configuring the .gitattributes
file once per repository. Git LFS blends well
into the regular git add
/ git commit
workflow.
Before adding files, please check the .gitattributes
file, which lists all the
file types assigned to be handled by Git LFS. If you are adding a file of another
type, make sure it is listed there.