The full paper is available in Releases. Its source is report.typ, which you can compile with Typst.
- Databases
- Programming Language and lib
- Host
npm run start [-- [options]]
-t, --type <type> type of dataset ["inat2017", "random", "grid", "cluster"] (default: "inat2017")
-c, --count <count> number of data points (default: "100000")
-r, --repeat <count> number of repeat (default: "1")
By default, this benchmark uses iNaturalist 2017's Fine Grained Geolocation Datasets (visipedia/fg_geo), which contain 654,818 geolocation records.
The file is located at datasets/inat2017/inat2017_file_name_to_geo.csv and includes a header row.
Format:
filename,latitude,longitude
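As an illustration of the format above, here is a minimal sketch of a loader. The names `GeoPoint` and `parseGeoCsv` are hypothetical and not taken from this repository. The sketch assumes that fields contain no quoted commas and that rows whose coordinates do not parse can be skipped.

```typescript
// One record of the geolocation file (hypothetical type name).
interface GeoPoint {
  filename: string;
  latitude: number;
  longitude: number;
}

// Parse CSV text that starts with a `filename,latitude,longitude` header row.
// Assumption: no quoted fields and no commas inside filenames.
function parseGeoCsv(text: string): GeoPoint[] {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  const points: GeoPoint[] = [];
  // Skip the header row at index 0.
  for (const line of lines.slice(1)) {
    const [filename = "", lat = "", lon = ""] = line.split(",");
    const latitude = Number.parseFloat(lat);
    const longitude = Number.parseFloat(lon);
    // Assumption: rows with missing or non-numeric coordinates are dropped.
    if (Number.isNaN(latitude) || Number.isNaN(longitude)) continue;
    points.push({ filename, latitude, longitude });
  }
  return points;
}

// Usage with made-up rows (not real dataset content).
const sample = "filename,latitude,longitude\na.jpg,10.5,-20.25\nb.jpg,,\n";
const parsed = parseGeoCsv(sample);
```

For the full 654,818-row file, a streaming reader would likely be preferable to holding the whole text in memory.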
We also provide three other datasets that are generated at runtime:
- Random
  - Points are placed entirely by a random number generator.
- Grid
  - Points are spaced evenly around the Earth.
  - Uses the Fibonacci sphere algorithm.
- Cluster
  - Every 50 points are placed together, each with a small offset, to form a cluster. The clusters themselves are distributed randomly.
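The grid and cluster generators described above can be sketched roughly as follows. This is not the repository's implementation. The function names, the 0.01 degree offset, and the choice of area-uniform random cluster centers are all assumptions made for illustration.

```typescript
interface LatLon {
  latitude: number;  // degrees, -90..90
  longitude: number; // degrees, -180..180
}

// Fibonacci sphere: step the z coordinate evenly and rotate the longitude
// by the golden angle at each step, giving near-even spacing on the sphere.
function fibonacciSphere(count: number): LatLon[] {
  const goldenAngleDeg = 180 * (3 - Math.sqrt(5)); // about 137.5 degrees
  const points: LatLon[] = [];
  for (let i = 0; i < count; i++) {
    const z = 1 - (2 * i + 1) / count; // strictly between -1 and 1
    const latitude = (Math.asin(z) * 180) / Math.PI;
    const longitude = ((i * goldenAngleDeg) % 360) - 180;
    points.push({ latitude, longitude });
  }
  return points;
}

// Random point, uniform by surface area (assumption; the original
// generator may instead sample latitude and longitude uniformly).
function randomPoint(rng: () => number): LatLon {
  return {
    latitude: (Math.asin(2 * rng() - 1) * 180) / Math.PI,
    longitude: rng() * 360 - 180,
  };
}

// Clusters: every `clusterSize` points share one random center, and each
// point is shifted by at most `offsetDeg` degrees (hypothetical default).
function clusterPoints(
  count: number,
  clusterSize = 50,
  offsetDeg = 0.01,
  rng: () => number = Math.random,
): LatLon[] {
  const points: LatLon[] = [];
  let center: LatLon = { latitude: 0, longitude: 0 };
  for (let i = 0; i < count; i++) {
    if (i % clusterSize === 0) center = randomPoint(rng);
    const lat = center.latitude + (rng() - 0.5) * 2 * offsetDeg;
    const lon = center.longitude + (rng() - 0.5) * 2 * offsetDeg;
    points.push({
      latitude: Math.max(-90, Math.min(90, lat)), // clamp at the poles
      longitude: ((lon + 540) % 360) - 180,       // wrap at the antimeridian
    });
  }
  return points;
}

const grid = fibonacciSphere(1000);
const clusters = clusterPoints(120);
```

Passing `rng` as a parameter makes it possible to plug in a seeded generator when reproducible datasets are needed.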
- Data should be loaded into the database before running the queries.
- Warming up the index is allowed.
- Warming up queries is not allowed.
- Storage cost will be calculated.
- For in-memory databases, both memory and persistent storage costs will be calculated.
- In theory, all databases should return the same result.
The basic test requires all queries to run one by one in a single process/thread.
The advanced test allows queries to run in parallel and allows optimization for the test host.
Pick a random location and find the closest location in the dataset.
Pick a random location from the dataset and find all locations within a certain distance.
Pick a random location from the dataset and find all locations within a certain distance, ordered by distance.
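To make the three queries concrete, here is a brute-force sketch that could serve as a reference when checking that the databases return the same results. It is not part of the benchmark itself. The function names, the haversine distance, and the mean Earth radius constant are assumptions, since each database may use its own distance model. The sketch also assumes the query point itself counts as a match when it lies in the dataset.

```typescript
interface LatLon {
  latitude: number;  // degrees
  longitude: number; // degrees
}

const EARTH_RADIUS_M = 6371008.8; // mean Earth radius in meters (assumption)

// Great-circle distance between two points using the haversine formula.
function haversineMeters(a: LatLon, b: LatLon): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const dLat = toRad(b.latitude - a.latitude);
  const dLon = toRad(b.longitude - a.longitude);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.latitude)) * Math.cos(toRad(b.latitude)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1, Math.sqrt(h)));
}

// Query 1: closest dataset point to an arbitrary location.
function nearest(points: LatLon[], target: LatLon): LatLon | undefined {
  let best: LatLon | undefined;
  let bestDist = Infinity;
  for (const p of points) {
    const d = haversineMeters(p, target);
    if (d < bestDist) {
      bestDist = d;
      best = p;
    }
  }
  return best;
}

// Query 2: all dataset points within `radiusMeters` of `center`, unordered.
function withinDistance(points: LatLon[], center: LatLon, radiusMeters: number): LatLon[] {
  return points.filter((p) => haversineMeters(p, center) <= radiusMeters);
}

// Query 3: same as query 2, but ordered by ascending distance.
function withinDistanceSorted(points: LatLon[], center: LatLon, radiusMeters: number): LatLon[] {
  return points
    .map((p) => ({ p, d: haversineMeters(p, center) }))
    .filter((entry) => entry.d <= radiusMeters)
    .sort((x, y) => x.d - y.d)
    .map((entry) => entry.p);
}

// Usage with made-up points.
const demoPoints: LatLon[] = [
  { latitude: 0, longitude: 2 },
  { latitude: 10, longitude: 10 },
  { latitude: 0, longitude: 0 },
  { latitude: 0, longitude: 1 },
];
const closest = nearest(demoPoints, { latitude: 0, longitude: 0.9 });
const nearby = withinDistance(demoPoints, { latitude: 0, longitude: 0 }, 150_000);
const ordered = withinDistanceSorted(demoPoints, { latitude: 0, longitude: 0.4 }, 200_000);
```

Each query scans every point, so this is only practical for validation on small samples. The benchmarked databases answer the same questions through their spatial indexes.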
This project is licensed under the terms of the CC BY-SA 4.0 license.
To cite this report, check CITATION.cff.