perf: benchmark lance vs parquet read time, write time, and compressed size #2383

Merged (8 commits) on Aug 2, 2024

raunaks13 (Contributor) commented May 23, 2024

  1. Compares Parquet and Lance on read time, write time, and compressed file size for plain encodings (numeric, non-numeric, and timestamp) on TPC-H data.
  2. Compares performance on vectors (fixed-size list encoding) on SIFT data.
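
The three measurements listed above (write time, read time, on-disk size) can be sketched as a small format-agnostic harness. This is only an illustration, not the code in this PR: `benchmark_format`, `path_size`, and `run_benchmarks` are hypothetical names, the data is a synthetic stand-in for TPC-H columns, and the two "formats" are stdlib pickle with and without gzip so that the sketch runs without dependencies. In a real comparison the writer and reader callables would wrap the Parquet and Lance APIs instead.

```python
import gzip
import os
import pickle
import tempfile
import time
from dataclasses import dataclass


@dataclass
class FormatResult:
    name: str
    write_seconds: float
    read_seconds: float
    size_bytes: int


def path_size(path):
    """Size of a file, or total size of all files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    return sum(
        os.path.getsize(os.path.join(root, name))
        for root, _, names in os.walk(path)
        for name in names
    )


def benchmark_format(name, write, read, data, directory):
    """Time one write and one read of `data`, then measure what was written."""
    path = os.path.join(directory, f"data.{name}")

    start = time.perf_counter()
    write(data, path)
    write_seconds = time.perf_counter() - start

    start = time.perf_counter()
    loaded = read(path)
    read_seconds = time.perf_counter() - start

    # A benchmark is only meaningful if the round trip preserves the data.
    assert loaded == data, f"{name}: round trip changed the data"
    return FormatResult(name, write_seconds, read_seconds, path_size(path))


# Stand-in formats (assumption: replace with Parquet / Lance writers and readers).
def write_pickle(data, path):
    with open(path, "wb") as f:
        pickle.dump(data, f)


def read_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


def write_gzip_pickle(data, path):
    with gzip.open(path, "wb") as f:
        pickle.dump(data, f)


def read_gzip_pickle(path):
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


def run_benchmarks(data):
    formats = {
        "pickle": (write_pickle, read_pickle),
        "pickle_gz": (write_gzip_pickle, read_gzip_pickle),
    }
    with tempfile.TemporaryDirectory() as directory:
        return [
            benchmark_format(name, write, read, data, directory)
            for name, (write, read) in formats.items()
        ]


# Synthetic columns standing in for TPC-H data (numeric and non-numeric).
data = {
    "l_orderkey": list(range(10_000)),
    "l_comment": [f"comment {i % 50}" for i in range(10_000)],
}
results = run_benchmarks(data)
for r in results:
    print(f"{r.name}: write={r.write_seconds:.4f}s read={r.read_seconds:.4f}s size={r.size_bytes}B")
```

A single timed run like this is noisy; a benchmark framework that repeats each measurement and reports statistics would give more reliable read and write numbers.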

github-actions bot added the enhancement label on May 23, 2024
raunaks13 requested a review from westonpace on May 23, 2024 04:22
raunaks13 marked this pull request as draft on May 23, 2024 17:47
raunaks13 changed the title from "feat: benchmarking encodings" to "perf: benchmarking encodings" on May 24, 2024
raunaks13 changed the title from "perf: benchmarking encodings" to "perf: benchmarking encodings (read time, write time)" on May 24, 2024
raunaks13 added the benchmark label and removed the enhancement label on May 24, 2024
codecov-commenter commented Jul 19, 2024

Codecov Report

Attention: Patch coverage is 58.87600% with 1383 lines in your changes missing coverage. Please review.

Project coverage is 79.45%. Comparing base (7d9dbda) to head (1923d73).
Report is 167 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| rust/lance-encoding-datafusion/src/zone.rs | 0.00% | 349 Missing ⚠️ |
| java/core/lance-jni/src/blocking_dataset.rs | 0.00% | 225 Missing ⚠️ |
| rust/lance-encoding-datafusion/src/lib.rs | 0.00% | 120 Missing ⚠️ |
| ...t/lance-encoding/compression-algo/fsst/src/fsst.rs | 85.86% | 106 Missing and 11 partials ⚠️ |
| java/core/lance-jni/src/blocking_scanner.rs | 0.00% | 87 Missing ⚠️ |
| java/core/lance-jni/src/fragment.rs | 0.00% | 87 Missing ⚠️ |
| rust/lance-encoding/src/decoder.rs | 81.92% | 55 Missing and 24 partials ⚠️ |
| java/core/lance-jni/src/error.rs | 0.00% | 43 Missing ⚠️ |
| rust/lance-datafusion/src/expr.rs | 2.38% | 41 Missing ⚠️ |
| rust/lance-datagen/src/generator.rs | 42.02% | 40 Missing ⚠️ |
| ... and 16 more | | |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2383      +/-   ##
==========================================
- Coverage   80.69%   79.45%   -1.24%     
==========================================
  Files         192      214      +22     
  Lines       56193    62830    +6637     
  Branches    56193    62830    +6637     
==========================================
+ Hits        45344    49924    +4580     
- Misses       8221     9984    +1763     
- Partials     2628     2922     +294     
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 79.45% <58.87%> (-1.24%) | ⬇️ |

Flags with carried forward coverage won't be shown.
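
The percentages in the coverage diff can be reproduced from its Hits and Lines rows. The sketch below assumes coverage is hits divided by lines and that the displayed value is truncated to two decimals rather than rounded; the truncation is an inference from the numbers shown (rounding would give 79.46% and -1.23%), not something the report states. `coverage_hundredths` is a hypothetical helper name.

```python
def coverage_hundredths(hits, lines):
    """Coverage in hundredths of a percent, truncated (assumption: not rounded)."""
    return hits * 10_000 // lines


# Hits and Lines taken from the coverage diff above.
base = coverage_hundredths(45344, 56193)  # main
head = coverage_hundredths(49924, 62830)  # this PR

base_text = f"{base / 100:.2f}%"
head_text = f"{head / 100:.2f}%"
delta_text = f"{(head - base) / 100:+.2f}%"
print(base_text, head_text, delta_text)

# In both columns, Hits + Misses + Partials adds up to Lines.
assert 45344 + 8221 + 2628 == 56193
assert 49924 + 9984 + 2922 == 62830
```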

github-actions bot added the python label on Aug 1, 2024
raunaks13 changed the title from "perf: benchmarking encodings (read time, write time)" to "perf: benchmark lance v2 read and write time on tpch data" on Aug 1, 2024
raunaks13 marked this pull request as ready for review on August 1, 2024 00:35
raunaks13 requested a review from wjones127 on August 1, 2024 00:47
raunaks13 changed the title from "perf: benchmark lance v2 read and write time on tpch data" to "perf: benchmark lance vs parquet read time, write time, and compressed size" on Aug 1, 2024

westonpace (Contributor) left a comment

Some minor cleanups but otherwise this seems like a nicely written tool.

Three review threads on python/python/benchmarks/test_v2_read_write.py (outdated, resolved)
raunaks13 merged commit 6a100d7 into lancedb:main on Aug 2, 2024
10 of 11 checks passed
4 participants