sstable: use github.com/klauspost/compress for snappy #3693

Open
wants to merge 2 commits into
base: master

Collaborator

@jbowens jbowens commented Jun 24, 2024

Use the github.com/klauspost/compress module's Snappy implementation. It's
faster (10-20%) and generates comparable compressed sizes. Although it obeys
the same format as Snappy and is bi-directionally compatible with Google's
Snappy implementation, it does not produce identical payloads (it claims to
produce slightly smaller ones).

                                                                                │   old.txt    │               new.txt                │
                                                                                │     B/s      │     B/s       vs base                │
Writer/format=(Pebble,v2)/block=4.0KB/filter=true/compression=NoCompression-24    297.7Mi ± 0%   298.4Mi ± 1%        ~ (p=0.093 n=10)
Writer/format=(Pebble,v2)/block=4.0KB/filter=true/compression=Snappy-24           70.44Mi ± 1%   62.62Mi ± 0%  -11.10% (p=0.000 n=10)
Writer/format=(Pebble,v2)/block=4.0KB/filter=true/compression=ZSTD-24             13.66Mi ± 1%   13.67Mi ± 1%        ~ (p=0.698 n=10)
Writer/format=(Pebble,v2)/block=4.0KB/filter=false/compression=NoCompression-24   421.3Mi ± 0%   420.3Mi ± 0%   -0.23% (p=0.050 n=10)
Writer/format=(Pebble,v2)/block=4.0KB/filter=false/compression=Snappy-24          85.63Mi ± 0%   72.42Mi ± 1%  -15.42% (p=0.000 n=10)
Writer/format=(Pebble,v2)/block=4.0KB/filter=false/compression=ZSTD-24            12.09Mi ± 1%   12.22Mi ± 1%        ~ (p=0.108 n=10)
Writer/format=(Pebble,v2)/block=32KB/filter=true/compression=NoCompression-24     304.1Mi ± 0%   306.4Mi ± 0%   +0.74% (p=0.000 n=10)
Writer/format=(Pebble,v2)/block=32KB/filter=true/compression=Snappy-24            55.25Mi ± 0%   46.94Mi ± 0%  -15.05% (p=0.000 n=10)
Writer/format=(Pebble,v2)/block=32KB/filter=true/compression=ZSTD-24              15.60Mi ± 1%   15.70Mi ± 2%        ~ (p=0.197 n=10)
Writer/format=(Pebble,v2)/block=32KB/filter=false/compression=NoCompression-24    437.5Mi ± 0%   435.8Mi ± 0%   -0.38% (p=0.000 n=10)
Writer/format=(Pebble,v2)/block=32KB/filter=false/compression=Snappy-24           65.57Mi ± 0%   52.02Mi ± 0%  -20.66% (p=0.000 n=10)
Writer/format=(Pebble,v2)/block=32KB/filter=false/compression=ZSTD-24             12.07Mi ± 1%   12.09Mi ± 1%        ~ (p=0.254 n=10)
Writer/format=(Pebble,v3)/block=4.0KB/filter=true/compression=NoCompression-24    281.1Mi ± 0%   282.1Mi ± 0%   +0.34% (p=0.023 n=10)
Writer/format=(Pebble,v3)/block=4.0KB/filter=true/compression=Snappy-24           65.62Mi ± 0%   58.64Mi ± 0%  -10.65% (p=0.000 n=10)
Writer/format=(Pebble,v3)/block=4.0KB/filter=true/compression=ZSTD-24             12.97Mi ± 2%   13.17Mi ± 1%   +1.47% (p=0.019 n=10)
Writer/format=(Pebble,v3)/block=4.0KB/filter=false/compression=NoCompression-24   387.8Mi ± 0%   388.9Mi ± 0%   +0.27% (p=0.005 n=10)
Writer/format=(Pebble,v3)/block=4.0KB/filter=false/compression=Snappy-24          78.29Mi ± 0%   67.37Mi ± 0%  -13.95% (p=0.000 n=10)
Writer/format=(Pebble,v3)/block=4.0KB/filter=false/compression=ZSTD-24            11.46Mi ± 3%   11.71Mi ± 2%   +2.16% (p=0.000 n=10)
Writer/format=(Pebble,v3)/block=32KB/filter=true/compression=NoCompression-24     285.8Mi ± 0%   287.8Mi ± 0%   +0.68% (p=0.000 n=10)
Writer/format=(Pebble,v3)/block=32KB/filter=true/compression=Snappy-24            51.15Mi ± 0%   43.94Mi ± 0%  -14.10% (p=0.000 n=10)
Writer/format=(Pebble,v3)/block=32KB/filter=true/compression=ZSTD-24              14.77Mi ± 1%   14.93Mi ± 0%   +1.03% (p=0.001 n=10)
Writer/format=(Pebble,v3)/block=32KB/filter=false/compression=NoCompression-24    402.4Mi ± 0%   403.0Mi ± 0%        ~ (p=0.123 n=10)
Writer/format=(Pebble,v3)/block=32KB/filter=false/compression=Snappy-24           59.84Mi ± 0%   48.51Mi ± 0%  -18.93% (p=0.000 n=10)
Writer/format=(Pebble,v3)/block=32KB/filter=false/compression=ZSTD-24             11.42Mi ± 1%   11.56Mi ± 1%   +1.29% (p=0.000 n=10)
geomean                                                                           66.51Mi        63.24Mi        -4.91%
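
The "vs base" column can be re-derived from the two B/s columns. The following is a minimal Go sketch that does so for the Snappy rows only, using the values exactly as printed above. The names `snappyRow` and `percentDelta` are hypothetical, and because the inputs are the rounded figures from the table, the results may differ from benchstat's in the last digit.

```go
package main

import "fmt"

// snappyRow holds one Snappy line from the benchstat table above.
// The type name is illustrative and does not come from Pebble.
type snappyRow struct {
	name          string
	before, after float64 // MiB/s, copied from the old.txt and new.txt columns
}

// percentDelta returns the relative change from before to after, in percent.
func percentDelta(before, after float64) float64 {
	return (after - before) / before * 100
}

func main() {
	rows := []snappyRow{
		{"v2/block=4.0KB/filter=true", 70.44, 62.62},
		{"v2/block=4.0KB/filter=false", 85.63, 72.42},
		{"v2/block=32KB/filter=true", 55.25, 46.94},
		{"v2/block=32KB/filter=false", 65.57, 52.02},
		{"v3/block=4.0KB/filter=true", 65.62, 58.64},
		{"v3/block=4.0KB/filter=false", 78.29, 67.37},
		{"v3/block=32KB/filter=true", 51.15, 43.94},
		{"v3/block=32KB/filter=false", 59.84, 48.51},
	}
	for _, r := range rows {
		fmt.Printf("%-30s %+.2f%%\n", r.name, percentDelta(r.before, r.after))
	}
}
```

The sketch only reproduces the arithmetic behind the column; it does not say which column corresponds to which Snappy implementation.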

jbowens added 2 commits June 24, 2024 12:56
Update the github.com/klauspost/compress module to v1.17.9.
@jbowens jbowens requested a review from a team as a code owner June 24, 2024 17:37
@jbowens jbowens requested a review from itsbilal June 24, 2024 17:37

@RaduBerinde RaduBerinde left a comment

Nice find!

We should add the golang snappy package to TestForbiddenImports so we don't inadvertently switch back to it. We can also add a comment there explaining that the klauspost implementation is compatible but faster.
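
A forbidden-import check of this kind can be sketched with the standard library alone. The example below is an assumption-laden illustration, not Pebble's actual TestForbiddenImports, whose implementation is not shown in this thread. The `forbidden` map and the `forbiddenImports` function are hypothetical names; `github.com/golang/snappy` is assumed to be the import path meant by "the golang snappy".

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// forbidden maps an import path to the reason it is disallowed.
// The single entry is illustrative; the real list lives in Pebble's tests.
var forbidden = map[string]string{
	"github.com/golang/snappy": "use the github.com/klauspost/compress module's Snappy implementation (compatible, faster)",
}

// forbiddenImports parses only the import declarations of one Go source file
// and returns a message for every import whose path is in the forbidden map.
func forbiddenImports(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var msgs []string
	for _, imp := range f.Imports {
		// Import paths are stored as quoted string literals.
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		if reason, ok := forbidden[path]; ok {
			msgs = append(msgs, fmt.Sprintf("%s: import %q is forbidden: %s",
				fset.Position(imp.Pos()), path, reason))
		}
	}
	return msgs, nil
}

func main() {
	src := "package sstable\n\nimport (\n\t\"bytes\"\n\n\t\"github.com/golang/snappy\"\n)\n"
	msgs, err := forbiddenImports("example.go", src)
	if err != nil {
		panic(err)
	}
	for _, m := range msgs {
		fmt.Println(m)
	}
}
```

A real test would walk every package in the repository instead of a single in-memory file, and would fail the test for each returned message.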

@petermattis
Collaborator

Ditto on the nice find! Is the speedup also present on Arm? Did you also look at decompression speed?

@jbowens
Collaborator Author

jbowens commented Jun 26, 2024

There's an issue here: the library doesn't seem to produce the same output on all platforms, and our tests depend on compression being deterministic across platforms. I will return to this and dig deeper into where the platform dependence comes from; it seems surprising and undesirable.
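
One way to narrow down a platform dependence like this is to print a fingerprint of the encoder's output together with the platform it ran on, then compare the lines produced on different machines. The sketch below is only an illustration under stated assumptions: `fingerprint` and `flateEncode` are hypothetical helpers, and the standard library's `compress/flate` stands in for the Snappy encoder so that the example has no external dependencies. To investigate the actual issue, the encoder under test would be passed in instead.

```go
package main

import (
	"bytes"
	"compress/flate"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"runtime"
)

// fingerprint returns the hex-encoded SHA-256 of encode(input).
// Two platforms whose encoders emit identical bytes print identical values.
func fingerprint(encode func([]byte) []byte, input []byte) string {
	sum := sha256.Sum256(encode(input))
	return hex.EncodeToString(sum[:])
}

// flateEncode is a stand-in encoder (DEFLATE, not Snappy), used only to keep
// the sketch self-contained.
func flateEncode(src []byte) []byte {
	var buf bytes.Buffer
	w, err := flate.NewWriter(&buf, flate.BestSpeed)
	if err != nil {
		panic(err)
	}
	if _, err := w.Write(src); err != nil {
		panic(err)
	}
	if err := w.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	input := bytes.Repeat([]byte("pebble sstable block "), 200)
	fmt.Printf("%s/%s %s\n", runtime.GOOS, runtime.GOARCH, fingerprint(flateEncode, input))
}
```

Running the same program on each platform and diffing the printed lines shows whether the output bytes differ for a given input.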

@RaduBerinde
Member

Interesting. What platform did you use to generate them? If it was macOS, I would expect the go-macos test to pass and the others to fail.

@jbowens
Collaborator Author

jbowens commented Jun 26, 2024

Yeah, I generated them locally on my Arm Mac. I think it must be architecture-dependent, and the go-macos action uses an Intel Mac.
