Skip to content
This repository has been archived by the owner on Dec 6, 2023. It is now read-only.

Gzip compression #40

Merged
merged 28 commits into from
Feb 21, 2017
Merged

Conversation

ajantis
Copy link
Collaborator

@ajantis ajantis commented Jan 18, 2017

Goal

Gzip compression allows to greatly reduce the amount of traffic flowing from/to Riak clients to/from the database. It's especially helpful when there are conflicting values ("siblings") in Riak: in most cases, they are not that much different, so we can benefit from the standard data deduplication-based compression algorithms (such as Gzip).
Quick measurements demonstrated that the payload can be reduced 8-10 times with gzipping enabled.

UPDATE: Requests payload compression has been reverted and put out-of-the-scope for now.
This is due to a number of known (by now) issues like #41, #42, #43...
Also store object requests compression makes data stored this way in Riak incompatible with the old Riak Scala Client version (old clients won't be able to read it from Riak as it will respond with compressed responses by default, even if Accept-Encoding header has not been specified... see #42 in particular for details).

Changes

This PR introduces a toggleable gzip compression (via a flag in the reference.conf) for requests' and responses' payloads.
Riak multipart responses (i.e. for 'siblings' cases) are also handled properly.

UPDATE: requests compression is put out-of-the-scope.

Caveats

The only place where it didn't work out-of-the-box is http PUT /bucket/<name>/props endpoint. For some reason, this endpoint doesn't handle gzipped payload properly.
Because of that, compression has been disabled for requests targeting this endpoint.

UPDATE: this is not an issue anymore as requests compression is disabled altogether.

It is added as part of the standard ‘compress’ directive.
For some reason, Riak PUT bucket properties endpoint doesn’t handle gzipped requests properly. Given that set bucket properties requests are usually rare (compared to get/store requests) and their payload is reasonably small, it should be good enough to disable compression for them.
@coveralls
Copy link

coveralls commented Jan 18, 2017

Coverage Status

Coverage decreased (-0.3%) to 80.495% when pulling 8af47d9 on ajantis:optional-http-compression into 54f742a on agemooij:master.

@coveralls
Copy link

coveralls commented Jan 19, 2017

Coverage Status

Coverage increased (+0.2%) to 81.056% when pulling 59a8157 on ajantis:optional-http-compression into 54f742a on agemooij:master.

Copy link
Owner

@agemooij agemooij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from a functional standpoint.
I would suggest to add some tests that specifically test the compression scenarios and the known corner case with the bucket properties setup.

@@ -22,6 +22,9 @@ riak {
#
# Riak server designates values as tombstones by adding an optional 'X-Riak-Deleted' header.
ignore-tombstones = yes

# Should the client use an compression (e.g. Gzip) when talking to Riak via HTTP.
enable-http-compression = yes
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to default this to no for backwards compatibility.

@@ -165,7 +164,9 @@ private[riak] class RiakHttpClientHelper(system: ActorSystem) extends RiakUriSup

val entity = JsObject("props" -> JsObject(newProperties.map(property ⇒ (property.name -> property.json)).toMap))

httpRequest(Put(PropertiesUri(server, bucket), entity)).map { response ⇒
// For some reason, Riak set bucket props HTTP endpoint doesn't handle compressed request properly.
// So we disable compression for this request unconditionally.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please create an issue for this and reference it here so we can keep track of it. Perhaps also create an issue for Riak itself and link the two?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep. Done. It's this one: #41
we have also found another interesting issue: #42

@ajantis
Copy link
Collaborator Author

ajantis commented Jan 19, 2017

Thanks for a swift review and feedback, Age! ;)
Yep, applying it.

@coveralls
Copy link

coveralls commented Jan 19, 2017

Coverage Status

Coverage decreased (-2.3%) to 78.571% when pulling a7a8600 on ajantis:optional-http-compression into 54f742a on agemooij:master.

@coveralls
Copy link

coveralls commented Jan 25, 2017

Coverage Status

Coverage increased (+0.2%) to 81.056% when pulling 6cf73cf on ajantis:optional-http-compression into 54f742a on agemooij:master.

…hen using a mixture of compression-enabled and compression-disabled clients

Yes, surprisingly these tests may fail with Riak o_O
…ected gzip responses

This happens when a value was stored with `Content-Encoding: gzip` and then it is fetched without requiring gzip (no `Accept-Encoding` header or with a `Accept-Encoding: identity` header). This is a reaaaally strange behaviour.
@coveralls
Copy link

coveralls commented Jan 26, 2017

Coverage Status

Coverage increased (+0.7%) to 81.505% when pulling 72cc888 on ajantis:optional-http-compression into 54f742a on agemooij:master.

@coveralls
Copy link

coveralls commented Jan 27, 2017

Coverage Status

Coverage increased (+0.7%) to 81.505% when pulling e2ca24c on ajantis:optional-http-compression into 54f742a on agemooij:master.

@ajantis
Copy link
Collaborator Author

ajantis commented Jan 27, 2017

All RiakClient-related tests have been parametrised and are executed twice now: with and without compression.
Also a few special test-cases were added to cover a workaround for issue #42 : it's in RiakGzipSpec.

@coveralls
Copy link

coveralls commented Jan 27, 2017

Coverage Status

Coverage increased (+0.7%) to 81.505% when pulling dc4053f on ajantis:optional-http-compression into 54f742a on agemooij:master.

… handling concurrent delete operations on single values
@coveralls
Copy link

coveralls commented Jan 30, 2017

Coverage Status

Coverage increased (+1.2%) to 81.988% when pulling 7c67842 on ajantis:optional-http-compression into 54f742a on agemooij:master.

Due to a number of Riak shortcomings in regards to handling gzipped requests, it seems safer to simply disable requests compression altogether for now.
@ajantis
Copy link
Collaborator Author

ajantis commented Jan 30, 2017

In the end, I reverted requests payload compression for now. :-\ (see updated PR description)
This causes quite some problems and makes data stored that way incompatible with old versions of riak scala client (because of #42, Riak starts returning compressed responses for that data by default...).

Apparently, that rare issue is still reproducible (see builds failures) even with clients that don’t compress requests… :-\
@coveralls
Copy link

coveralls commented Jan 31, 2017

Coverage Status

Coverage increased (+0.5%) to 81.366% when pulling 8d5d9fe on ajantis:optional-http-compression into 54f742a on agemooij:master.

@coveralls
Copy link

coveralls commented Jan 31, 2017

Coverage Status

Coverage increased (+0.5%) to 81.366% when pulling 1bfb428 on ajantis:optional-http-compression into 54f742a on agemooij:master.

@coveralls
Copy link

coveralls commented Jan 31, 2017

Coverage Status

Coverage increased (+0.5%) to 81.366% when pulling 6401372 on ajantis:optional-http-compression into 54f742a on agemooij:master.

@agemooij
Copy link
Owner

So what is the status of this PR now? After all the problems with request compression, does this still make sense to add? If so, what are the things people should watch out for? And shouldn't we document those?

@ajantis
Copy link
Collaborator Author

ajantis commented Feb 19, 2017

I gave up on making request compression to work (too many brittle workarounds), so now it only works for responses.
We are already using this change in prod at TT (no issues so far).
So I was hoping to just contribute/merge it as it is now. =)

Sure, I can document it somewhere additionally (there are already code comments here and there + a comment for that 'compression toggle' in the reference conf file).... Readme?

@agemooij
Copy link
Owner

I think the only thing missing is some clear documentation on what it does and doesn't do in the README. Otherwise it might be confusing to users, right?

@ajantis
Copy link
Collaborator Author

ajantis commented Feb 21, 2017

@agemooij yep, agree. I added some info to README with issue references.

@coveralls
Copy link

coveralls commented Feb 21, 2017

Coverage Status

Coverage increased (+0.5%) to 81.366% when pulling 59931b7 on ajantis:optional-http-compression into 54f742a on agemooij:master.

@agemooij agemooij merged commit a09fe96 into agemooij:master Feb 21, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants