Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Save 75% of platform message size 🤦🏻‍♂️ #80

Open
zubairov opened this issue Feb 8, 2019 · 9 comments
Open

Save 75% of platform message size 🤦🏻‍♂️ #80

zubairov opened this issue Feb 8, 2019 · 9 comments

Comments

@zubairov
Copy link
Contributor

zubairov commented Feb 8, 2019

This place here returns a string with base64 encoded string in it. Base64 binary is roughly 75% larger than binary data. The publish method that sends data to the queue accepts a binary Buffer so it makes little sense to send a based64 string over it.
BTW publish method copies the data, so data copied 3 times - here, here when new Buffer is created and here when data is published to the queue

The main problem here when fixing it - maintain backwards compatibility.

@jhorbulyk
Copy link

The main problem here when fixing it - maintain backwards compatibility.

All sailor versions need to be updated at the same time. :(

@ghaiklor
Copy link
Contributor

ghaiklor commented Feb 13, 2019

@zubairov @jhorbulyk either I did not understand the problem, but I think we can make it backward compatible. It is a little bit stinky, but should work.

So, if you want to have a possibility to exchange both base64 or Buffer, you can predict the format of the message you got.

Pseudo-code

function decryptIV(encData, options) {
    const isBase64 = Buffer.from(encData, 'base64').toString('base64') === encData;
    const inputFormat = isBase64 ? 'base64' : 'buffer';
    const result = cipher.update(encData, inputFormat, 'utf-8') + cipher.final('utf-8');

    return result;
}

The idea behind this trick is simple:

  1. You got a message from rabbit (it can be both base64 or buffer, you don't know for sure)
  2. You are trying to decode message as base64 -> encode back to base64 and IF encoded data equals to encoded messages - we can be sure that is base64
  3. Otherwise, it message was not base64, it can not be decoded/encoded properly and definitely will not be equal to the source message. In that case, we are assuming that message was sent in Buffer format.

Now, regard to compatibility:

New sailor versions in encrypt would use Buffer format, so all the messages will be sent in Buffer format.
Plus, decrypt method will have a mechanism to determine the format of message and handle appropriate (as described above).

Old sailor versions in encrypt would use base64 format and decrypt would use base64 as well.

Another case, when message is sent from old to new sailor:

Old encrypt method will send base64 format, but since our new sailor is capable of determine that it is base64 it will just decrypt it as before. Otherwise, it will take it as a buffer if message sent from new sailor.

Hope I'm right about the problem you saying

@zubairov
Copy link
Contributor Author

@ghaiklor ok, what happens when "new" version send a message in Buffer format to "old" sailor?

@ghaiklor
Copy link
Contributor

ghaiklor commented Feb 13, 2019

@zubairov dope, I didn't think about this one, shame on me 😆

We could ignore this issue for now and just increase the limits for pods where sailor is running, but... it is a short-term solution.

@zubairov
Copy link
Contributor Author

It's not really a pod related but RabbitMQ related issue :) but yes, we need to build a message versioning support into sailor ASAP to make sure we are protected from such issues in future.

@ghaiklor
Copy link
Contributor

RabbitMQ has no issues with handling big messages. What do you mean by that ?

@zubairov
Copy link
Contributor Author

Yes, correct, my statement should be - we produce unnecessary load on RabbitMQ due to sailor implementation details

@ghaiklor
Copy link
Contributor

I'm going to think about any other possible solutions for problem with huge messages, but for now we have the following facts:

  • Most of the steps have been killed because of Out Of Memory
  • seems like OOM is produced by sailor due to dirty implementation (do we have the same problem with Java sailor?)
  • it has nothing to do with RabbitMQ, since its amqp protocol is based on tcp and all the "split-send" stuff is happening under the hood

For now, I do not see any ways to solve it without touching sailor (expect increasing memory limits).

@zubairov
Copy link
Contributor Author

@ghaiklor I don't think it's the solution for huge messages, only a minor efficiency gain, not more than that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants