Find a more efficient way to pass the Tripod config to the process queue jobs #83

Open
rsinger opened this issue Jul 22, 2015 · 7 comments

Comments

@rsinger
Member

rsinger commented Jul 22, 2015

We're currently passing in a Tripod config array with every background job, which adds a massive per-job overhead and limits the number of jobs Resque can handle.

If, rather than passing the config as part of the job, we stored the config in Redis, keyed by a hash of the config, we would have a much smaller memory footprint and could subsequently queue a lot more jobs.
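A minimal sketch of the keyed-by-hash idea, written in Python for illustration only (Tripod itself is PHP). The `tripod:config:` key prefix, the SHA-256 choice, the function names, and the sample config fields are all assumptions, and a plain dict stands in for Redis so the example is self-contained:

```python
import hashlib
import json


def config_key(config: dict) -> str:
    """Derive a stable key from the config contents (hypothetical key scheme)."""
    # Sorting keys makes the hash independent of dict ordering.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "tripod:config:" + digest


def store_config(store: dict, config: dict) -> str:
    """Store the config once and return the key that jobs should carry."""
    key = config_key(config)
    # Identical configs hash to the same key, so they are stored only once.
    store.setdefault(key, json.dumps(config))
    return key


# A dict stands in for Redis here; the config fields are made up.
store = {}
config = {"example_setting": "value", "example_stores": {"a": 1, "b": 2}}

key = store_config(store, config)
job_args = {"tripodConfig": key}  # the job now carries a short key, not the config
print(len(json.dumps(config)), "config chars vs", len(key), "key chars")
```

With this approach every job that shares a config shares one stored copy, so the per-job cost is the length of the key regardless of how large the config is.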

@rsinger
Member Author

rsinger commented Jul 22, 2015

I don't think we'd need to change the structure of the job data: if the tripodConfig value is an array, assume that the config is being passed directly; if it's a string, assume that it's a key to look up in Redis.
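The array-or-string dispatch could look roughly like the following Python sketch. A Python dict plays the role of the PHP array, another dict stands in for Redis, and `resolve_config` plus the key name are hypothetical:

```python
import json


def resolve_config(tripod_config, store: dict) -> dict:
    """Return the config whether the job passed it inline or by key."""
    if isinstance(tripod_config, dict):
        # Old behaviour: the config was passed directly with the job.
        return tripod_config
    if isinstance(tripod_config, str):
        # New behaviour: the string is a key to look up in the store.
        raw = store.get(tripod_config)
        if raw is None:
            raise KeyError(f"no stored config for key {tripod_config!r}")
        return json.loads(raw)
    raise TypeError("tripodConfig must be an inline config or a string key")


# A dict stands in for Redis; the key and contents are made up.
store = {"tripod:config:abc123": json.dumps({"example_setting": "value"})}

inline = resolve_config({"example_setting": "value"}, store)
by_key = resolve_config("tripod:config:abc123", store)
```

Because both branches return the same structure, jobs already on the queue with an inline config would keep working while new jobs switch to keys.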

@rsinger
Member Author

rsinger commented Jul 22, 2015

Just a really quick investigation into this found that for a regular Tripod job, we pass in a JSON string that's around 33,602 characters long (depending on the store).

If we gzcompress that JSON string, it's 5,087 characters long. I'm actually going to recommend that we Base64-encode that gzcompressed string (6,785 characters) so we don't gum up the Resque web interface too badly. It's a longer string, but I think it would pay off in the end.
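As a rough illustration of the compress-then-Base64 round trip, here is a Python sketch. `zlib.compress` produces the same zlib format as PHP's `gzcompress`; the `pack_config` and `unpack_config` names and the repetitive sample config are invented for the example, so the sizes it prints will not match the figures above:

```python
import base64
import json
import zlib


def pack_config(config: dict) -> str:
    """JSON-encode, zlib-compress, then Base64-encode the config."""
    raw = json.dumps(config).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def unpack_config(packed: str) -> dict:
    """Reverse pack_config: Base64-decode, decompress, parse JSON."""
    raw = zlib.decompress(base64.b64decode(packed))
    return json.loads(raw.decode("utf-8"))


# Made-up, deliberately repetitive config so that it compresses well.
config = {
    "example_specs": [
        {"id": f"spec_{i}", "type": "example", "fields": ["a", "b", "c"]}
        for i in range(200)
    ]
}

packed = pack_config(config)
print("json:", len(json.dumps(config)), "packed:", len(packed))
```

The worker side would call `unpack_config` on the job argument before using it.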

Anyway, if my math is right (it's probably not!), we should be able to put about 5x as many jobs on the queue for the same memory footprint.
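A quick check of that estimate, using only the three sizes quoted above. It counts just the config string and ignores the other job arguments and any per-key overhead in Redis, so treat it as an upper bound rather than a measurement:

```python
raw_json = 33_602      # characters in the uncompressed JSON config
gzcompressed = 5_087   # after gzcompress
base64_of_gz = 6_785   # after Base64-encoding the compressed string

ratio_base64 = raw_json / base64_of_gz
ratio_binary = raw_json / gzcompressed

print(f"Base64 of gzcompressed: {ratio_base64:.2f}x as many jobs")  # about 4.95x
print(f"gzcompressed only:      {ratio_binary:.2f}x as many jobs")  # about 6.61x
```

So "about 5x" holds for the Base64 variant, and skipping Base64 would give roughly 6.6x at the cost of binary job payloads.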

@rsinger
Member Author

rsinger commented Jul 22, 2015

I kind of feel like there must be some way to get that original config size down, as well.

@scaleupcto
Contributor

Yup agree, original config is massive and I have been thinking about that for a while now.

Just a thought - does converting it to YAML and then compressing that buy us anything?

Also is it possible to cherry pick - at least on the Apply jobs - just enough config to get the work done?

@rsinger
Member Author

rsinger commented Jul 22, 2015

Given that there's no native YAML support in PHP anyway, we might as well shoot for the moon and open up the possibilities to Thrift or protobuf.

@scaleupcto
Contributor

The file size is larger if I convert to YAML anyway, due to whitespace, so scratch that. Partial config could work, although gzipping the JSON seems like a good start.

What do we gain from the Base64'ing?

@scaleupcto
Contributor

Actually, I just cat'ed a gzip, see what you mean!
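For anyone who hasn't cat'ed a gzip lately, this Python sketch shows the problem Base64 solves: compressed output is arbitrary binary, while the Base64 form is plain ASCII that drops cleanly into a JSON job payload. The sample data and the payload shape are made up:

```python
import base64
import json
import zlib

original = json.dumps({"example": "value " * 50}).encode("utf-8")
compressed = zlib.compress(original)

# Count bytes outside the printable ASCII range in the compressed output.
non_printable = sum(1 for b in compressed if b < 32 or b > 126)
print(non_printable, "of", len(compressed), "compressed bytes are not printable ASCII")

# Base64 maps those bytes onto a printable alphabet, at roughly 4/3 the size.
encoded = base64.b64encode(compressed).decode("ascii")
payload = json.dumps({"tripodConfig": encoded})
print(payload[:60] + "...")
```

The encoded string survives a JSON round trip unchanged and displays as ordinary text in a web interface.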


2 participants