
Production deployment sizes #221

Open
genericgithubuser opened this issue Jul 23, 2016 · 6 comments

@genericgithubuser

I would like to hear from anyone who is using Cyanite in a production environment, and how many datapoints per minute they are supporting with their configuration.

We are looking to migrate away from Graphite, and Cyanite is one of the main directions we're considering. We have a medium-sized Graphite installation now, doing only ~3.5 million datapoints per minute. As we look to move to a better backend, there is a lot of internal concern about going with Cyanite since:
a. it is still not at a 1.0 release, and
b. it's hard to tell whether it has been used (and to what extent) in production, and at what scale.

I've seen the continued improvements made to the project. Unfortunately, the people I've met who have run it in "production" were each using it as a secondary system, really just in PoC mode, even though they had prod systems writing to it.

So I know this is the wrong place for a question like this, but I think it is probably the best way for me to get a good answer. Can anyone who is actually running this in prod give an idea of what scale they're running at, and how long they've been running it?

@ifesdjeen
Collaborator

The major "problem" when using Cyanite is handling write loads; right now, the way to do that is to shard. Other than that, Cyanite is more or less just a proxy for Cassandra reads, which is much simpler to measure.
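
For context, sharding here usually means partitioning metric names across several Cyanite writers at the relay layer. Below is a minimal Python sketch of the idea, using a stable hash so every datapoint of a given series lands on the same writer; the node list and port are hypothetical, not part of any Cyanite API:

```python
import socket
import zlib

# Hypothetical pool of Cyanite carbon listeners; adjust to your deployment.
CYANITE_NODES = [("cyanite-1", 2003), ("cyanite-2", 2003), ("cyanite-3", 2003)]

def shard_for(metric_name: str) -> tuple:
    # Stable hash (crc32) so every datapoint of a series hits the same
    # writer; Python's built-in hash() is randomized per process.
    return CYANITE_NODES[zlib.crc32(metric_name.encode()) % len(CYANITE_NODES)]

def send(metric_name: str, value: float, timestamp: int) -> None:
    # One datapoint in the carbon plaintext protocol: "name value ts\n".
    host, port = shard_for(metric_name)
    with socket.create_connection((host, port)) as sock:
        sock.sendall(f"{metric_name} {value} {timestamp}\n".encode())
```

The sketch uses plain modulo for brevity; in practice this job is typically left to a carbon relay with consistent hashing (e.g. carbon-c-relay, which comes up later in this thread), so that adding a node reshuffles as few series as possible.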

3.5M per minute is roughly 60K datapoints per second, which in terms of actual writes comes down to maybe 10K per second; that doesn't sound like much for Cassandra.
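
To make the arithmetic explicit: the roughly 6x reduction presumably comes from folding several datapoints into each Cassandra write, but that factor is my assumption; the thread does not state the mechanism.

```python
datapoints_per_min = 3_500_000
datapoints_per_sec = datapoints_per_min / 60        # ~58,333 -> "roughly 60K"

# Assumed batching factor: several datapoints per Cassandra write.
# Purely illustrative, not a figure from the thread.
points_per_write = 6
writes_per_sec = datapoints_per_sec / points_per_write  # ~9,722 -> "maybe 10K"
print(f"{datapoints_per_sec:,.0f} datapoints/s -> {writes_per_sec:,.0f} writes/s")
```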

Cyanite is still very young, and I'd expect a couple of bumps here and there. That said, if you're interested in making the project better, we're quite a cooperative crowd: we try to react to everything people report ASAP and to help out when something's wrong. (I hesitated to respond to this particular issue because I thought users would chime in, and I wanted to leave more space for them.)

Performance, stability and correctness are among our primary goals.

@ehlerst

ehlerst commented Aug 19, 2016

We are trying to get a PoC off the ground that would handle 40 million datapoints per minute (what our current Graphite is handling). So far, with the latest master, we are not having much luck even sustaining our 333K-metrics-per-minute test lab. I think we jumped in during a big change to Cyanite that removed the Elasticsearch/in-memory index. There is another post where somebody on 0.1.3 was pushing 15 million/min, but that involved the old Elasticsearch index, which is gone now.

I am trying to get this working; I'll update here if I get somewhere.

@ifesdjeen
Collaborator

Just wondering: you handle 40M per minute (500K+ per second) with a single Graphite instance? @ehlerst

@ehlerst

ehlerst commented Aug 19, 2016

No way, there are a lot of Graphite instances. The best we have is a single Graphite server handling ~2 million per minute.

@ifesdjeen
Collaborator

ifesdjeen commented Aug 19, 2016

Right. Do you run multiple Cyanite instances, though? (I'm just trying to figure out the numbers I should at least try tuning for.)

If the index is a major bottleneck, I might try rolling out "yet another" experimental index, but I'd like to make sure I understand what's going on :)

@ehlerst

ehlerst commented Aug 19, 2016

Right now we have a reader and a writer. However, the writer never goes past 50% CPU usage, so we haven't looked at expanding to more. The funny thing is that, watching in jvisualvm, the heap only ever uses about 300MB of its 8GB. We are building a better way to see what's going on now; carbon-c-relay should give us better metrics on volume.
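
For anyone else trying to measure volume this way, here's a throwaway Python sketch of a counter that tallies carbon plaintext-protocol lines per minute. The listen port is arbitrary, and a real relay such as carbon-c-relay does this (plus forwarding) properly; this just counts:

```python
import socketserver
import threading
import time

count = 0
lock = threading.Lock()

class CarbonCounter(socketserver.StreamRequestHandler):
    def handle(self):
        global count
        # Each line of the carbon plaintext protocol is one datapoint:
        # "<metric.path> <value> <timestamp>\n"
        for _line in self.rfile:
            with lock:
                count += 1

def report():
    global count
    while True:
        time.sleep(60)
        with lock:
            n, count = count, 0
        print(f"{n:,} datapoints/min")

threading.Thread(target=report, daemon=True).start()
with socketserver.ThreadingTCPServer(("0.0.0.0", 2013), CarbonCounter) as srv:
    srv.serve_forever()
```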
