
Unify data model #53

Open
seanshahkarami opened this issue Sep 15, 2017 · 1 comment

seanshahkarami commented Sep 15, 2017

I've been thinking some more about how we can combine a bit of what we have now into a simpler pipeline. A very reasonable approach would be to transition to a Cassandra table with primary key ((nodeid, date), topic, timestamp) and a body column - that is, partitioned by node and day, and clustered by topic and then time.

(Yes, topic is mostly just a semantic change from plugin. It'd be used to store any routing key, for example, coresense:3 or metric.)
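For concreteness, here's a minimal sketch of that table created through the Python cassandra-driver. The keyspace name `beehive`, the table name `data_log`, and storing body as a blob are all assumptions, not settled names:

```python
from cassandra.cluster import Cluster

# Assumed contact point; adjust for the actual beehive deployment.
session = Cluster(["127.0.0.1"]).connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS beehive
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")

# Partition key (nodeid, date) keeps each node's day of data together;
# clustering on (topic, timestamp) orders rows by topic, then time.
session.execute("""
    CREATE TABLE IF NOT EXISTS beehive.data_log (
        nodeid    text,
        date      date,
        topic     text,
        timestamp timestamp,
        body      blob,
        PRIMARY KEY ((nodeid, date), topic, timestamp)
    )
""")
```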

Now, we could put a single "data" exchange in beehive that accepts all messages like this. If it's a direct exchange, we can then do a simple "opt-in" binding for each topic we want to store in the database.
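A rough sketch of that opt-in wiring using pika. The exchange name "data" comes from above; the queue name `data-store` and the list of topics are hypothetical examples:

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Single direct exchange receiving all messages.
channel.exchange_declare(exchange="data", exchange_type="direct", durable=True)

# Opt in topic by topic: one binding per routing key we want stored.
channel.queue_declare(queue="data-store", durable=True)
for topic in ["coresense:3", "metric"]:
    channel.queue_bind(queue="data-store", exchange="data", routing_key=topic)
```

Any topic without a binding simply never reaches the storage queue, so "what gets archived" is just a list of bindings.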

The other nice thing about this layout is that it supports splitting messages by topic directly in the database. Generally, you always end up handling each topic case-wise, so having the database support this would be great; at the moment, we can't do that without manually filtering. This should also allow finer time slicing within a single day.
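With (topic, timestamp) as the clustering columns, the case-wise split and the time slicing both become plain range queries. A sketch against the hypothetical table above:

```python
from datetime import date, datetime

from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect()

# Prepared statement; '?' placeholders are the prepared style in cassandra-driver.
select = session.prepare("""
    SELECT timestamp, body FROM beehive.data_log
    WHERE nodeid = ? AND date = ? AND topic = ?
      AND timestamp >= ? AND timestamp < ?
""")

rows = session.execute(select, (
    "node001",                     # hypothetical node id
    date(2017, 9, 15),             # day partition
    "coresense:3",                 # only this topic's rows are read
    datetime(2017, 9, 15, 8, 0),   # slice start
    datetime(2017, 9, 15, 9, 0),   # slice end
))
for row in rows:
    print(row.timestamp, row.body)
```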

This is also general enough that we don't need any special code handling things at the front - we just grab data, maybe add a received_at timestamp, and shove it in the database for later processing. This eliminates the need to do any format conversion at intake, since all of that has to be handled on a case-by-case basis anyway, but ensures the storage (and backup) problem is handled uniformly.
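A sketch of that thin intake path, under the same assumed names: consume from the bound queue, stamp a received_at, and insert the raw body untouched. Reading the node id from the message's app_id property is an assumed convention, not an established one:

```python
from datetime import datetime

import pika
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect()
insert = session.prepare("""
    INSERT INTO beehive.data_log (nodeid, date, topic, timestamp, body)
    VALUES (?, ?, ?, ?, ?)
""")

def on_message(channel, method, properties, body):
    # No parsing or format conversion here; the raw body is stored as-is.
    received_at = datetime.utcnow()
    nodeid = properties.app_id or "unknown"   # assumed node id convention
    session.execute(insert, (nodeid, received_at.date(), method.routing_key,
                             received_at, body))
    channel.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.basic_consume(queue="data-store", on_message_callback=on_message)
channel.start_consuming()
```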

Another way to think of this is simply as a permanent message log which can be replayed for later processing. The nice thing is, this can be designed as a configurable service in the sense that a binding for each topic can be added to any exchange and you'll automatically start getting backups.
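Replay then falls out naturally: read a (nodeid, date) partition back and republish each row under its original routing key. A minimal sketch under the same assumptions as above:

```python
from datetime import date

import pika
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect()
rows = session.execute(
    "SELECT topic, body FROM beehive.data_log WHERE nodeid = %s AND date = %s",
    ("node001", date(2017, 9, 15)),
)

channel = pika.BlockingConnection(pika.ConnectionParameters("localhost")).channel()
for row in rows:
    # Republish under the original routing key for downstream reprocessing.
    channel.basic_publish(exchange="data", routing_key=row.topic, body=row.body)
```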

@seanshahkarami
Member Author

I set up a mock-up of this on beehive. On a test node, metrics are being sent this way and stored in this table. I think this solves the "common intake" problem. Since all data is likely to be handled on a case-by-case basis anyway, I don't think there's anything left to do for this step other than transitioning the other plugin data into this location too.
