AWS S3 'database' support #61

Open. Wants to merge 2 commits into base: master

@sooshie sooshie commented Feb 24, 2016

Adds the ability to use AWS S3 as the data store that drives the UI, instead of ES or Mongo. The advantage is less system maintenance.

There is an AWS reporting plugin that writes the data to S3 and cleans up after itself (if configured). The web component's views.py was changed to read the data from the various S3 buckets.

@spender-sandbox
Owner

Rather than duplicating a lot of the code, could we instead have an AWS object that pulls in its own config info? With a global instance of that object in views.py, you could just call aws_obj.get(objname) or aws_obj.update(objname, newval) as appropriate.
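
A minimal sketch of the kind of wrapper object suggested here, not code from this PR. The class name `AwsStore`, the `[aws]` config section with a `bucket` option, the JSON encoding, and the object key are all assumptions made for illustration. The client is passed in and only needs `get_object` and `put_object` with the keyword arguments boto3's S3 client uses, so an in-memory stand-in is included to try it without AWS (Python 3).

```python
import configparser
import io
import json


class AwsStore:
    """Hypothetical wrapper that owns its own config and hides S3 calls."""

    def __init__(self, client, config_text):
        # Assumed config layout: an [aws] section with a `bucket` option.
        parser = configparser.ConfigParser()
        parser.read_string(config_text)
        self.bucket = parser.get("aws", "bucket")
        self.client = client

    def get(self, objname):
        # Fetch one object and decode its body as JSON.
        resp = self.client.get_object(Bucket=self.bucket, Key=objname)
        return json.loads(resp["Body"].read())

    def update(self, objname, newval):
        # Overwrite the object with the JSON encoding of `newval`.
        body = json.dumps(newval).encode("utf-8")
        self.client.put_object(Bucket=self.bucket, Key=objname, Body=body)


class FakeS3Client:
    """In-memory stand-in for an S3 client, for local experiments only."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body):
        self.objects[(Bucket, Key)] = Body

    def get_object(self, Bucket, Key):
        return {"Body": io.BytesIO(self.objects[(Bucket, Key)])}


# One shared instance, as views.py might hold at module level.
aws_obj = AwsStore(FakeS3Client(), "[aws]\nbucket = cuckoo-reports\n")
aws_obj.update("analysis/61/info.json", {"id": 61, "score": 7.5})
report = aws_obj.get("analysis/61/info.json")
```

In a real deployment the stand-in would presumably be replaced by `boto3.client("s3")`, with the rest of views.py going through `aws_obj` only.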

@spender-sandbox
Owner

Also, we'd have to do something about the code that appears to make an S3 request for every single API call.
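
One possible way to cut down on per-call requests is a small read-through cache in front of the fetch. This is a sketch under assumptions, not what the PR does: `CachedFetcher`, the 60-second TTL, and `slow_fetch` (which stands in for an S3 request) are all hypothetical.

```python
import time


class CachedFetcher:
    """Hypothetical read-through cache in front of a slow fetch function."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self.fetch = fetch
        self.ttl = ttl_seconds
        self.clock = clock
        self._cache = {}  # key -> (expiry time, value)

    def get(self, key):
        now = self.clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            # Still fresh: answer from memory, no remote request.
            return hit[1]
        value = self.fetch(key)
        self._cache[key] = (now + self.ttl, value)
        return value


calls = []


def slow_fetch(key):
    # Stands in for an S3 request; records each call so they can be counted.
    calls.append(key)
    return {"key": key}


cached = CachedFetcher(slow_fetch, ttl_seconds=60.0)
first = cached.get("analysis/61/info.json")
second = cached.get("analysis/61/info.json")  # served from the cache
```

The second lookup never reaches `slow_fetch`, so repeated API calls for the same object within the TTL cost one request instead of many.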

@spender-sandbox
Owner

The way I envision this being used would be to export older analyses to S3 while keeping recent ones fast, basically as an alternative to using the retention module. If we can have Cuckoo track which tasks have been exported, then that AWS object could do some fast check to see if the analysis exists on S3 at all without having to do requests each time to look for it.
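
The export-tracking idea could be sketched roughly as below. The `ExportIndex` class, the `exported_tasks` table, and the use of `sqlite3` are assumptions for illustration; Cuckoo's own task database could hold the same flag instead.

```python
import sqlite3


class ExportIndex:
    """Hypothetical local record of which task IDs were exported to S3."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS exported_tasks "
            "(task_id INTEGER PRIMARY KEY)"
        )

    def mark_exported(self, task_id):
        # INSERT OR IGNORE keeps repeated calls for the same task harmless.
        self.conn.execute(
            "INSERT OR IGNORE INTO exported_tasks (task_id) VALUES (?)",
            (task_id,),
        )
        self.conn.commit()

    def is_exported(self, task_id):
        # Local lookup only; no S3 request is needed to answer this.
        row = self.conn.execute(
            "SELECT 1 FROM exported_tasks WHERE task_id = ?", (task_id,)
        ).fetchone()
        return row is not None


index = ExportIndex(sqlite3.connect(":memory:"))
index.mark_exported(12)
exported = index.is_exported(12)
not_exported = index.is_exported(13)
```

The web side would then check `is_exported(task_id)` first and go to S3 only for tasks known to live there.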

@sooshie
Author

sooshie commented Feb 25, 2016

The object idea is pretty interesting. I didn't want to get too crazy with all the code, so I opted for the easy route. I'm going to have to dig into the S3 request on every API call. This goes along with the previous point: I didn't feel like rewriting giant chunks of code, so I mimicked the paging layout from the current setup, but I'm guessing there's a logic issue that causes a request for every API call.

The main goal for all this was to set something up so I wouldn't have to admin a Mongo cluster any more. I like the idea of having an S3 retention module as well; a tiered approach would make sense for a lot of people. Happy to help with some of it.

2 participants