-
Notifications
You must be signed in to change notification settings - Fork 381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dockerize Every Analyzer #607
Conversation
Note: |
Hello @milesflo Your changes look great but I don't know how do you want us to accept a PR that changes almost 300 files? How are we supposed to review that? how much time are we supposed to put in reviewing that? Anyways |
@nadouani Hello! I included a script at I also fixed a lot of things that were broken in the recent commit that forced the jump from python 2 => 3... Between that, and the changes to JSON schemas that just tweak the version numbers, it's not as much as it looks. |
The PR here is supposed to juste play with things related to Some of your changes have already been made by 2.4.0 for example. And in general, the rule is here: https://github.com/TheHive-Project/CortexDocs/blob/master/api/how-to-create-an-analyzer.md#create-a-pull-request |
RE that last commit message, that was a bit of frustration on my part trying to debug the Abuse_Finder analyzer... turns out the dependency tree is broken! 🎉 joepie91/python-whois#151 (comment) Fixed by specifying an older version of the dep |
So within a PR that creates a tool to automate dockerfile creation for analyzers, you fix something else! (abuse_finder). Not sure this is a right way. |
@nadouani Can't disagree with you that the scope of this one crept up... Thank the test script & packaging for that... I didn't feel comfortable leaving anything broken once I knew it was there. 😄 I could always abstract these out into different PRs if you think that'll be less work for the team... I started to do that with Issues created & referenced back to this. |
@@ -14,39 +14,45 @@ | |||
# See the License for the specific language governing permissions and | |||
# limitations under the License. | |||
|
|||
# Python3 compatability by: https://github.com/guyddr/dnsdb-query |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a Python3 version of this vendor script we imported from elsewhere. Link listed here.
@@ -0,0 +1,185 @@ | |||
#!/usr/bin/env python3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nadouani Hey it might help if you start from here... This is the script that I used to build those dockerfiles.
It uses the relevant requirements.txt
to create a list of Alpine dependencies, then updates the file if any new changes are detected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeromeleonard Psst over here ☝️
if requirements_path.exists(): | ||
requirements = requirements_path.open().read() | ||
|
||
if 'yara-python' in requirements: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the real meat here... 'dependency analysis' might be a bit generous considering how iterative the approach here is. Thankfully a lot of these use the same, common reversing libraries.
dockerfile = dockerfile.read() | ||
|
||
# Dockerfiles with this string will be frozen; skip. | ||
if '### MANUAL ###' in dockerfile: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you wanna freeze a Dockerfile in time, add this string anywhere (preferably the first line for clarity).
Used this as a compromise for when an analyzer is just too niche to build programatically.
@@ -0,0 +1,64 @@ | |||
#!/usr/bin/env python3 | |||
|
|||
# Build analyzers from Dockerfiles locally |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This script is made a bit redundant by the build_analyzers test suite... The idea was that you could use this script to, as quickly as possible (via concurrency), build all the docker images locally.
|
||
success_msg = '{"success": false, "input": {}, "errorMessage": "Input file doesnt exist"}' | ||
|
||
config_required_msg=""" File "/usr/local/lib/python3.8/site-packages/cortexutils/worker.py", line 31, in __init__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error message is produced by executing an Analyzer without piping in a relevant job config. Despite it being a stack trace, its presence indicates that the Python interpreter did not detect any syntax errors or missing libraries-- a success in terms of packaging for deployment.
dockerfile_contents.append('RUN apk add --no-cache {}\n'.format(' '.join(sorted(alpine_dependencies)))) | ||
|
||
|
||
labels = dict() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here down is the LABEL
metadata tags. Analyzers with multiple authors will have an authors
label and a CSV of the contributors' names in alphabetical order.
@nadouani Logging off as I’m on the west coast. Your timing is good— that last push was the final change needed to make sure that all of the analyzer tests work. Assume this is the final state unless you need edits. |
I'm opening a PR for each of these Analyzers as your contribution guide requested @nadouani |
Hey man I didn't ask for that. When did I say "create a PR par docker file"? I've just said that you have fixed python3 related issues on a PR dealing with the Docker stuff. That's all. And note that this stuff has a lower priority for what we have to implement for TheHive/Cortex etc... We won't review this immediately. |
Breakout:This commit contains Dockerfiles for all anayzers-- the superset for this PR being the following PRs: |
You said the scale of this PR was too large. You linked the docs which say:
I'm following your lead here... I don't know what else you could have been addressing. Anyways |
fb8f5aa
to
23be632
Compare
Hello @milesflo , can I ask if the current docker situation for analyzer is ok or if there are still issues? |
@dadokkio Got you, automated closing them all |
Addresses: #606
Changes:
Convert analyser JSON file versions from loose versioning schema to semver standard
Creation of 2 helper scripts for Dockerfile mgmt
utils/dockerfile_builder.py
baseImage
in config or derived from source code shebang)command
key)utils/build_analyzers.py
Inclusion of built
Dockerfile
sMiscellaneous house keeping
git
pip sources should be used as a last resource in favor of PyPiNext Steps:
docker image push
command on each of these built images at every major release, utilizing the catalog system already in place to act as adockerImage
name directory.Known Issues:
Currently, some of the Analyzers have niche dependencies that break the automated Dockerfile generation logic... We can either change the builder script, or redesign the analyzers to use more native APIs.
Broken Analyzers:gcc does not ship withpython:2-alpine
gcc does not ship withpython:3-alpine
gcc does not ship withpython:3-alpine
This one has a bunch of errors... May want to write a whole edge case just for this one analyzer.. Some apt-level dependencies that probably can't be imported from another base imagepython:3-alpine
with thelibfuzzy-dev
library.