Workflow from Phishing => Phishing.Database => VirusTotal #395
Copy of #391 (comment) by @spirillen
That is an interesting observation, and for sure something this project should be following up on. But where to run the thread? Allow me to think about this one for a while and I'll try to find the right location for this question. As for your observation about @PyFunceble: it should not be the issue when we are adding records, though it could be a factor when testing for removal of outdated records.
Copy of #391 (comment) by @g0d33p3rsec
Awesome, thanks for following up. I'll try to look into the upstream workflow and automation more once my schedule lightens up next week. I think the conversation would probably belong as an issue if the discussion needs to be in the open. On the other hand, I could also see a reason for treating it as a vulnerability, since there's something preventing tactical intelligence from making its way upstream.
Copy of #391 (comment) by @funilrys
Interesting ... I'll have to investigate this too ... @githubbot remind me.
Copy of #391 (comment) by @g0d33p3rsec
Thanks! I wasn't sure what to make of it when I first noticed, as I was also observing some challenges with scanning the sites, which I interpreted at the time as anti-forensic attempts. There were a couple of domains where I had to send a particular user-agent and referrer, and others where I seemed to encounter geofencing. Now I'm leaning more towards a bug somewhere between the domain addition and automatic validation on our end. If there's anything I can help with, feel free to reach out.
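For anyone trying to reproduce the user-agent and referrer behaviour described above, a minimal sketch follows. It only builds the request and does not send it; the URL, user-agent string, and referrer are placeholder values, and `build_probe_request` is a hypothetical helper, not part of PyFunceble or this repository.

```python
from urllib.request import Request


def build_probe_request(url: str, user_agent: str, referrer: str) -> Request:
    """Build (but do not send) a GET request carrying custom headers."""
    headers = {
        "User-Agent": user_agent,
        # The HTTP header name is spelled "Referer" (one "r") by specification.
        "Referer": referrer,
    }
    return Request(url, headers=headers, method="GET")


# Placeholder values for illustration only.
req = build_probe_request(
    "https://example.com/login",
    "Mozilla/5.0 (hypothetical test agent)",
    "https://example.org/",
)

# To actually send it, something like the following would be used:
#   from urllib.request import urlopen
#   with urlopen(req, timeout=10) as resp:
#       print(resp.status)
```

Sending the same URL with and without these headers and comparing the status codes is one way to tell header-based cloaking apart from a validation bug on the list side.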
Copy of #391 (comment) by @g0d33p3rsec
I'm wondering what endpoints it tries to test if it only has the domain to work with and no specific URIs. For most hosts, the root domain has been returning a default Apache/Nginx page of the sort that comes with a fresh install. The only exception that I can think of offhand is the deface that was done to
Copy of #391 (comment) by @spirillen
This is one of many reasons that PyFunceble by default leaves a record as This is of course not the Holy Grail for how this should be handled, but as there aren't enough human resources to maintain and catch every scumbag URI out there, we have to cut some corners. Also, given that RFC:954 is limited to FQDNs, it doesn't make a whole lot of sense to keep a URI list for all those running on the ~60 years old hosts file system or even the newer RPZ. The only places you really can use URI systems are in browser add-ons like uBlock Origin and proxy servers like Squid. @/githubbot remind me
Copy of #391 (comment) by @g0d33p3rsec
Oh, that makes total sense and makes for an interesting problem. I should only have a few more days of having to think in C++ before I can get back to thinking in Python and take a closer look at both projects.
Copy of #391 (comment) by @spirillen
@g0d33p3rsec Thinking about the 404 URIs from your list above, I can't think of any current process that would actually remove them from the project. Reason: we treat a 404 as a temporary break in something bad, as you can see here:
So is there by any chance a way your code could run an automated test for this, or do I have to write up something (which I suck at)?
Copy of #391 (comment) by @g0d33p3rsec
I'll have to study the issue more in depth. I just did a public scan of the page with the mentioned 404, for reference, and while the initial request returned a 404 response, there were also a stack of 200s from the site's host, which would make it even more challenging to automate the removal of this sort of false positive. The mentioned requests/responses can be seen at
The false negatives are more of a concern, as I'm still not seeing any of the domains from the merged commits make their way upstream. When the new records are being merged from this repo upstream, what checks are done to validate the current status? Does it just convert the domain to an http or https request and evaluate the response, looking for any non-500 code?
To my knowledge, none, as I keep this one as clean as time allows. Which reminds me to set up a new test...
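To make the question above concrete, here is a sketch of the check it describes: turn a bare domain into root URLs and treat any response below 500 as "alive". This is an assumption written out for discussion, not PyFunceble's actual logic; `candidate_urls`, `looks_alive`, and the injected `fetch_status` callable are all hypothetical names, and the status codes in the demo are made up.

```python
from typing import Callable, List, Optional


def candidate_urls(domain: str) -> List[str]:
    """Root URLs a domain-only checker could probe (assumed, not confirmed)."""
    return [f"https://{domain}/", f"http://{domain}/"]


def looks_alive(domain: str, fetch_status: Callable[[str], Optional[int]]) -> bool:
    """Return True if any candidate URL answers with a status below 500.

    fetch_status returns the HTTP status code for a URL, or None when the
    request failed (DNS error, timeout, refused connection).
    """
    for url in candidate_urls(domain):
        status = fetch_status(url)
        if status is not None and status < 500:
            return True
    return False


# Demo with a fake fetcher so nothing touches the network.
fake_responses = {"https://example.test/": 404, "http://example.test/": 200}
print(looks_alive("example.test", fake_responses.get))
```

Under this assumed policy, a host whose root serves a default web server page counts as alive whatever the phishing URI itself returns, and a 404 on the root still counts as alive.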
The |
By "it" I was referring to PyFunceble, since that is what we were discussing earlier when speaking of the response codes. I'm trying to figure out what endpoint it will try to test for a response code and whether that could be related to the issue. Once a commit is merged in this repo, what happens between then and when the Phishing Database is updated? I know that when I made my first commits by URI and they were merged upstream, the results were almost immediately visible on VT and also propagated to other vendors.
OT, but I'd like to share this little meme from https://matrix.rocks/notes/9ssmc8s00z
For your detailed question about:
Only @mitchellkrogza and maybe @funilrys know.
Love it! All too true, unfortunately.
I've noticed the activity group that I've been tracking has recently begun reusing previous hosts that should be protected by the list, but the entries don't appear to be making it upstream from this repo to the Phishing Database. A couple of days ago, the group was observed reusing a domain that should have been protected against #381 (comment). Today, I noticed another reused domain, technowide[.]com[.]tr, which should have been blocked by #396.
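A quick way to check whether a reported host is covered by a downloaded domain list is sketched below. The helper names `refang` and `is_covered` are hypothetical, the `listed` set stands in for whatever list file is being checked, and the parent-domain matching is an assumption about how a wildcard entry would be expected to behave.

```python
def refang(indicator: str) -> str:
    """Undo common defanging: '[.]' becomes '.', 'hxxp' becomes 'http'."""
    return indicator.strip().replace("[.]", ".").replace("hxxp", "http")


def is_covered(domain: str, listed: set) -> bool:
    """True if the domain or any parent domain (bare TLD excluded) is listed."""
    labels = refang(domain).lower().rstrip(".").split(".")
    # Build 'a.b.c', then 'b.c', stopping before the single-label TLD.
    return any(".".join(labels[i:]) in listed for i in range(len(labels) - 1))


# Stand-in for the contents of a domain list file.
listed = {refang("technowide[.]com[.]tr")}

print(is_covered("technowide[.]com[.]tr", listed))
print(is_covered("www.technowide[.]com[.]tr", listed))
print(is_covered("example[.]com", listed))
```

Running the same check against this repo's list and against the upstream list would show at which step an entry drops out.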
Due to a bug in Phishing.Database we are not able to do a full search in the active files. For that reason we now import `ALL-phishing-links.txt` and strip it down to a domain-only list in `data/phishing_database/`.

Related issues:
- mitchellkrogza/Phishing.Database#840
- mitchellkrogza/Phishing.Database#881
- mitchellkrogza/phishing#381 (comment)
- mitchellkrogza/phishing#396
- mitchellkrogza/phishing#407
- mitchellkrogza/phishing#395
- mypdns/matrix#624
- blocklistproject/Lists#1252
- mitchellkrogza/Phishing.Database#722

Trying to use @main for the php installer and using PHP version 8.4.

Added `libdomain-publicsuffix-perl` to the dependencies.sh script as it is required by Perl in import.sh. It turns out Perl just annoyingly does it again... 😏
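The "strip it down to a domain-only list" step can be sketched roughly as follows. This is not the repository's import.sh (which uses Perl); it is a Python approximation under the assumption that the input holds one URL per line, and `domains_from_links` is a hypothetical name. It keeps the full hostname and does not reduce it to the registrable domain, which would need a public suffix list.

```python
from urllib.parse import urlsplit


def domains_from_links(lines):
    """Return the sorted, de-duplicated hostnames found in a list of URLs."""
    domains = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # urlsplit only detects a host when a scheme or leading '//' is present.
        if "://" not in line:
            line = "//" + line
        try:
            host = urlsplit(line).hostname  # lower-cased, port removed
        except ValueError:
            continue  # malformed entry, e.g. unbalanced IPv6 brackets
        if host:
            domains.add(host.rstrip("."))
    return sorted(domains)


sample = [
    "https://Example.com/login.php",
    "http://example.com/other",
    "http://sub.example.org:8080/a?b=c",
    "",
    "# comment",
]
print(domains_from_links(sample))  # ['example.com', 'sub.example.org']
```

For the real file, the same function could be fed `open("ALL-phishing-links.txt", encoding="utf-8")`.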
Copy of #391 (comment) by @g0d33p3rsec
Thanks! I wonder if PyFunceble may be causing the false negatives when I add as domain or wildcard. When I first added by individual URI, VirusTotal would return a positive once the commit was merged upstream. Since then, as I've been adding as domain or wildcard, the sites seem to be dropped by the time this repo is merged upstream, resulting in subsequent false negatives on VT from the Phishing Database even though the upstream repo showed recent merges. That's why I tried testing both a few commits ago, but the results were inconclusive. I should have more time to dig into it after the semester ends next week. If you want to compare output, I've been trying to track the group using a VT collection which can be found at
https://www.virustotal.com/gui/collection/5b7e996c553034dddc8c690ea6be0adb3182b0fa96ce6a8b29627e165fb47f38/iocs
Here's an example from a recent add
https://www.virustotal.com/gui/url/0503dbd260648c364c10793657cdebe883da30554b3c9cbed639025ea45e58e7
Most of the detections shown are from hand feeding the domain to the individual EDR vendors, which can be a bit laborious.