Merge config update Mythic have applied in prod #69

Closed

PeterJCLaw
Member

Summary

It seems ClaudeBot doesn't obey robots.txt and makes a large number of requests.

Code review

Testing

  • applied the configuration locally
  • manually validated the new behaviour
ansible (block-claudebot)$ curl -Ik https://sr-proxy/ --header 'User-Agent: ClaudeBot'
HTTP/2 403 
server: nginx/1.18.0 (Ubuntu)
date: Wed, 21 Aug 2024 20:29:16 GMT
content-type: text/html
content-length: 162
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN

ansible (block-claudebot)$ curl -Ik https://sr-proxy/ 
HTTP/2 200 
server: nginx/1.18.0 (Ubuntu)
date: Wed, 21 Aug 2024 20:29:20 GMT
content-type: text/html; charset=utf-8
permissions-policy: interest-cohort=()
last-modified: Wed, 21 Aug 2024 00:22:45 GMT
access-control-allow-origin: *
etag: W/"66c53355-5189"
expires: Wed, 21 Aug 2024 00:36:24 GMT
cache-control: max-age=600
x-proxy-cache: MISS
x-github-request-id: BEDF:3FB0E3:221B046:22BC54A:66C5342B
accept-ranges: bytes
age: 0
via: 1.1 varnish
x-served-by: cache-lon4283-LON
x-cache: HIT
x-cache-hits: 0
x-timer: S1724272160.978336,VS0,VE85
vary: Accept-Encoding
x-fastly-request-id: 8aa3d5df87728424c7a24107941cb7627d764c61
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
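
For readers unfamiliar with this kind of block, the behaviour validated by the two curl checks above (403 for a `ClaudeBot` User-Agent, 200 otherwise) could be produced by an nginx rule along the following lines. This is only a sketch under assumptions: the actual configuration Mythic applied is not shown in this conversation, and the placement inside the proxy's `server` block and the case-insensitive match are guesses.

```nginx
# Hypothetical sketch, not the configuration from this PR.
# Assumed to sit inside the relevant `server { ... }` block.
# Rejects any request whose User-Agent header contains "ClaudeBot"
# (case-insensitive); all other requests are handled as before.
if ($http_user_agent ~* "ClaudeBot") {
    return 403;
}
```

A rule like this only changes the response status; whether the rejected requests still reach the access log depends on the logging configuration.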

@RealOrangeOne
Member

I feel like there's a larger decision here as to whether we should be blocking LLM bots at all. Our website is static, and I doubt their scraping is causing the site much of an issue.

@PeterJCLaw
Member Author

Fair point. I was doing this mostly to avoid wiping out a change which Mythic had already added in prod. Their email about it from early June (which I'd forgotten about) indicates that the issue was logs filling up the disk rather quickly. I'll go back to them and see if they think it's safe to remove.

@PeterJCLaw
Member Author

Mythic have responded that they think it's probably fine to remove this config -- it was intended to be a temporary thing.

@PeterJCLaw PeterJCLaw closed this Aug 23, 2024
@PeterJCLaw PeterJCLaw deleted the block-claudebot branch August 23, 2024 20:46