Do not render contact form if it is requested directly by Spammers #420

adminBTI · 2021-06-15T07:12:05Z

Despite the anti-spam honeypot, I keep getting spam emails bothering me to renew my domain name or buy their "contact us" form spam services.

Legitimate users would access my website first and then go to the /contact page to submit the form. How can I display an error page instead of the "contact form" for those spammy clients who are hitting my /contact page directly?

I do not want to insert any filtering rules on the frontend webserver, because the Referer header can be spoofed and CodeRed sends the csrftoken cookie only on form submission. Some commercial proxy servers even remove the Referer header from HTTP requests.

Is it possible to use a different honeypot field for each client? For example, different radio and input fields?

thenewguy · 2021-06-17T21:22:27Z

How would you detect these spammers? It would not be unexpected for someone to come to your "Contact Us" page directly from Google.

Perhaps you should look at adding a captcha message to your form if you want to make it more difficult for spammers? This one is very effective: https://www.google.com/recaptcha/about/

adminBTI · 2021-06-18T00:26:46Z

I exclude /contact page from search engine crawling! Also, in my /contact page, I only display the actual form, no other useful content.

Google's products are meant for its own employees, not for others. If you don't agree with me, then you are still young. I don't know about you, but I often cannot solve these Google Captchas! Is a tiny corner of traffic light box, which is not the actual light, still considered a traffic light? Is the box with just the tip of bicycle handle bar not to be counted? I don't know.

A lot of my website visitors are not from California, so they would not agree to the jurisdiction of California (a requirement for using Google Captcha; read their T&C and privacy policies). A lot of websites are infected with dependent links to Google's fonts, scripts, captchas, etc.

I use the following in my nginx.conf to block contact form spam on a DjangoCMS site. (It will work for Coderedcms once I figure out how to add language middleware.)

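# 1. Make sure that /en/contact/ is excluded in robots.txt
# 2. If LANGUAGE_COOKIE_NAME is not django_language (default), change accordingly
# A request without the language cookie leaves $var equal to the bare URI and gets a 404;
# once the cookie is set, the strings differ and the form is served.
set $var "$uri$cookie_django_language";
if ($var = "/en/contact/") { return 404; }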

thenewguy · 2021-06-18T12:38:53Z

It sounds like you've got it figured out. Best of luck 👍

onaralili · 2021-07-06T06:48:59Z

Some bots simply visit a website and start crawling instead of coming directly from a search engine. Also, this won't prevent manual spam. An alternative approach would be to integrate a spam filtering API such as OOPSpam, which is GDPR compliant.

murty2 · 2021-07-09T10:22:34Z

Yes, I know some bots will access my website directly. Almost all such bots are up to no good anyway because they are looking to spam.

OOPSpam is a commercial solution and you may be trying to promote a commercial solution on this open source page.

onaralili · 2021-07-12T10:30:11Z

I was replying to @adminBTI's comment.

It is true that OOPSpam is commercial, and that is how it can afford to be privacy-friendly, unlike the privacy nightmare that is reCaptcha. Other anti-spam services like Akismet are also commercial; they tend to charge in order to keep operations going.

If privacy is a non-issue for you and you are looking for a free alternative, reCaptcha or a simple heuristic spam-word check would work.
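To illustrate what a simple heuristic spam-word check could look like, here is a minimal Python sketch. The phrase list, the link limit, the threshold, and the function names are all assumptions made up for this example; none of them come from coderedcms or any anti-spam service.

```python
import re

# Hypothetical phrase list chosen for illustration; tune it against real spam.
SPAM_PHRASES = ("renew your domain", "seo services", "contact form marketing")
URL_RE = re.compile(r"https?://", re.IGNORECASE)


def spam_score(message, max_links=2):
    """Count matched spam phrases, plus one point for too many links."""
    text = message.lower()
    score = sum(1 for phrase in SPAM_PHRASES if phrase in text)
    if len(URL_RE.findall(message)) > max_links:
        score += 1
    return score


def looks_like_spam(message, threshold=1):
    """Flag the message once its score reaches the (assumed) threshold."""
    return spam_score(message) >= threshold
```

A check like this is cheap and keeps all data on your own server, but it only catches spam that resembles the phrases you have already seen.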

vsalvino · 2021-07-21T20:12:10Z

Great discussion; chiming in on the various suggestions in this thread:

I think the only possible "true" solution is to make the forms flexible enough to integrate with a commercial spam checker such as Google reCaptcha or some of the other products mentioned in this thread. We would probably be inclined to support Google out of the box, after entering an API key in the wagtail settings, and provide a hook for others to implement their own.

The honeypot method would remain the default as it is simple and free. I would like to improve it a bit, but without seeing spambot behavior it is difficult to know how they are getting through it. We could potentially implement a rate limiter to prevent a single IP from submitting the form X number of times per minute.
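As a rough idea of what such a rate limiter might look like, here is a minimal in-memory sliding-window sketch in Python. The class name, parameters, and injectable clock are hypothetical and not part of coderedcms; state lives in one process only, so a real deployment with several workers would need a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_hits submissions per IP within window_seconds."""

    def __init__(self, max_hits, window_seconds, clock=time.monotonic):
        self.max_hits = max_hits
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted hits

    def allow(self, ip):
        now = self.clock()
        q = self.hits[ip]
        # Drop timestamps that have fallen out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.max_hits:
            return False
        q.append(now)
        return True
```

A form view could call `limiter.allow(client_ip)` before processing a submission and reject the request when it returns `False`.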

As for the suggestion about blocking direct hits to the URL with no referer, that is something we would never directly support, as it is a very common use case (e.g. sending a link to the form URL in an email). But if it works for your individual site, the nginx or django middleware methods referenced should be sufficient, without requiring any changes to coderedcms.

murty2 · 2021-07-21T20:42:26Z

Please consider implementing:

1. Simple captcha: https://github.com/mbi/django-simple-captcha
2. Two honeypot fields that change for each request. For example, one radio and another short text for one request and then two multi-select for another request
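One possible way to rotate honeypot fields per request, sketched in Python under stated assumptions: a random nonce is sent along in a hidden field, and the honeypot names and types are derived from it with an HMAC so the server can recompute them on submission. `SECRET`, `FIELD_TYPE_PAIRS`, and both function names are hypothetical, not an existing coderedcms API.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret; a Django project might reuse settings.SECRET_KEY.
SECRET = b"change-me"

# Illustrative pairs of honeypot field types to rotate through.
FIELD_TYPE_PAIRS = [
    ("radio", "text"),
    ("select-multiple", "select-multiple"),
    ("text", "checkbox"),
]


def make_honeypot(nonce=None):
    """Return (nonce, [(field_name, field_type), ...]) for one form render."""
    nonce = nonce or secrets.token_hex(8)
    digest = hmac.new(SECRET, nonce.encode(), hashlib.sha256).hexdigest()
    types = FIELD_TYPE_PAIRS[int(digest[:8], 16) % len(FIELD_TYPE_PAIRS)]
    names = ["f_" + digest[8:16], "f_" + digest[16:24]]
    return nonce, list(zip(names, types))


def is_spam(nonce, posted):
    """True if the nonce is missing or any derived honeypot field is filled."""
    if not nonce:
        return True
    _, fields = make_honeypot(nonce)
    return any(posted.get(name) for name, _ in fields)
```

The template would render the two fields hidden from human visitors, and the view would pass the posted nonce and form data to `is_spam`. This only raises the cost for bots that hard-code field names; it does nothing against bots that render the page or against manual spam.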

I am not sure rate limiting inside an application that caches is a good idea. Web servers and firewalls are better suited for that.

A lot of Ubuntu kids seem to use Fail2ban, but I am more inclined to use something like SSHguard to block or rate limit IPs. Even with the ipset module, it takes 100MB or so of memory to block or rate limit IPs in the firewall, but I can accept this as a decent compromise compared to a full-fledged WAF.

I wrote a firewall-level blacklisting script, which stopped the spam: https://github.com/murty2/blacklist But ideally, I would like to know how form data can be checked by a SpamAssassin or DSPAM filter process (similar to how email is checked before delivery).

vsalvino · 2021-08-10T13:57:30Z

Posting here for future reference: May have found a good open source captcha package we could integrate with: https://django-simple-captcha.readthedocs.io/en/latest/index.html

adminBTI added the "Type: Enhancement" label (New feature or functionality change) on Jun 15, 2021

adminBTI changed the title from "Do not render /contact if it is requested directly by spammers" to "Do not render contact form if it is requested directly by Spammers" on Jun 15, 2021

vsalvino added the "Needs Research" label (Needs further research and discussion on implementation) and removed the "Type: Enhancement" label on Aug 2, 2021

pppls mentioned this issue on Oct 2, 2021 in "Simple captcha #451" (open)
