Maintaining a high-traffic website has its merits and demerits, and one of the most annoying demerits is spam. Along with lots of legitimate posts you get lots of spam, which directly affects the site's reputation and credibility. So how do we protect our site from spam? I'm going to share what I found when I came across various methods of preventing it.
There is no 100% foolproof way to prevent spam, but there are techniques that can reduce it by a good amount.
The following Drupal modules help prevent spam. I'll focus on the top four, as they are the most widely used and the most prominent in spam protection.
Captcha, reCaptcha:
A CAPTCHA is a challenge-response test, most often placed within web forms, to determine whether the user is human. CAPTCHA is a good solution for websites that do not have much traffic, but I would not recommend this module for high-traffic websites, for the following reasons:
1. Caching - This is the most basic problem with CAPTCHA. Let's assume you have a webform that is shown on a lot of pages on your site. With CAPTCHA you cannot use a Varnish-based cache (or any other reverse proxy cache), because by default Varnish does not cache pages that carry a session cookie, and setting a session cookie is exactly what CAPTCHA does. As a result, all of these otherwise static anonymous pages are rendered by Drupal, which means a lot of PHP processes on the server. Even with the Drupal cache and memcache enabled, these pages could have been served from Varnish, leaving your server resources free for authenticated users. A lot of PHP processes usually ends up chewing through server resources. To date, besides opcode caches such as APC, eAccelerator or XCache, the world has not found many ways to make PHP lightweight, so it is always a bad idea to run PHP processes for anonymous requests :)
2. UX - It is very annoying for users to solve a CAPTCHA every time they post a comment or fill in any CAPTCHA-protected form. Spam protection should put the burden on spammers, not on legitimate users.
3. Accessibility - Adding image CAPTCHAs to a website makes it inaccessible to visually impaired people who use screen readers.
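To make the caching problem in point 1 concrete, here is a rough Python model of the kind of decision a reverse proxy makes. It is only a sketch, not real Varnish configuration: the function names are made up for illustration, and the cookie prefixes assume Drupal-style session cookie names.

```python
# Assumption: session cookies are recognised by a Drupal-style name prefix.
SESSION_PREFIXES = ("SESS", "SSESS")


def has_session_cookie(cookie_header: str) -> bool:
    """Return True if any cookie name in a Cookie header looks like a session cookie."""
    for pair in cookie_header.split(";"):
        name = pair.strip().split("=", 1)[0]
        if name.startswith(SESSION_PREFIXES):
            return True
    return False


def is_cacheable(request_headers: dict, response_headers: dict) -> bool:
    """Simplified model of a reverse proxy deciding whether to cache a page.

    Header names are assumed to be spelled exactly "Cookie" and "Set-Cookie".
    """
    # A request that already carries a session cookie goes straight to the backend.
    if has_session_cookie(request_headers.get("Cookie", "")):
        return False
    # A response that sets a cookie (as a CAPTCHA-protected form does) is not stored.
    if "Set-Cookie" in response_headers:
        return False
    return True
```

With this model, a plain anonymous page is cacheable, but the same page stops being cacheable as soon as the form on it causes a session cookie to be set, so every such request ends up in PHP.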
Mollom:
Mollom is a third-party spam protection service. It does a great job of protecting your site from spam. It analyses the content posted on your site, along with the past activity and reputation of the poster, and approves a post only when it is legitimate. This module is even more effective when the spam is posted by human spammers. Some concerns with using this module are:
1. Third-party dependency - If the Mollom service or server is down for any reason, you are on the receiving end of the trouble.
2. Captcha fallback - This module uses CAPTCHA as its fallback method, so you will not be able to use Varnish (or any other reverse proxy) cache.
3. Pricing - It comes with heavy, recurring pricing.
Honeypot:
Honeypot is a simple and very effective module for combating spam. It turns the spammers' own approach against them: a spam bot attacks a form by filling in all the fields and then submitting it. Here is how the Honeypot module works.
It has two methods:
Hidden form field: An invisible field is added to the form. If this field is filled in, the form returns an error. Human beings never see this field, so they won't fill it out. Even if they do see it, the field is labelled in a way that tells a human not to fill it out.
Time difference: The module measures the time between form rendering and form submission. If the difference is too small, the form was submitted by a spam bot; otherwise it is treated as a legitimate post.
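The two methods can be sketched in a few lines of Python. This is not the module's actual code (Honeypot itself is a PHP Drupal module); the field name `homepage`, the 5-second threshold and the function name are all hypothetical choices for illustration.

```python
import time
from typing import Mapping, Optional

HONEYPOT_FIELD = "homepage"  # hypothetical name of the hidden field
MIN_SECONDS = 5.0            # hypothetical minimum time a human needs to fill the form


def looks_like_bot(
    form_data: Mapping[str, str],
    rendered_at: float,
    submitted_at: Optional[float] = None,
) -> bool:
    """Apply both honeypot checks to one form submission."""
    if submitted_at is None:
        submitted_at = time.time()

    # Method 1: the hidden field must stay empty; bots tend to fill every field.
    if form_data.get(HONEYPOT_FIELD, "").strip():
        return True

    # Method 2: a form submitted too soon after rendering was probably not typed by a human.
    if submitted_at - rendered_at < MIN_SECONDS:
        return True

    return False
```

For example, `looks_like_bot({"comment": "Nice post", "homepage": ""}, rendered_at=100.0, submitted_at=130.0)` passes both checks, while filling in `homepage` or submitting after one second would flag the submission. A real implementation would also need to protect the render timestamp from tampering, which this sketch leaves out.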
Some key features:
1. Simple to configure
2. More effective than other spam protection modules
3. drupal.org uses this
4. Does not use JavaScript for spam protection
5. The module is actively maintained, and a large number of sites reportedly use it.
6. Reverse proxy caching keeps working (it is disabled only when the module is used in its time-based configuration mode)
This technique works great for bots / automated spammers but when it comes to human spammers, it cannot do anything. A third party service like Mollom can be very handy to tackle human spammers.
Some good reads on honeypot:
1. http://www.midwesternmac.com/blogs/jeff-geerling/introducing-honeypot-form-spam
2. https://www.lullabot.com/blog/article/module-monday-honeypot
3. https://www.drupal.org/node/1880300
Botcha:
Advantages:
1. Contains a lot of recipes for spam protection
2. Contains all recipes used by the Honeypot module
3. Is evolving a lot over time - definitely something to keep an eye on
Disadvantages:
1. Disables the page cache, so it is not suitable for high-traffic websites (http://2bits.com/articles/beware-drupal-modules-disable-page-cache.html#comment-1555, https://www.drupal.org/node/1922226)
2. The module is minimally maintained, i.e. maintenance fixes only
3. Uses JavaScript for spam protection
All in all, the Honeypot module is my choice for spam protection. I'd recommend that everyone give it a shot, and if it still does not protect your website against spam and you have deep pockets, go for Mollom :-)
This is just me sharing my experience and understanding of spam protection on a Drupal site. Please feel free to add your comments and share your experiences so that we can all continue to find a better solution.