Comment spam is an annoying problem for most CMS administrators. Whether the site runs Drupal, Joomla or WordPress, they all share this vulnerability to some extent. This article is part of a series we are posting about the issue, focusing this time on Drupal spam prevention.
Starting off with the most widely used spam-fighting module:
Mollom is a service, free for smaller websites, that intelligently scans the content of all comments and contact forms for keywords that are usually associated with spam. It then sorts the content into three groups: “Ham” (the desirable content), “Spam” (you guessed it, the undesirable content) and “Unsure” (whenever it has doubts).
The module uses a complex history of activity as part of its algorithm: it runs a background check on each IP address to see whether it has a spam history on any site in its network. This may raise privacy issues in certain situations, as the module reads all content posted.
If Mollom is not sure whether a comment is spam (and only then), it shows a CAPTCHA box to verify that the comment was written by a human. This way most of your visitors avoid deciphering an image, and only suspicious comments have to undergo the test.
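The three-way split described above can be sketched in a few lines of Python. This is only an illustration of the decision flow, not Mollom's actual code or API: the `Verdict` enum, the `handle_comment` function and the action names are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    """The three classes described above (names are illustrative)."""
    HAM = "ham"
    SPAM = "spam"
    UNSURE = "unsure"


def handle_comment(verdict: Verdict, captcha_passed: Optional[bool] = None) -> str:
    """Return a hypothetical action: 'publish', 'reject' or 'challenge'.

    captcha_passed is None until the visitor has answered a CAPTCHA.
    """
    if verdict is Verdict.HAM:
        return "publish"      # clean content never sees a CAPTCHA
    if verdict is Verdict.SPAM:
        return "reject"       # obvious spam is dropped outright
    # Only the "unsure" case triggers the CAPTCHA.
    if captcha_passed is None:
        return "challenge"
    return "publish" if captcha_passed else "reject"
```

Only the last branch ever asks the visitor to solve anything, which is why most legitimate commenters never see the challenge.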
We have all seen the computer-generated, distorted text that asks us to verify that we are in fact human and not bots surfing the web. This is a simple but effective way to keep spam bots from posting comments. It can be used in many places, such as preventing email or comment spam, verifying submissions to online polls and many others.
What makes reCAPTCHA different? Not much, actually. It is readable, unlike many alternatives where you cannot log in to your own website until you have tried at least 10 times, it is familiar to most visitors AND… it contributes to humanity! Yes, reCAPTCHA always displays two words: one proves that you are human, and the other is an unknown word from a scanned text document that the software failed to digitize correctly. Your answer for this word is added to the database. The first success was digitizing the archive of The New York Times through people solving reCAPTCHAs.
The downside is that bots are constantly getting smarter. Some of them already bypass the toughest CAPTCHAs there are, and it will not be long before they are better than humans at solving them. Also, it is a fact that most people don’t like dealing with this type of verification.
There is a foolproof alternative that stops all spam instantly…
- Require registration
This is the simplest way to stop all spam comments, but it is far from practical. Requiring registration before a user can post will generally result in no more comments of any type, including authentic ones. Unless you are managing a top community in a specific niche, most people will not bother signing up for an account just to be able to post a comment. In this case you stop comment spam AND the comments.
- The “Spam” module
Just “Spam” is the name of this module, which incorporates several spam filter modules into one. All content is rated on a scale from 1 to 99, with 99 meaning that the content is 99% likely to be spam and 1 meaning that there is only a 1% chance of it being spam. Each of the filter modules inside “Spam” rates the content independently, and the scores are then averaged before the module decides whether the content should be accepted.
The complexity of the module allows it to track spam in any language. Once it finds a spam comment, the module automatically learns the website the spammer links to and blocks any content linking back to it.
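To make the averaging step concrete, here is a minimal Python sketch. It is not the module's code: the function names and the cut-off value of 80 are assumptions made up for the example, and the real module may weight or combine its filters differently.

```python
from statistics import mean

# Hypothetical cut-off: content scoring at or above it is treated as spam.
SPAM_THRESHOLD = 80


def combined_score(filter_scores):
    """Average the independent filter scores (each on the 1-99 scale)."""
    if not filter_scores:
        raise ValueError("at least one filter score is required")
    for score in filter_scores:
        if not 1 <= score <= 99:
            raise ValueError(f"score out of range 1-99: {score}")
    return mean(filter_scores)


def is_spam(filter_scores, threshold=SPAM_THRESHOLD):
    """Reject the content when the averaged score reaches the threshold."""
    return combined_score(filter_scores) >= threshold
```

With three filters returning 90, 70 and 80, the average is 80 and the content would be rejected under this assumed threshold, while scores of 10, 20 and 30 would pass.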
- The Hidden CAPTCHA
This is a very interesting approach that stops most bots. The module displays a CAPTCHA that is visible only to bots, not to humans. The goal is to leave it blank in order to succeed. Most bots, however, are not smart enough to check whether the field is actually visible, so they fail by solving the CAPTCHA.
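The server-side check behind this idea can be sketched roughly as follows. This is an illustration in Python rather than the module's own code, and the field name `website_url` and the function `looks_like_bot` are hypothetical; the form would contain that field but hide it from human visitors with CSS.

```python
# Hypothetical name of the hidden form field; humans never see it.
HIDDEN_FIELD = "website_url"


def looks_like_bot(form_data: dict) -> bool:
    """Return True when the hidden field was filled in.

    A human leaves the invisible field empty; a bot that fills in
    every field it finds gives itself away.
    """
    value = form_data.get(HIDDEN_FIELD, "")
    return bool(str(value).strip())
```

A submission such as `{"comment": "Nice post", "website_url": ""}` passes, while one that puts any text in `website_url` is flagged.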