Spam Bots: From Tinder To SEO Webspam, What Can We Do?

What To Do About Spam Bots?
As spam bots proliferate across online business, Recaptcha is one of the most practical ways to prevent automated bots from completing processes that should only be carried out by humans. It can annoy users because it disrupts their flow and overall experience with your site, but you need it to detect the bots that are coming to your website.



Estimated Read Time: 6 minutes + 17 minute recommended video

Not everyone pays attention to Google’s Recaptcha panels, but you have all experienced them.

They don’t always look exactly the same, but they are consistent nonetheless.

In fact, seven days after I post this blog, all commenters will need to complete one.

Why Recaptchas Are Important

The primary goal of a Recaptcha is to prevent automated bots from completing processes that should only be completed by a human being.

Some of these tasks include things such as:

  • Posting Comments on Websites
  • Creating Social Media Profiles
  • Competing in Online Games, such as Poker
  • Filling Out Forms on Websites

The internet would be a far less hospitable place if bots were allowed to run rampant and take control of large portions of the web.

If you run a website, you know how annoying it is to see a comment on your new blog post, only to find it is actually a garbled blob of nonsense text linking to something ridiculous (such as get-rich-quick schemes, Viagra e-commerce stores and so on).

If you have ever used Facebook and been repeatedly added by random strangers, you know how frustrating it can get.

What’s worse, if you use dating apps such as Tinder, you will regularly come into contact with a spam bot that is trying to get you to visit another website. Many users have previously complained about this persistent problem.

Furthermore, if you enjoy a few rounds of online poker, you may be competing against a sophisticated AI that is much more capable of reading and calculating odds than you are.

Recaptchas play a vital role in reducing spam bots' ability to carry out these tasks.

Note that this means better programmers are creating more advanced spam bots that can bypass more Recaptchas. It can become a bit of a never-ending cycle.

My Biggest Problem With Spam Bots

Working as an Inbound Marketer involves many tasks and processes.

Many of these tasks revolve around SEO.

Every SEO practitioner must decide for themselves if they are going to operate in a white hat manner, or a black hat manner.

Essentially, whether they are going to play by the rules or try to game the system.

Playing by the rules produces slower, more consistent and reliable results.

Gaming the system can produce fast, inconsistent and often short-lived results.

You will often hear hordes of business owners testifying that SEO just can't be trusted as a long-term marketing strategy, and that the results could swing in the opposite direction at any moment.

More often than not, the results that suddenly turned against them were a byproduct of black hat SEO.

Spam Bots play a huge role in the black hat SEO world.

How Spam Bots Help Black Hat SEO

I’m going to break this down into two categories: the Unsophisticated Method and the Refined Method.

If you would like a more in-depth look at the Black Hat methodology, Tim Soulo from Ahrefs recently put up a great post that explains how this shadow economy operates.

The unsophisticated method consists of using bots to perform tasks such as:

  • Create Links from Blog Comments
  • Create Links from Spam Directories
  • Maliciously Hack Sites to Place Links

The Google algorithms have grown more and more capable of identifying these practices, and therefore they are quickly caught out and rectified.

The more refined approach uses bots to manage and create Private Blog Networks, often referred to as PBNs.

A private blog network is a series of websites that exist solely to pass authority to other websites.

Often they are built on expired domains that previously held authority.

When a domain expires, it is placed in a public auction. Dedicated search tools then list the available domains along with metrics such as:

  • how many links it has
  • how it is already ranking
  • trust metrics for the links it has

From there, the PBN owner can purchase an aged domain (which has benefits) that has existing links from other trustworthy domains.

These networks are extremely difficult for Google to detect because they are designed to fly under the radar.

This doesn’t mean Google is not trying.

How bots help a Black Hat PBN:

  • Updating and managing the websites
  • “spinning content” so that it evades detection

How Recaptcha Could Be Modified to Help Detect PBNs

Spun content has no value whatsoever to a human reader.

When a person encounters spun content, they may think it is just extremely poor English, but more often than not, it is actually the result of an unsophisticated bot that is reproducing content while attempting to avoid detection by text matching software.

One of the main reasons spun content does not work properly is that it produces unidiomatic phrases that a human would never use.
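To make this concrete, here is a minimal sketch of how a naive spinner might work, assuming it does nothing more than swap words for entries in a synonym table. The `SYNONYMS` table and the `spin` function are hypothetical and written only for illustration; real spinning tools are more elaborate, but the failure mode is similar.

```python
import random

# Hypothetical synonym table, invented for this illustration.
SYNONYMS = {
    "cats": ["felines"],
    "associated": ["connected", "affiliated", "linked"],
    "agility": ["nimbleness", "quickness"],
}


def spin(sentence, rng):
    """Replace every word that has a synonym entry with a random synonym.

    The swap ignores context, so the preposition that followed the
    original word stays in place even when it no longer fits.
    """
    words = []
    for word in sentence.split():
        options = SYNONYMS.get(word.lower())
        words.append(rng.choice(options) if options else word)
    return " ".join(words)


rng = random.Random(0)  # seeded so the sketch is repeatable
original = "cats are associated with agility"
spun = spin(original, rng)
print(spun)
```

Because the swap is blind to context, the spinner can turn "associated with" into something like "affiliated with" or "connected with", and the surrounding sentence is never adjusted to match. Text matching software sees a different string, while a human sees a sentence that feels slightly off.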

Languages employ idiom (the syntactical, grammatical, or structural form peculiar to a language) in a consistent way.

Essentially, the way we structure our sentences is not the only way we could logically construct them. The structure comes about naturally as part of a collective culture of acquired practice.

The Wikipedia example sums this up perfectly well:

“For example, although in English it is idiomatic (accepted as structurally correct) to say ‘cats are associated with agility’, other forms could have developed, such as ‘cats associate toward agility’ or ‘cats are associated of agility’.” (Wikipedia)

Since bots are machines, they do not pick up on the nuances of languages (though they are getting better every day).

Therefore, if we used a reCAPTCHA form that required the user to select the correct idiomatic sentence structure of a language, it could help to detect bots.
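As a rough sketch of the idea, and not a description of how reCAPTCHA actually works, a challenge like this could be modelled as one idiomatic sentence shuffled in among unidiomatic variants. The names `IdiomChallenge`, `build_challenge` and `check_answer` are hypothetical; the sentences come from the Wikipedia example quoted earlier.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class IdiomChallenge:
    options: tuple      # sentences shown to the user, in display order
    answer_index: int   # position of the idiomatic sentence in options


def build_challenge(idiomatic, distractors, rng):
    """Shuffle the idiomatic sentence in among the distractors."""
    options = [idiomatic, *distractors]
    rng.shuffle(options)
    return IdiomChallenge(tuple(options), options.index(idiomatic))


def check_answer(challenge, selected_index):
    """Return True when the user picked the idiomatic sentence."""
    return selected_index == challenge.answer_index


challenge = build_challenge(
    "cats are associated with agility",
    ["cats associate toward agility", "cats are associated of agility"],
    random.Random(42),  # seeded only to keep the sketch repeatable
)

for position, sentence in enumerate(challenge.options):
    print(position, sentence)
```

A single three-option question could be guessed by a bot one time in three, so a real form would presumably need several questions in a row, or would combine this with other signals.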

A while ago, I watched a very inspirational TED Talk.

The talk was by Luis von Ahn, the creator of CAPTCHA forms, who is now working on an amazing project called Duolingo.

You can watch his TED Talk below, running time just over 16 minutes:

The Origin of My Idea

When I discovered yet another competitor’s PBN this week, I was feeling down in the dumps.

From the research I completed, I could see that the options I had weren’t that great.

I could take one of the following options:

  • Outwork them with white hat methodology, and trust the long game
  • Tattle on them, registering a webspam complaint
  • Accept that this is how it is done, and replicate it

I’ve always been a firm believer in white hat methodology, but it has always been frustrating to see black hat SEO reaping benefits and results.

I had never been more drawn to the dark side than this week.

I lay awake, contemplating the following:

  • Should I simply join them and create a PBN?
  • How would I do it?
  • Would it work?
  • What would be needed?
  • Should I take that step?

I tried to think of ways that Google could identify a PBN, and wondered what they are already doing at the moment.

I ended up recalling this particular TED talk, and it occurred to me that a large-scale PBN relies heavily on spun content in order to maintain profitability (because creating quality content is just not cost-effective when you are running thousands of sites).

I started thinking about the idea of identifying spun content via a reCAPTCHA, essentially bringing millions of humans on board, one sentence at a time.
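Here is a minimal sketch of what "one sentence at a time" could look like on the collecting side, assuming each solved challenge yields a vote on whether a sentence reads naturally. The function name `label_sentences`, the sentence ids, and the `min_votes` and `threshold` values are all assumptions made up for illustration.

```python
from collections import Counter


def label_sentences(votes, min_votes=3, threshold=0.7):
    """Aggregate human votes into a label per sentence.

    votes is an iterable of (sentence_id, is_natural) pairs.
    A sentence stays "undecided" until it has at least min_votes votes
    and a clear majority either way.
    """
    natural = Counter()
    total = Counter()
    for sentence_id, is_natural in votes:
        total[sentence_id] += 1
        if is_natural:
            natural[sentence_id] += 1

    labels = {}
    for sentence_id, count in total.items():
        share = natural[sentence_id] / count
        if count < min_votes:
            labels[sentence_id] = "undecided"
        elif share >= threshold:
            labels[sentence_id] = "natural"
        elif share <= 1 - threshold:
            labels[sentence_id] = "suspect"
        else:
            labels[sentence_id] = "undecided"
    return labels


# Hypothetical votes: s1 reads naturally, s2 mostly does not, s3 has too few votes.
votes = [
    ("s1", True), ("s1", True), ("s1", True),
    ("s2", False), ("s2", False), ("s2", False), ("s2", True),
    ("s3", True), ("s3", False),
]
labels = label_sentences(votes)
print(labels)
```

Requiring several agreeing votes before labelling anything matters here, because some of the "humans" answering will themselves be bots or careless users.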

I would love to generate a conversation on this, so feel free to contribute to the comments section below, or share your thoughts with these hashtags: #pbn #recaptcha

Looking for an expert to set up Recaptcha for your business and improve its security?