Online search engines are meant to pick out high-quality sites amid the sea of knockoffs that repeat information produced elsewhere, but even they get overwhelmed. As recently as March, for example, the first 10 results from a Google search for “how to organize your desktop” contained nine links to pages churned out by “content farms” — Web sites that publish reams of articles that aim simply to attract clicks and advertising dollars.
That prompted New Scientist magazine to ask computer scientist Richard McCreadie at the University of Glasgow to look into the issue. The results show that Google and Microsoft’s Bing seem to be regaining the upper hand in the fight against content farms.
Most of the credit has been given to Google, which announced in February that it had updated its search algorithm in a bid to prioritize sites that publish original and well-researched material. The company won’t provide details, but many site owners noticed that the update penalized sites that publish multiple, near-identical articles, a favorite tactic of content farms.
To test how successful the new system is, McCreadie ran 50 search queries known to be targets of content farmers, such as “how to train for a marathon,” in March and then again in August. Then he paid people to examine the results for links to low-quality sites, where “low quality” was defined as uninformative sites whose primary function appears to be displaying ads.
The results are striking. In the case of the marathon query, sites that contained lists of generic tips, such as “invest in a good pair of running shoes,” were present in the top 10 in March but had disappeared by August, while high-quality sources, such as Runner’s World magazine, now appear near the top. Similar trends were found throughout the 50 queries.
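For readers who want to see how a before-and-after comparison of this kind might be tallied, here is a minimal sketch in Python. It is not McCreadie's method, whose scoring details are not described here. It assumes each search result has simply been marked low quality or not by a human reviewer, and it averages the share of flagged links per query. The function name `low_quality_share`, the query names and every label below are invented placeholders, not data from the study.

```python
from statistics import mean


def low_quality_share(labels_by_query):
    """Average fraction of results flagged low quality, taken across queries.

    labels_by_query maps a query string to a list of booleans, one per
    ranked result, where True means a reviewer flagged the link as low
    quality. Queries with no labeled results are skipped.
    """
    return mean(
        sum(flags) / len(flags)
        for flags in labels_by_query.values()
        if flags
    )


# Hypothetical reviewer labels for two made-up queries (illustration only).
march_labels = {
    "query A": [True, True, False, True, False],
    "query B": [True, False, False, False, False],
}
august_labels = {
    "query A": [False, False, False, True, False],
    "query B": [False, False, False, False, False],
}

march_share = low_quality_share(march_labels)
august_share = low_quality_share(august_labels)

print(f"March share of low-quality links:  {march_share:.0%}")
print(f"August share of low-quality links: {august_share:.0%}")
print(f"Change: {august_share - march_share:+.0%}")
```

Averaging per query, rather than pooling every link together, keeps a single query with many flagged results from dominating the total. That is a design choice made for this sketch, not something the study is known to have done.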