Content Farms, Echo Sites, and the Future of Online Content
Investors today snapped up shares of Demand Media, whose IPO gave the company a valuation of close to $2 billion — more than that of the New York Times Co.
Demand Media is a prime example of a new kind of media entity, one made possible by the Internet and by the dominant use of a single search engine — Google — to access content on the web.
Demand Media’s properties include eHow, Cracked.com, Answerbag, and others. The company oversees what are typically characterized as “content farms”: sites that generate large volumes of quickly produced, content-light material. This flood of content is tailored to the topics for which people frequently search (“driven by demand,” as Demand Media’s tag line states). By paying low rates for content, optimizing for Google’s search engine, and generating many pages daily, sites in the Demand Media network have become among the most visited on the web. Furthermore, Google’s PageRank algorithm often results in these pages being prominently featured in Google search results. Running Google’s AdSense ads on this rapidly expanding universe of content then brings in the revenue.
Richard MacManus at ReadWriteWeb has been covering this trend for some time. Over a year ago, MacManus urged Google to “wake up and smell the coffee” and “find better ways to surface quality content.”
Earlier this week in a blog post titled “Google Search and Search Engine Spam,” Google’s Matt Cutts outlined the company’s plans to reduce the amount of “webspam” in Google’s search results.
Cutts readily admits that “As we’ve increased both our size and freshness in recent months, we’ve naturally indexed a lot of good content and some spam as well.” He goes on to say that Google has recently taken steps to “[make] it harder for spammy on-page content to rank highly.”
Beyond tackling traditional methods of gaming search engines, such as keyword misuse, content “screen-scraped” from other sites, and hacked sites, Cutts also mentions Google’s recent moves to control the rankings of content farms “with shallow or low-quality content.”
The money Google makes from AdSense placements, which are typically the only source of revenue for content farms, has caused some to question whether Google is truly motivated to eliminate such practices. Fortune reports that AdSense accounts for 30% of Google’s revenue.
Google’s Cutts firmly denies the assertion that Google actively assists these sites:
One misconception that we’ve seen in the last few weeks is the idea that Google doesn’t take as strong action on spammy content in our index if those sites are serving Google ads. To be crystal clear:
- Google absolutely takes action on sites that violate our quality guidelines regardless of whether they have ads powered by Google;
- Displaying Google ads does not help a site’s rankings in Google; and
- Buying Google ads does not increase a site’s rankings in Google’s search results.
These principles have always applied, but it’s important to affirm they still hold true.
While one can gripe about the quality of content on sites like Demand Media’s and other content farms like Answers.com, they do, at least, produce original content. Other, less reputable sites merely grab content from elsewhere on the web to automatically generate pages to host Google’s AdSense ads.
Even a high-end site like the Huffington Post, which provides a valuable service by highlighting and summarizing interesting content from around the web, generates much of its revenue based on the original work of others. To the extent that, as the New York Times reported nearly two years ago, the “Huffington Post summary of a Washington Post or a CNN.com report may appear ahead of the original article” in Google’s search results, something is wrong with how the search engine prioritizes content.
While it may be difficult to algorithmically assess “authoritativeness” in the abstract sense, one would think Google could, at the very least, differentiate the primary source of content from sites that echo or abstract it.
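To make the idea concrete, here is a minimal sketch of one way an echo might be told apart from its source. It is not a description of how Google works. It assumes a crawler records when it first saw each page, and it treats a later page as an echo if most of its three-word phrases already appear in an earlier page. The Page class, the find_echoes function, the 0.5 threshold, and the example URLs are all hypothetical, chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass
class Page:
    url: str              # hypothetical URL, for illustration only
    first_seen: datetime  # assumed: when a crawler first saw the page
    text: str


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping k-word windows ("shingles") in text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(a: set, b: set) -> float:
    """Fraction of a's shingles that also appear in b (0.0 if a is empty)."""
    return len(a & b) / len(a) if a else 0.0


def find_echoes(pages: list[Page], threshold: float = 0.5) -> list[tuple[str, str]]:
    """Return (echo_url, source_url) pairs.

    A page counts as an echo when at least `threshold` of its shingles
    already appear in a page that was seen earlier. The threshold is an
    arbitrary assumption, not a tuned value.
    """
    ordered = sorted(pages, key=lambda p: p.first_seen)
    sets = [shingles(p.text) for p in ordered]
    echoes = []
    for j in range(1, len(ordered)):
        for i in range(j):  # only compare against earlier pages
            if containment(sets[j], sets[i]) >= threshold:
                echoes.append((ordered[j].url, ordered[i].url))
                break  # attribute the echo to the earliest matching page
    return echoes


original = Page("https://example.com/original", datetime(2011, 1, 20),
                "the quick brown fox jumps over the lazy dog near the river bank")
echo = Page("https://example.org/echo", datetime(2011, 1, 21),
            "report says the quick brown fox jumps over the lazy dog")
unrelated = Page("https://example.net/other", datetime(2011, 1, 22),
                 "a completely different story about search engines and ads")

print(find_echoes([echo, unrelated, original]))
```

A real system would face harder cases than this sketch handles: paraphrased summaries that share few exact phrases, unreliable first-seen times, and originals that are crawled after their copies.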
Google’s recent emphasis on “freshness” has exacerbated the problem. In an attempt to address the rise of “real-time” social media such as Twitter and Facebook, Google’s page ranking algorithm apparently gives priority to how current a piece of content is. But in many cases the most recent item may not be the most authoritative. Worse yet, this plays directly into the hands of content farms and aggregators that pump out large volumes of specious content at a furious rate.
The conclusion of the 1973 film Soylent Green (spoiler alert for those who have never seen it) is disturbing not only for its allusion to institutionalized cannibalism but also for the more abstract notion of a society feeding on itself: a society so devoid of an external source of nutrition that it has to consume itself in order to survive. If the economic incentives continue to favor those who churn out low-value content or repurpose the material of others, rather than those who generate meaningful original content, the future of the web may be equally bleak.