As SEO professionals, you know that duplicate content hurts your rankings on the SERPs. Unfortunately, content scraping – more commonly known as stolen content – is a prevalent problem in the online world. Today, I want to answer one of the most frequent questions we get from our partners: can stolen content lead to a Panda penalty?
Scraping for Rankings
It all boils down to ranking. Primarily, the goal of scrapers is to drive more traffic to their site and rank fast. Stealing high quality content can be one of the easiest ways to rise to the top of search results. Published articles from top ranking pages and posts from popular bloggers often fall prey to these acts.
Scraping is prevalent because it is so easy. Cutting and pasting is all it takes to copy your materials, and more sophisticated scrapers buy special software that automates the stealing process. Another reason is that search engines have yet to develop an algorithm that reliably distinguishes original content from scraped copies.
The Panda Effect
When Google detects duplicate content, your rankings can drop significantly, because its algorithms may treat your content as spam. People searching for an article you already published may see the exact same post on another web page, causing you to lose significant site traffic. This could have a serious negative impact on your sales, especially if you depend heavily on organic search traffic.
Most scrapers are good enough at what they do to know which elements to remove from the content they steal (such as the name of the author or company). This makes it more difficult for search engines to go after them. With the vast amount of new information being shared on the internet every minute, Google cannot monitor every new article that gets published in real time. While Google has been aggressive in mitigating content scraping, the search giant can end up penalizing ALL websites with duplicate content on them, because it puts more weight on protecting its end users from having to deal with identical content. It could take time before Google detects that your content was the one posted first, and by then the damage will have been done.
Protecting Your Site from Scrapers
There’s no surefire way to protect your web content from all these scrapers. What you can do is track whether your content has already been stolen. Taking a distinctive phrase from one of your original articles and searching for it in Google will show you whether someone has copied it. But checking phrases by hand, one article at a time, is not the smartest way to hunt down scrapers.
Google Alerts is a great tool to help you determine whether your content has been duplicated. Just type in the title of your post and you’ll receive an alert email when the same title shows up in Google search results. You may also use a service such as Copyscape or CopyGator. Entering the URL of your content into Copyscape’s free search lets you track its duplicates on the web.
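Once an alert or a search turns up a suspicious page, it helps to put a number on how much of your article it actually reuses. The Python sketch below is one rough way to do that, by counting how many five-word sequences from your original also appear in the suspect text. It is only an illustration: the function names and the sample text are made up for this example, it is not how Google, Copyscape or CopyGator measure duplication, and it assumes you have already pasted both texts in as plain strings.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences of length `size` in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole thing as a single unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_ratio(original, suspect, size=5):
    """Fraction of the original's word sequences that also appear in suspect."""
    original_shingles = shingles(original, size)
    if not original_shingles:
        return 0.0
    shared = original_shingles & shingles(suspect, size)
    return len(shared) / len(original_shingles)


# Hypothetical sample: the copy drops the byline, as many scrapers do.
original = ("By Jane Doe. Stealing high quality content can be one of the "
            "easiest ways to rise to the top of search results.")
suspect = ("Stealing high quality content can be one of the easiest ways "
           "to rise to the top of search results.")

print(f"Share of original reused: {overlap_ratio(original, suspect):.0%}")
```

A ratio close to 100% suggests a straight copy even when the author name has been stripped out, while a low ratio usually means a quote or a coincidence. Where you draw the line is a judgment call, not an official threshold.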
For its part, Google has since addressed the issue of Panda penalizing all websites with duplicate content. It sent out questionnaires asking users to define a quality website and report scrapers, and it keeps updating the Panda algorithm based on the responses so that its algorithms mimic human browsing behavior as closely as possible.
Conclusion
When your content is stolen, it puts your site at risk of getting punished by Google, even if you’re the one who wrote it. Some scrapers can even trick Panda and make it appear as if they’re the original authors of the content they stole from you. The good news is that Google is rapidly getting better at identifying original content sources.
Under the Digital Millennium Copyright Act (DMCA), you can file a takedown request asking Google to deindex the pages that copied your work. Google is reliable when it comes to acting on DMCA requests. While the scraper’s entire site may not be deindexed, what matters is that the duplicate content gets removed from the search results so you start earning back traffic – and potential leads – to your site.