Google’s search engine updates are always aimed at providing users with the best results for their queries. Users have made it clear how disappointing it is to perform a search and find every result containing the same exact content, or content with only minor differences. When the same or very similar content is crawled and indexed at different URLs, it is considered duplicate content. Google has long warned that this, whether intentional or not, will hurt a site’s search visibility.
With all of Google’s updates, many site owners worry about being penalized for duplicate content. In reality, there is no definitive penalty, such as having your website banned, unless the duplicate content is deemed “malicious,” meaning it was created with the intent to deceive and manipulate search engine results. Still, duplicate content leaves your website vulnerable to that judgment, even if nothing was duplicated on purpose.
The biggest penalty for duplicate content continues to be lost search visibility (lower rankings), a major component of SEO. Duplicate content can lead to decreased page authority, a slower crawl rate, and an overall loss of visibility in search engine results. It forces every page carrying the duplicated content to compete for important ranking metrics, such as authority and popularity, and because those metrics are split between the pages, each one ranks lower than it otherwise would. Recognizing and correcting instances of duplicate content ensures that your pages have the greatest possible potential to rank in results.
Understand Where Duplicate Content Comes From
Most instances of duplicate content are not malicious. There are numerous ways to create it, but the important thing to remember is that it can be fixed.
Multiple URLs for the Same Content
Chances are a website is powered by a database maintained by a developer. In that database there is one article that can be retrieved through multiple URLs. This means an article about Topic A shows up as http://www.homepage.com/topic-a/ and as http://www.homepage.com/article-category/topic-a/. Each extension (the part that comes after the .com/) points to the same article within the database, so to a developer, only one article exists. A search engine, however, treats each full URL (domain plus extension) as a unique identifier of content, so these are seen as two separate and distinct articles. The different extensions the database adds to the domain therefore create duplicate content whenever they are crawled and not prevented from being indexed.
Mobile- and Printer-Friendly Versions
Many websites offer printer-friendly and mobile-accessible versions of their content. If these versions are not prevented from being indexed when they are crawled, a search engine will see them as separate entities, creating duplicate content. The search engine then has to choose which version to display: the full page with ads, links, and graphics (the one a site owner would want people to land on), or the stripped-down version containing only the content.
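One common fix is to tell crawlers not to index the alternate version at all. Below is a minimal sketch of that idea, assuming a Flask-style application (the route and markup are purely illustrative); WordPress and other platforms expose the same noindex option through plugins or theme settings.

```python
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/print/<slug>")
def printer_friendly(slug):
    # Stripped-down, printer-friendly rendering of the article (placeholder markup).
    html = f"<html><body><h1>{slug}</h1><p>Article text goes here.</p></body></html>"
    response = make_response(html)
    # "noindex" asks search engines not to index this version, leaving only the
    # full page to compete in the rankings.
    response.headers["X-Robots-Tag"] = "noindex"
    return response
```

A meta robots “noindex” tag in the page’s head accomplishes the same thing if you would rather work in the markup than in the response headers.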
URL Parameters for Analytics
Keeping track of the source that sent a user to a site is important, especially for SEO. However, parameters used for tracking and sorting can create duplicate content if they are not implemented correctly. This again comes down to search engines treating each distinct URL as a separate web page: no matter what parameters are used, if they produce different URLs for different users without changing the content, they will create duplicate pages.
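To make the problem concrete, the sketch below (with a made-up list of parameter names) shows how a tracked URL collapses back to the one clean address that should be indexed; that clean address is what you would point a canonical tag or redirect at.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical tracking parameters; adjust to whatever your analytics setup appends.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}

def clean_url(url: str) -> str:
    """Strip tracking parameters so every visitor's URL maps to one indexable address."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

print(clean_url("http://www.homepage.com/topic-a/?utm_source=newsletter&utm_campaign=spring"))
# -> http://www.homepage.com/topic-a/
```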
Session IDs
Websites that need to keep user information as visitors browse through different pages, for instance to store items in a wish list or shopping cart, assign each user a session. Basically, a session is a short history of what a user did while on a website, tied to a specific ID. Usually this is handled with cookies, but occasionally cookies are not set up properly, or at all, causing the site to append session IDs to the URL by default. Each session ID is unique, which makes each URL unique, resulting in duplicate content.
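The straightforward remedy is to keep the session ID in a cookie rather than in the URL. A minimal sketch of that, again assuming a Flask-style application (the cart route is purely illustrative):

```python
from flask import Flask, session

app = Flask(__name__)
app.secret_key = "replace-with-a-real-secret"  # required to sign the session cookie

@app.route("/add-to-cart/<item_id>")
def add_to_cart(item_id):
    # The cart lives in a signed cookie, so the URL stays the same for every visitor
    # and no session ID ever leaks into the addresses a crawler sees.
    cart = session.setdefault("cart", [])
    cart.append(item_id)
    session.modified = True  # tell Flask the list inside the session changed
    return f"{len(cart)} item(s) in your cart"
```

If URLs with session IDs have already been crawled, a canonical tag (covered below) pointing to the clean URL tells the search engine which version to keep.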
Never Fear!
These are only a few examples of the many ways duplicate content can unknowingly or unintentionally be created. The important thing to remember is that all is not lost. First, you must discover whether you even have duplicate content. If it turns out you do, stay calm and check out these two common suggestions for dealing with it.
301 Redirects
These are useful when you have several URLs that all lead to your homepage (e.g., homepage.com/home, home.homepage.com, www.homepage.com), when you are combining two websites and want old URLs to lead to the new pages, or when you have moved your site to a new domain and want a seamless transition. The 301 status code tells search engines that a page has permanently moved to a new location: the one definitive page that should be credited with all ranking metrics. When one page is split between many URLs, its ranking metrics are split between them as well, hurting its overall ranking and visibility; the duplicate URLs compete with one another in results, and each competes poorly because its relevance is diluted by the duplication. Consolidating them with a 301 redirect strengthens the surviving page, concentrating relevance and popularity, both important ranking metrics, and creating the potential for one highly ranked page.
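As a concrete illustration, here is a minimal sketch of a permanent redirect, assuming a Flask-style application with a couple of hypothetical legacy homepage paths; on most sites the same thing is configured in the web server or CMS rather than in application code.

```python
from flask import Flask, redirect

app = Flask(__name__)

@app.route("/")
def homepage():
    return "Welcome to the one definitive homepage."

# Hypothetical legacy paths that used to serve the same homepage content.
@app.route("/home")
@app.route("/index.html")
def old_homepage():
    # 301 = moved permanently: browsers follow it, and search engines transfer
    # the old URLs' ranking signals to the destination instead of splitting them.
    return redirect("/", code=301)
```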
Rel=“canonical”
The word canonical comes from the religious term canon, the set of books accepted as genuine scripture and viewed as the authoritative gospel. In SEO, a rel=“canonical” tag indicates to a search engine the authoritative version among several web pages with similar content. This method is especially useful when you have several pages with similar content that you don’t want to get rid of. For example, your website might list the same products on pages sorted alphabetically, by lowest or highest price, by user ratings, and so on. A rel=“canonical” tag consolidates ranking signals much like a 301 redirect, but it is placed in the head of each duplicate page as a link element whose href attribute points to the canonical version, the one that should have all link and content metrics applied to it. A canonical tag is also friendlier for website users because it does not redirect them to another page; everything a 301 redirect would accomplish happens behind the scenes.
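A sketch of what that looks like in practice, using a hypothetical product-listing URL: a small helper that builds the link element to place in the head of every sorted or filtered variant of the page.

```python
CANONICAL_URL = "https://www.homepage.com/products/"  # hypothetical authoritative listing

def canonical_tag(canonical_url: str = CANONICAL_URL) -> str:
    """Return the <link> element to place in the <head> of every variant of the page."""
    return f'<link rel="canonical" href="{canonical_url}" />'

# The alphabetical, price-sorted, and ratings-sorted pages all carry the same tag,
# so their link and content metrics accrue to the one canonical listing.
for variant in ("?sort=alpha", "?sort=price-asc", "?sort=rating"):
    print(f"https://www.homepage.com/products/{variant}  ->  {canonical_tag()}")
```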
But First Things First, Identify It!
Before you attempt 301 redirects or canonical tags, you must tackle the difficult task of actually finding the duplicate content. It does not only mean entire pages that are exactly the same or very similar; it can be something as simple as two pages sharing the same title. And if your efforts to create new and original content as part of your SEO campaign are going well, sifting through all of it for duplicates can feel like searching for the proverbial needle in a haystack, except you have to find two or more needles that are exactly alike. Yikes! Luckily, Optimum 7 has recently updated its Duplicate Post Remover, a plugin for WordPress that offers a simple way to identify duplicate titles across pages and posts. Learn more about the plugin and download it here.
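If you would rather run a quick spot check by hand first, a short script along these lines (the URLs are placeholders) will flag pages that share an identical title; the plugin simply automates the same idea across an entire WordPress site.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.request import urlopen

class TitleParser(HTMLParser):
    """Collect the text inside a page's <title> element."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False
    def handle_data(self, data):
        if self.in_title:
            self.title += data

# Placeholder URLs; swap in the pages you want to compare.
urls = [
    "http://www.homepage.com/topic-a/",
    "http://www.homepage.com/article-category/topic-a/",
]

titles = defaultdict(list)
for url in urls:
    parser = TitleParser()
    parser.feed(urlopen(url).read().decode("utf-8", errors="ignore"))
    titles[parser.title.strip()].append(url)

for title, pages in titles.items():
    if len(pages) > 1:
        print(f"Duplicate title {title!r} found on: {', '.join(pages)}")
```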
If your duplicate content is giving you a headache, or you’re disappointed by your website’s performance in generating leads, contact Optimum 7, a highly consultative internet marketing company that specializes in Search Engine Optimization (SEO), Pay per Click (PPC) Management, eCommerce, Web 2.0 Programming and Design, Social Media and Conversion Optimization, Reputation Management, and Affiliate and Email Marketing. Don’t let duplicate content or any other internet marketing dilemma stop you from achieving your goals; get in touch with us today!