Duplicate content is one of the most dreaded topics in SEO. Site owners flinch at the mere mention of it.
Ask anyone about duplicate content, including some of the self-proclaimed marketing gurus, and most will have you believe your website or blog is a veritable time bomb. It’s just a matter of time before you’re slapped with a Google penalty.
But how much of this statement is true?
Very little – while duplicate content can still affect SEO, it most certainly won’t get your site penalised, except for a few extreme cases.
What’s Duplicate Content?
Duplicate content is the term used to describe identical or very similar content that appears in multiple locations or URLs on the web. It confuses search engines because they can’t tell which of the URLs to display.
This can hurt how a page is ranked. Worse still, when other websites start linking to the different versions of the content, search engines get confused even further.
Let’s Illustrate this Using a Real-life Example
Duplicate content is like a crossroads with different road signs pointing in different directions, yet all leading to the same destination.
Which direction should you follow?
Worse, the destinations themselves differ, but only slightly. As a reader, this isn’t much of an issue because it’s essentially the same content. But search engines have to choose which version to display in the search engine results pages and make sure they don’t show the same piece of content twice.
Will Google Penalise You for Duplicate Content?
Duplicate content isn’t the same as copied content. While Google will not hesitate to penalise you for copied content, they will not penalise you for having duplicate content on your site.
While copied content is deliberate, duplicate content is mostly caused by technical faults.
Google is clear on this. They won’t penalise your site for duplicate content. But in extreme cases, where hundreds of your pages have been duplicated, you’re hanging by a thread.
Google always favours websites with original, high-quality content. Suppose you take other people’s content and republish it on your site, spinning a few sentences around and splashing a few keywords here and there. In that case, Google will assume that you’re trying to game its system and, as such, drop your rank in its results pages.
How Much Duplicate Content is Acceptable?
The best thing is to have no duplicate content at all.
Always strive to publish original content. And if you must post duplicate content, then the least you could do is to be smart about it.
Go through your content sentence by sentence and reword everything. As a rule of thumb, you’re generally safe if no more than 10% of your content is duplicated.
Here’s what Matt Cutts, the former head of search quality at Google, had to say about duplicate content:
According to Matt, 25% to 30% of the web consists of duplicate content. He even went on to add that Google never treats duplicate content as spam, and will never penalise your site for it unless it turns out you’re using it to manipulate search engines for a higher ranking.
Can a Duplicated Article Outrank the Original?
Yes, it can.
But only in rare cases, and only when the website duplicating your content has higher authority than yours.
Must You Block Google from Indexing Duplicate Content?
There’s no need for this.
Google also has an interesting post on this (on how to handle identical posts on your site).
Google advises you against blocking identical or similar content on your site, whether it’s by robots.txt or any other method.
Does Google Penalise Sites for Syndicating Content?
No. Google has made it clear that it does not penalise sites that syndicate content.
Here’s what Google has to say about syndicating your content to other sites (paraphrased statement):
“Be careful with how you syndicate your content. Google will analyse all the versions of the content, and choose the one they think is most valuable to the user. You’re reminded that the version they choose to serve the user might not be the version you preferred.
You may also want to make sure the content you syndicate links back to the original content on your site.
Better yet, ask those syndicating your materials to use the noindex meta tag so that no search engine ends up indexing the syndicated versions of your content (Google 2020).”
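For reference, here’s what that looks like in practice: a minimal sketch, assuming the partner places it in the <head> of their republished copy.

<!-- In the <head> of the syndicated copy on the partner's site -->
<!-- Tells search engines not to index this version of the article -->
<meta name="robots" content="noindex">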
The Problem with Syndicating Your Content
The problem with content syndication is that you can never be sure whether it will ultimately affect your organic traffic.
Since the content sits on other people’s sites, they’re the ones benefiting from all the positive SEO signals the content generates – not you.
Asking them to link back to your original post might help. But let’s not forget that those links might be considered unnatural.
So, What’s the Best Way Forward?
Ask the sites that republished your articles to add a “rel=canonical” tag pointing back to the original article. That way, your site gets to enjoy all the SEO benefits generated by the republished articles.
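As a rough sketch (the URL is a hypothetical placeholder), the tag the republishing site adds to the <head> of its copy looks like this:

<!-- In the <head> of the republished copy on the partner's site -->
<!-- Points search engines to your original article as the authoritative version -->
<link rel="canonical" href="https://www.yoursite.com/original-article/">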
Can Google Penalise You for Thin Content?
Yes. Google will penalise you for thin content.
Here’s what they have to say about thin content (rephrased):
“Be careful with publishing stubs. Users don’t like it when your site has so many empty pages. You want to avoid using placeholders where possible.
Don’t publish a page for which you have no real content. And if you must create a placeholder page, be sure to use the noindex meta tag to stop Google from indexing it.”
Types of Duplicate Content
No two websites have the same set of characteristics. In other words, not every website is bound to experience the same duplicate content issues.
A static website is small and only has a limited number of pages. A CMS-driven website, on the other hand, has a lot of customised and automated features that might trigger duplicate content.
You might also be dealing with a more prominent site with millions of pages.
In this section of the post, we’ll try to break down different types of duplicate content and group them accordingly:
Internal Technical Duplicate Content Issues
This is where you have the same content appearing in multiple URLs on your website.
It includes issues like:
- Duplicated homepage at html.php and index.html
- Flash microsites, orphaned or broken
- Same content duplicated across multiple pages on your site
- Excessive reuse of snippets or content in a paginated series
- Faceted navigation
- Analytics tracking parameters on your internal page links
- Session ID parameters
- Duplicate content triggered by inbound links
- Inconsistent URLs
- Inconsistent use of trailing slashes
- Numerous similar articles
- Thin content or no content on some of the pages
- Repetitive boilerplate snippets
Duplicate Content Specific to Certain Types of Websites
These issues are specific to particular types of websites, especially ecommerce ones.
- Reuse of review copy
- Duplicated titles and meta descriptions
- Blank category pages
- Product copy distributed across marketplaces and affiliate sites such as Amazon or eBay
- Repeated content in the tabbed sections of your product pages, such as delivery terms or terms and conditions
Hosting-related Duplicate Content
These issues are mostly caused by server misconfiguration.
Examples include:
- No HTTP to HTTPS redirection, meaning your site can be accessed via both protocols
- Site available on both non-www and www
- Indexed staging site
- Indexed load balancers on alternative subdomains, e.g., the IP address or www3
External 3rd Party Duplicate Content
This occurs when a third-party website copies part of your blog or content.
Examples include:
- Lazy syndication of a press release that was initially posted in the news section of your website
- Scrapers republishing your posts through your RSS feed
- Sites that directly copy your posts and publish them on their blogs and websites
- Sites that rewrite your content and pass it off as their own
Own External Duplicate Content
This is where you duplicate your content on other sites or blogs.
- Similar versions or copies of your content on the other sites or blogs you own
- A separate mobile version of your website without a rel=alternate declaration or canonical tag in the header of the primary site
- Official syndicators
- Misconfigured geo-IP detection
- International domains, subfolders, or subdomains without hreflang annotations
Most SEO experts flinch at the mention of “duplicate content penalty.” Online marketers with little or no SEO experience love using this term, even though most of them are unaware of Google’s guidelines on duplicate content. They assume that if an article or even just a paragraph appears twice online, Google penalties must be close behind.
Today, we will debunk three common myths about duplicate content that have been misleading people for years.
Myth #1: Unoriginal Content on Your Site Will Compromise Your Rankings Across Your Domain
Ever since I started offering SEO services, I’ve yet to see real evidence that non-original content affects site ranking, except for one extreme case. In that case, a new website was launched, and one of the personnel at the contracted public relations company copy-pasted the home page text into a press release and distributed it to thousands of platforms, thereby creating hundreds of versions of the original page. This caught the attention of Google, who manually blacklisted the domain.
It was ugly: we were the web development company hired to develop the site, so we were blamed for the misfortune. Luckily, the domain was re-indexed after we filed a reconsideration request and explained the situation to Google’s moderators.
Based on this example, there are three points to note:
- Volume: There were thousands of the same texts on the web
- Timing: All the content was duplicated and published online at the same time
- Context: The content was for a home page of a brand new domain
But this is not what people mean when they use the phrase “duplicate content.” A 1,000-word article on a page of a well-established site is not enough to trigger Google to blacklist the site. Most sites, including authority blogs, periodically repost articles that were first published elsewhere. Sure, they don’t expect the content to rank, but they also know it won’t adversely affect the credibility of the domain.
Myth #2: Scrapers Will Compromise Your Site
One of my friends, a blogger, is very keen on making sure he doesn’t violate Google’s webmaster guidelines. Whenever a scraper site copies one of his blog posts, he quickly disavows any links to his site to avoid hurting the credibility of his domain. He has yet to read Google’s guidelines on disavows and duplicate content.
In the past, I have checked the analytics of several major blogs, and surprisingly, their content gets scraped multiple times per day. The idea that they would keep a full-time employee whose sole role is to watch Google Webmaster Tools and disavow links is outrageous. They know that duplicate content will not affect their credibility.
The bottom line is, scrapers will not help or hurt your domain or brand name. Most scrapers copy-paste the entire article together with the links. Even though the links in the scraped version of the article will not pass authority to your site, you may get occasional referrals.
However, if a scraper’s copy outranks your site, you need to report the case to Google. Submit the complaint using their Scraper Report tool.
Digitally signing your content using Google Authorship will help the search engine know that you are the original owner of the content. No matter how many times an article is scraped, it will still be linked back to you if you signed it.
It is also important to note that there is a difference between copyright infringement and scraped content. Someone might decide to copy your entire site content and claim it to be their own creation.
Plagiarism is the practice of using someone’s work and passing it off as your own. Scrapers will rarely do that, but some could decide to sign their name on your content. That’s illegal and is the main reason why you need to have a copyright symbol in your footer.
Myth #3: Republishing Your Guest Posts on Your Site Will Hurt Its Ranking
I write hundreds of guest posts per month, and it is highly unlikely that my audience sees all of them. So, I often republish the posts on my blog to get as much readership as possible. Personally, I make sure that the content is 100% original, not out of fear of a penalty, but out of a desire to consistently offer value to my readers.
Have you ever written an article for an authority blog? I have, and they usually ask me to wait a few weeks after publication before republishing the post on my site. Some even ask you to add a small HTML tag to the post: the rel=“canonical” tag.
Canonical is a term that is used to mean the “official version.” When you republish an article that was posted on other sites, you can inform search engines of the particular site where the article was originally posted by using a canonical tag.
Apply the Evil Twin Tactic
If the original article you are considering republishing is a “how-to” post, you can change it into a “how not to” post. Base the contents on the original research and concept, but make sure you use different examples and offer more value to the readers. The “evil twin” will look similar to the original, but it will still be original content.
Duplicate content is one of the issues that SEOs and Singapore webmasters have to deal with on a daily basis. Over the years, Google and other search engines have put stringent rules in place to discourage it. Sure, your site’s current ranking and credibility can suffer if a section of the content on your website is not unique. However, there are specific steps you can take to prevent such a scenario.
However, before we look at these tips, it is important to note that Google has in the past stated that duplicate content on a site does not attract a penalty unless it appears that the intent of publishing the material was to manipulate search engine results.
There are three categories of duplicate content namely:
- Exact duplicate: Two URLs with identical content
- Near-duplicates: Content that has small differentiators.
- Cross-domain duplicates: Multiple domains that have exact-match or near-duplicate content
Consequences of Duplicate Content
#1: Wasted Crawls
Search bots land on your website with a limited crawl budget. If you have duplicate content, the bots waste that budget on redundant pages, so some of your essential, unique pages may not be crawled and indexed.
#2: Wasted Link Equity
It is possible for pages with duplicate content to gain link authority and PageRank. However, Google will not rank the duplicate content, so the link authority those pages accumulate goes to waste.
#3: Wrong Listing in Search Engine Results Pages
No one knows exactly how search algorithms work. If you have pages with exact-match or near-duplicate information, you have no control over which pages are ranked or filtered out. The pages you want to rank may therefore be suppressed by other, less relevant pages.
2021 Duplicate Content Guide: 7 Common Issues and How to Resolve Them Amicably
Wasted crawl budgets, falling rankings, and poor user experience are common problems associated with duplicate content. If you have been involved in content marketing for a while, you have probably been advised to avoid reusing text on different website pages. While that is a plausible way of jumping this hurdle, the reality is that the process is more complicated than it seems on the surface.
As mentioned earlier, there are many myths and misconceptions about duplicate content that you should be aware of to avoid making decisions based on the wrong information. For starters, Google defines duplicate content as substantive blocks of text within or across websites that either partially or entirely match.
However, copy-pasted meta descriptions and titles also count as duplicate content. Pinpointing these content pieces can be an uphill task if you don’t have the right tools. We discuss the top five duplicate content checkers later in this article, so read on to find out more about the plagiarism-checking tools we recommend.
The beauty of the online marketing landscape is that it’s dynamic. No matter how well you think you have prepared for a campaign, there is always the risk of an unforeseen development throwing you off balance. So, today we will discuss how to find duplicate content issues on your website and off-site. Then, we will proceed to look at seven plausible ways of fixing the problems.
How to Locate Duplicate Content
The first thing to note is that some duplicate content is easy to spot with the naked eye, while some is concealed in the very code that holds the website assets together. For that reason, it is advisable to use a plagiarism or duplicate checker to find it.
Spotting On-Site Duplicate Content
On-site content refers to everything posted on your website pages, including the dedicated blog page. In 2021, we recommend using Alexa’s SEO Audit Tool to locate the duplicate content that is preventing your website from rising to the top of search engine results pages.
The beauty of using Alexa is that it can find hundreds of different URLs containing the same content. Apart from listing the URLs, it will give you recommendations on what you should do to fix the problem.
Photo Credit: Alexa.com
Not even the tiniest forms of content, such as the meta descriptions, will be spared. The list of URLs is exportable – so you can share it with your SEO team.
Photo Credit: Alexa.com
One of the benefits of fixing all the technical errors highlighted by Alexa is that doing so will supercharge your meta-tag SEO. This improvement will, in turn, increase your click-through rate (CTR), and rest assured that search engines will notice the change and reward you accordingly.
Spotting Off-Site Duplicate Content
Writing content that is 100% unique is one of the guaranteed ways of ensuring that your website does not contain content similar to other websites. You can achieve this by running all content through a plagiarism checker such as Copyscape before publishing.
This quality-check step is of paramount importance, especially if you have outsourced the writing tasks to a content creation company or a remote content writer. Contact the writer or company if you notice some of the articles have plagiarised sections.
Now that you know how to identify on-site and off-site duplicate content, let us shift gears and look at seven ways to fix the issues associated with duplicate content in 2021 and beyond.
7 Sure-Fire Ways of Fixing Duplicate Content Issues Without Hurting Your Website Ranking
Like other content marketing strategies, there is no one-size-fits-all solution to duplicate content. With Google becoming more adamant about plagiarism and changing its algorithms to offer the best user experience, you cannot afford to bury your head in the sand.
Here are seven tested and proven ways of fixing duplicate content issues without hurting your website ranking or compromising user experience.
1: Pagination
Even though Google and other search engines are proactively improving their bots to interpret website pages better, the bots still can’t reliably recognise paginated pages. Instead, they may crawl and perceive them as duplicate content.
A good example is gallery pagination, where each picture in the gallery has a dedicated page. Category pagination also occurs when product lists span several pages of the same online store. The bots may automatically view the paginated pages as duplicates of one another and suppress their rankings.
Photo Credit: Alexa.com
Solution:
The pagination issue is resolved by adding the tags rel="next" and rel="prev". The two tags inform the crawlers that the pages are related by spelling out the relationship between the URLs in the paginated series.
Even as you implement this recommendation, keep in mind that in March 2019, Google announced that these tags no longer affect how it crawls and indexes pages. According to the announcement, most users prefer single-page content.
Nonetheless, a significant number of people prefer content organised this way, so go ahead and include the tags if you have such content on your website. Multi-part content pages are also acceptable for Google Search.
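To illustrate, here’s a minimal sketch for page 2 of a hypothetical three-page gallery (the URLs are placeholders; page 1 would carry only the rel="next" tag, and the last page only rel="prev"):

<!-- In the <head> of page 2 of a three-page paginated series -->
<link rel="prev" href="https://www.example.com/gallery?page=1">
<link rel="next" href="https://www.example.com/gallery?page=3">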
2: Multiple Versions of the Same Pages
If your online business targets more than just Singaporeans, the chances are that you have country-specific domains. The pages on the different domains have similar content, since the target audiences’ needs are the same; only their geographical location varies.
How do you prevent search engines from marking the website content as duplicate?
Solution 1
Ensure that each domain’s URL structure sends a clear signal to the search engine crawlers that the content is meant for audiences in different geographical locations. For example, with http://www.forexample.sg it is clear that the target audience is based in Singapore. If the same company offers services in another country, such as Germany, the URL should change to http://www.forexample.de.
Solution 2
Insert an hreflang tag to point users to the right website. The tag ensures that bots guide users to the correct version based on their geographical location. For example, if you target English-speaking clients living in Spain, add this tag to the <head> section of your website.
<link rel="alternate" href="http://example.com" hreflang="en-es" />
The hreflang tag ensures that crawlers don’t treat the translated versions of your content as duplicates of one another, so the page will not be marked as duplicate content.
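One detail worth noting: hreflang annotations should be reciprocal, meaning each version of a page lists itself and all its alternates. Here’s a minimal sketch for a hypothetical site with English and Spanish versions:

<!-- Placed in the <head> of BOTH versions, so each page declares itself and its alternate -->
<link rel="alternate" href="https://example.com/" hreflang="en" />
<link rel="alternate" href="https://example.com/es/" hreflang="es" />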
3: Syndicated Content
Content syndication is one of the best ways of increasing referral traffic to your website. The strategy is also useful for getting high-value backlinks from websites that rank at the top of search engine results pages and are considered authorities by Google.
If you decide to apply this tactic, make sure you inform the crawlers which version of the article is the original, so that your copy isn’t the one categorised as duplicate content. If you don’t, your website may be dropped from its current rank on the search engine results page and replaced by the higher-ranking partner website.
Solution
Before sending out an article to the other website for publishing, have them add the rel=canonical tag, pointing to your original, in the <head> element of their copy. This tag should be added to every URL that carries your content.
4: 100% Copied Content
The internet is filled with malicious websites that are out to mislead genuine customers. While there is no telling when you will become a victim of such sites, you should carry out random plagiarism checks to identify them before disaster strikes. These sites not only mislead potential clients but also put your site’s authority at risk.
Solution
Contact the websites and request that they pull down the content. If the webmasters don’t oblige, go ahead and report them to Google for copyright infringement. Here is an article that comprehensively discusses how to report such websites to Google under the Digital Millennium Copyright Act (DMCA).
5: Create Printer-Friendly Website Pages
As a Singapore online business owner, your primary goal should be to offer the best user experience to everyone who visits your website. A high bounce rate is one of the signals Google uses to determine the relevance of content posted on a particular website.
Poor user experience is one of the leading causes of a high bounce rate. You can avoid it by making sure your website pages are printer-friendly. But even though such pages will enhance your business by spurring engagement and eventually increasing conversion rates, they can result in duplicate content issues.
Printer-friendly URLs result in two or more distinct versions of one web page. If the content on both versions is indexable, the search engine crawlers will exhaust your crawl budget on both. The result is a drop in rankings for both pages, or for whichever one is considered inferior.
Solution
The canonical tag comes in handy to ensure that the printer-friendly pages and mobile versions of the same website are not treated as duplicates. With the canonical tag in place, the crawlers know which is the main web page, and all ranking signals are consolidated on it.
One simple way of setting up a rel=canonical URL is to insert a small chunk of code in the <head> section of the duplicate page, pointing at the page you want to serve as the canonical. Replace the URL below with your original article’s URL.
<link rel="canonical" href="https://originalcontenturl.com/">
6: Subdomain Issues
Ideally, shifting your website from HTTP to HTTPS should enhance its ranking in the SERPs. This is because Google considers HTTPS a positive ranking factor, and websites that use it rank better than their counterparts.
Unknown to most webmasters, the change can set off the duplicate content alarm, as some crawlers will see the two versions of your website as different sites.
Photo Credit: Alexa.com
The same challenge arises when one website has two similar URLs, e.g., http://www.prefix.com and https://prefix.com. The search engine bots will split the link equity between the two when crawling, even though the pages are identical and belong to the same brand.
Solution
The recommended solution to this kind of issue is logging in to Google Search Console and specifying the dominant or preferred domain URL. This simple step helps the bots know which URL to focus on when crawling the website. Do this by accessing Site Settings > Preferred Domain.
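Many webmasters also consolidate the variants at the server level. Here’s a rough sketch for Apache, assuming mod_rewrite is enabled and that https://www.example.com stands in for your preferred version (swap in your own domain):

# .htaccess: send all HTTP and non-www requests to the preferred https://www version
RewriteEngine On
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]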
7: “Boilerplate” Content
What is boilerplate content? This is one of the most common questions we get from our readers. Simply put, it is content that is republished across different pages or domains, usually with no malicious intent. Webmasters may decide to use the same content on multiple domains to save time.
Another example is when content on a Singapore eCommerce website meant to sell products is used by retailers who are in partnership with the website for affiliate marketing.
Solution
The retailers should develop unique product descriptions instead of copy-pasting the content posted on the main eCommerce store onto their affiliate marketing websites. Sure, this may take hours or even days, depending on the number of products and their specifications, but it’s worth it.
More importantly, if your blog or other sections of your website contain boilerplate content, you should consider changing it and adding more content. You don’t have to change your schedule to write the content; you can hire a copywriter to do it for you.
When to Worry About Duplicate Content
If the duplicate content on your site or blog isn’t malicious, then you have absolutely nothing to worry about.
Let’s hear it straight from the horse’s mouth, from the mother of all search engines herself – Google.
Here’s what Google has to say:
“Duplicate content on a website is only grounds for action if its intent is to manipulate and deceive search engine results. When Google runs across duplicate content, it will rank the most authoritative version of the content.”
Google knows how to handle duplicate content. The search engine does a commendable job in sorting out duplicate content.
Still, it’s always best to manually get involved instead of waiting for search engines to sort out the issue.
Why Is Duplicate Content Such a Bad Thing?
Duplicate content won’t get your site penalised, save for a few extreme cases, as we said. But that’s not to say it won’t hamper your SEO effort.
Again, as we said, duplicate content confuses search engines. How? Because search engines can’t decide which page is most relevant to what’s queried.
Search engines are programmed never to display the same piece of content twice. The user wants to be served with options, and it’s the job of search engines to make sure that they do not see the same search result twice.
Duplicate content dilutes your authority, especially when different websites start linking to different versions of your website content.
Causes of Duplicate Content
Not all cases of duplicate content are deliberate. Most of them occur by accident.
One example of deliberately created duplicate content is a print version of a webpage. The print version carries the same content at a separate URL, and when it gets indexed, it creates a duplicate of the original page.
That’s one example of how people deliberately create duplicate content. But there are a few other situations where it’s created unintentionally.
Here are five different causes of duplicate content:
Session IDs
A session ID is a string of randomly generated numbers that web servers assign to website visitors to track their activity on the site. They’re commonly used in shopping carts.
Here’s how one looks:
https://www.abcdef.com/?sessionid=4644432367
The problem with session IDs in URLs is that every session creates a new URL for the same page, resulting in hundreds or thousands of duplicates.
How to Resolve Them?
The best way to resolve this problem is to store session IDs in cookies instead of in URLs. Just take the time to read up on the EU laws governing cookies first.
Sorting Options
Sorting options aren’t limited to product catalogues, where buyers can sort products by price, type, date, and so on. The sorting function is common on almost any kind of website, including a simple blog.
Usually, it looks something like:
https://www.abcedef.com/category?sort=asc
The URL with the sorting option serves exactly the same page as the original. It carries the same content; it’s just sorted differently.
Affiliate Codes
Affiliate codes are all over the place. Web owners use them to identify individual referrers and reward them every time they bring in a new customer or visitor.
An affiliate code looks something like this:
https://www.abcdef.com/product?ref=name
Once again, the codes create a replica of the original page, which ends up affecting your SEO efforts.
Domains
Domains are another culprit in this.
When not handled with care, they can prove problematic.
Here are two versions of the same domain to look at:
https://www.abcdef.com
https://abcdef.com
Search engines have advanced a great deal. But for some reason, they still find this confusing.
Both URLs lead to the same page (homepage), but since they both look different, they’re sometimes interpreted as two different pages.
Long Comments
When many users comment on your posts, some of the comments may spill over onto subsequent pages. The resulting pages show the same post content, just with a different page of comments.
Geotargeting Users with Different Content
Say your website targets users from the US, Australia, and the UK. The content you create for each region will be largely the same, just localised for the different groups of users.
Image Pages
It’s not uncommon to come across a website where each image has its own webpage.
Of course, you can still view the images within your content pages, but clicking on them opens an enlarged version on a separate page, thus creating duplicate content.
Taxonomy
This is common with content management systems.
When you assign a post to more than one category, the CMS creates duplicate content: the same post appears under multiple category URLs.
Unless you choose a primary category, those category pages will be marked as duplicates.
Copied Content and Duplicate Content
Duplicate content can also occur when you copy content from another page and publish it elsewhere. This is the textbook case, but it doesn’t have to be that direct.
Here are a few ways people copy content without knowing they’re creating duplicate content:
When Creating Dedicated Landing Pages for Paid Searches: When creating a dedicated landing page for paid search, most of the time you’ll be creating a page that’s almost identical to the original. Most people only tweak words to accommodate specific keywords, without doing much to make the content unique.
Other Websites Pull Content Off Your Website or Blog: Unfortunately, the moment you hit the publish button, other websites and blogs may pull the information you share and post it on their own blogs or websites. The problem comes when a website that does this has a higher domain authority than yours: it ranks better than you, giving search engines even more reason to consider its version of the post over yours.
Using Content from Another Website: Any attempt to copy someone else’s content will not only hamper your ranking but also taint the relationship you have with other bloggers and site owners.
How to Proactively Address the Duplicate Content Issue?
Luckily for you, there are a lot of ways to optimise duplicate content. Here’s what you can do:
- Delete them: Weigh up whether the duplicate content should be on your site in the first place. Of what use is the content? If none, then don’t hesitate to delete it.
- Update duplicate content: If the content has to be there, you can always rewrite it and replace it with something original.
- Redirect the content: Instead of keeping the same content on two different pages, why not redirect one page to the other? That way, only one URL ends up hosting the content, while the rest redirect to it.
- Use the canonical link element to specify which version of the content is authoritative.
- Use 301 Redirects: After restructuring your website, use 301 redirects to send users, search engine bots, and other spiders to the new URLs.
In Apache, this can quickly be done with a .htaccess file. In IIS, you can easily edit this via the administrative console.
- Be Consistent with Your Internal Linking: Decide on the type of linking you intend to use, and stick with it all through.
For instance, pick one URL format and stick with it; don’t mix links like these, which all point to the same page:
- https://www.example.com/page
- https://www.example.com/page/
- https://www.example.com/page/index.html
- Use Country-Specific TLDs When Serving Geotargeted Content: When serving geotargeted content, use country-specific top-level domains. For example:
- https://www.abcdef.de for Germany
- https://www.abcdef.com.au for Australia
- https://www.abcdef.us for the US
and so on.
- Minimise Boilerplate Repetition: Instead of writing a very long boilerplate text and copying it onto every page, write a short summary and link it to a dedicated page with the full details.
You also want to use Google’s URL Parameters tool to specify how you want the search engine to treat your URL parameters.
How Can You Avoid Duplicate Content?
https://youtu.be/I_Ts_k8U-ow
Use a Robots.txt Block
A robots.txt file can help you block pages with duplicate content from being crawled. Google, however, does not recommend this approach, because if the engine is unable to crawl such pages, it cannot tell whether the URLs point to the same content, and will therefore have no option but to treat them as unique, separate pages.
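If you take this route despite Google’s reservations, the block itself is simple. Here’s a minimal sketch that keeps all crawlers out of a hypothetical /print/ directory of printer-friendly duplicates:

# robots.txt at the site root
User-agent: *
Disallow: /print/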
Use 301 Redirects
If you are planning to get rid of duplicate content on your site, 301 redirects are an ideal approach. If some of the duplicate pages have received links, redirecting them to the correct URL ensures you still profit from those links. The move also helps the search bots know where to find the proper content.
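In Apache, a single-page 301 can be a one-liner in your .htaccess file (both URLs here are hypothetical placeholders):

# .htaccess: permanently redirect the duplicate URL to the preferred one
Redirect 301 /old-duplicate-page/ https://www.example.com/preferred-page/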
Use rel=”canonical” Link Element
The rel="canonical" link element helps search bots know which version of the content is the original. All you need to do is add the link to the <head> of the duplicate article or page. For example:
<link rel="canonical" href="https://mytruecontent.com/">
Should you worry?
From MediaOne’s experience – Google does not penalise duplicate content. Here are some reasons why:
- if you are a subsidiary or reseller, your principal may not allow you to vary the specifications and description of your product or service – penalising you would be grossly unfair
- if you have an e-commerce website, it’s virtually impossible to vary your content because you have thousands of products from hundreds of suppliers – Google understands that e-commerce sites have this innate business challenge
- if you have multiple branches in different countries, you really wouldn’t want to describe things too differently in each country if you can help it, as doing so can cause significant inconsistencies in corporate branding and product/service descriptions
So Google does not penalise you, but it will not award points either.
Then How Do I Get Around This Issue?
Therefore, in order to rank, you will need to vary your content if you are allowed to, or add more original content to dilute the duplication. Here is an illustration of how this works:
Try to remember the real core reason why Google would want to put you on the 1st page: it thinks that what you are saying is USEFUL + ORIGINAL. “Useful” because Google wants to be the Oracle of Everything, so you will learn to depend on it from the moment you wake to the moment you close your eyes. “Original” because if you are simply copying or rehashing what others are saying, why should you be entitled to the 1st page? The 1st page is where Google wants to put the BEST answers. To put it another way: if everybody is called “Simon,” why should “Simon #9” be promoted above all the other Simons?
5 Duplicate Content Checkers
To avoid possible SEO problems triggered by duplicate content, you’re advised to take a few precautionary measures, both across the web and within your own website.
There are a few duplicate checkers to help you out with this:
- Copyscape: A premium, paid plagiarism checker that lets you identify which parts of your website content are similar to other articles on the web. It’s efficient and fast, and can quickly point out any duplicate content, even providing an exact percentage of how much of your content is already floating around the web.
- Grammarly: A free plagiarism checker designed to detect punctuation mistakes, poor word choice, spelling errors, and bad grammar. The premium account provides critical suggestions on how best to improve your writing style. Beyond that, it lets you check for plagiarism against billions of webpages on the internet.
- Duplichecker: Duplichecker checks your article for originality. The free account allows you to run up to 50 searches per day once registered.
- Siteliner: Siteliner allows you to run monthly check-ups for duplicate or plagiarised content on your site. Beyond that, the tool can also help you identify broken links and the parts of your website that aren’t performing well in the SERPs.
- Small SEO Tools: SmallSEOTools is first and foremost a plagiarism checker. It lets you identify which parts of your content are not original or have been duplicated from content that’s already available on the internet.
Vital Statistics on Duplicate Content
29% of webpages on the internet have duplicate content (Raven Tools).
80% of websites aren’t using microdata.
One of the biggest SEO pitfalls uncovered is the issue of duplicate content.
Of the pages with duplicate content, 22% of title tags and 17% of meta descriptions have duplicate content.
Schema microdata is all the rage. But only 20% of websites have successfully implemented it.
Only 36% of the results in the SERPs display schema mark-up.
83.13% of websites use Google Analytics to track their online performance.
An average site has about 4,500 SEO-related issues – 250 of which are link-related, and 3,672 of which have to do with the images used.
The Bottom Line
Googlebot crawls sites multiple times per day; if it finds a copied version of an article on another website a week later, it can tell where the original was published. But does it get angry and impose a penalty on the site? No. That’s basically everything you need to know about duplicate content.
Conclusion
Even though duplicate content can affect your site’s ranking on search engine results pages, it is not as scary as most people perceive. Unless you posted the content to manipulate SERP results, search engines will not typically impose a penalty. That does not mean there are no adverse consequences to having such content on your site. It is advisable to crawl your site and resolve such issues to be on the safe side.
Here at MediaOne, we will assess your site, reduce duplicate content where necessary, and create strategies to add new original content to help your site score. Give us a call at 6789 9852 today!