If you own or manage a website as a marketer, you may be familiar with the term ‘duplicate content.’ It refers to identical or very similar content that appears at more than one URL, whether on your own site or across multiple websites. Did you know that it can significantly affect your search engine optimization (SEO) efforts?
Duplicate content is a serious issue because search engines like Google prioritize original and relevant content. When search engines discover identical or substantially similar content on multiple websites, they have difficulty determining which version is the authoritative source. This can lead to lower search engine rankings for websites with duplicate content.
As a leading SEO agency with a proven track record of dominating rankings on the first page of Google and driving client revenue growth over the past 8 years, we want to highlight four instances where we identified and resolved duplicate content issues using our advanced SEO strategies and techniques.
Read on to learn about these four duplicate content issues and the steps you can take to resolve each one.
Let’s get started!
Jump to:
- How to Spot / Know if you have Duplicate Content Issues?
- The 4 Different Types of Duplicate Content
- Importance of Spotting and Knowing Duplicate Content Issues
How to Spot / Know if you have Duplicate Content Issues?
Imagine having thousands of pages on your website; manually inspecting each for duplicate content is time-consuming.
Discovering whether your website has duplicate content issues can be done in several ways. However, the most efficient way is to use tools to help you easily identify and rectify any duplicate content issues on your website.
Let’s start with a free tool that everyone can utilize!
Google Search Console
Image source: Screenshot from Google Search Console
One of the best tools for identifying duplicate content on your website is Google Search Console. This free tool from Google allows you to track your website’s performance in search results and identify issues affecting your rankings.
To use Google Search Console to identify duplicate content, log in to your account and navigate to the “Pages” report.
Look for pages with the label “Duplicate without user-selected canonical.” This indicates that Google has found identical or very similar content on your website and that you have not declared which page is the original/canonical version, so Google has chosen one on its own.
You can then click on the URL to see which other pages on your site have similar content.
SEMRush
Image source: Screenshot from https://www.semrush.com/
To detect duplicate content on your website, you can use the Site Audit tool in SEMRush, but unlike Google Search Console, this tool is only accessible with a paid SEMRush subscription.
Here’s how you can do it:
- Log in to your SEMRush account and navigate to the Site Audit tool.
- In the project settings, enter your website’s URL and additional information required, then start the audit.
- Once the audit is complete, click on the “Issues” tab and look for the “Duplicate Content” report.
- Here, you can view a list of all the pages on your website that contain duplicate content, along with the percentage of duplicate content on each page.
- Click on any page to see the exact content that is duplicated and where it appears on your website.
AHREFs
Image source: Screenshot from https://ahrefs.com/
AHREFs is a tool similar to SEMRush, but it presents its reports differently, which can provide additional insights for resolving duplicate content problems.
To view the duplicate content issues you have in AHREFs, here are some simple steps:
- Log in to your AHREFs account and hover over “Site Audit” in the navigation section above.
- Next, type your website’s URL and click “New Project.”
- Fill in and verify the additional information needed and wait for AHREFs to complete its site audit. How long the audit takes will depend on the settings you defined earlier.
- Once done, you can click the “Overview” tab to see all your website’s technical issues, and look for “Duplicate pages.”
The 4 Different Types of Duplicate Content
Duplicate Content as there are Multiple Pages/Posts Intentionally
Whether it happens intentionally or unintentionally, this type of duplicate content means the same content appears on two or more pages of your website. It is a common problem for website owners and marketers.
Here is an example of two pages that intentionally share the same content:
As you can see, Page A and Page B have different links but exactly the same content. Having identical content on multiple pages can confuse search engines and make it difficult for them to determine which page is the most relevant to show in search results.
This is especially true if duplicate content is intentionally created to manipulate search engine rankings.
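To make the Page A / Page B situation concrete, here is a minimal Python sketch of how exact duplicates could be grouped once you have the body text of each page. It is not how SEMRush or AHREFs work internally; the function names, the sample URLs, and the choice to ignore case and whitespace are all assumptions made for illustration.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Return groups of URLs whose normalized body text is identical."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical pages: URL -> extracted body text.
pages = {
    "https://domain.com/page-a": "Best running shoes for 2024.",
    "https://domain.com/page-b": "Best  running shoes\nfor 2024.",
    "https://domain.com/page-c": "Our guide to trail running.",
}
duplicate_groups = find_duplicates(pages)
print(duplicate_groups)
```

A hash only catches pages that match exactly after normalization. Pages that are merely "substantially similar" would need a fuzzier comparison, which is where the dedicated audit tools earn their keep.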
Option 1: Avoid duplicate content by making each page unique.
This is best for a website with only a few duplicate content issues. After using tools like SEMRush and AHREFs, you can easily find those intentionally duplicated pages and create unique content for Page A and Page B (see the example above).
What if you have hundreds/thousands of duplicate pages on your website?
Option 2: Use/Implement rel=canonical
rel=canonical is an HTML link attribute that tells search engines which URL is the preferred or original source of a piece of content. By specifying this “canonical URL” on each duplicate page, you help search engines consolidate the duplicates under the preferred version.
Here is a step-by-step guide on how to set a canonical URL:
Step 1: Identify the Duplicate Content Pages
- The first step is to identify the pages on your website with duplicate content. This can be done by conducting a site audit or using a tool like SEMRush or AHREFs to crawl your website.
Step 2: Determine the Preferred URL
- Once you’ve identified the duplicate pages, you must determine the preferred URL for that content. This is the URL you want search engines to recognize as the primary source of the content.
Step 3: Implement the Canonical Tag (rel=canonical)
- After determining the preferred URL, you need to implement the canonical tag on the duplicate pages. This tag tells search engines that the content on the duplicate page is a copy of the content on the preferred URL.
- If you’re doing this manually, look for the head section of the HTML source code. Within the head section, add the rel=canonical tag in the following format:
- <link rel="canonical" href="https://www.yourpreferredurl.com">
If you use a plugin such as Yoast or RankMath, you can typically set the canonical URL in the plugin’s settings for each page instead of editing the HTML by hand, so the process will differ from the manual steps above.
Step 4: Test and Verify
- Once you’ve implemented the canonical tag, it’s important to test and verify that it’s working correctly. You can use tools like Google Search Console to monitor your website’s performance and ensure that search engines recognize the canonical URLs.
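If you would like a quick check of your own alongside Google Search Console, the sketch below reads a page’s HTML and reports which canonical URL it declares. It uses only Python’s standard library; the class name, function name, and sample HTML are assumptions for illustration, and in practice you would feed it the HTML you downloaded from your own duplicate pages.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def canonical_urls(html: str) -> list[str]:
    """Return every canonical URL declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical duplicate page that points at the preferred URL.
sample_html = """
<html><head>
<title>Page B</title>
<link rel="canonical" href="https://www.yourpreferredurl.com">
</head><body>Duplicate of the preferred page.</body></html>
"""
found = canonical_urls(sample_html)
print(found)
```

An empty list means the tag is missing, and more than one entry means the page declares conflicting canonicals. Both are worth fixing before you rely on the tag.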
Duplicate content as (domain.com vs. www.domain.com)
Another common issue is when your website can be accessed via the “www” and non-www versions. For example, “www.example.com” and “example.com” are considered two different websites by search engines, and any content on both sites will be seen as duplicate content.
Page A (https://www.) Page B (https://)
Essentially, search engines treat “www.domain.com” and “domain.com” as separate URLs, even though they may display the same content.
How to Fix this Duplicate Content Issue: Set Up Domain Forwarding
Domain forwarding ensures that all traffic is directed to the preferred version of the domain (either “www” or “non-www”), consolidating link equity and avoiding the negative impact of duplicate content. It’s a simple solution that can be implemented using user-friendly plugins available for installation, especially for WordPress websites.
Two types of redirects can be used for this purpose: 301 redirects and 302 redirects. A 301 redirect is a permanent redirect that tells search engines the requested URL has moved for good to the preferred version of the domain. In comparison, a 302 redirect is a temporary redirect that tells search engines the move may be reversed. Because choosing between “www” and “non-www” is meant to be permanent, a 301 redirect is the appropriate choice here.
Redirecting in this way not only helps avoid the negative impact of duplicate content, but it can also help maintain a seamless experience for website visitors at the same time.
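The logic behind domain forwarding is simple enough to sketch in a few lines of Python. This is an illustration of the decision a redirect rule or plugin makes, not the code any particular plugin uses; the function names and the `prefer_www` parameter are assumptions.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url: str, prefer_www: bool = True) -> str:
    """Rewrite the host to the preferred www or non-www form."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    bare = host[4:] if host.startswith("www.") else host
    target = f"www.{bare}" if prefer_www else bare
    if parts.port:
        target = f"{target}:{parts.port}"  # keep an explicit port if one was given
    return urlunsplit((parts.scheme, target, parts.path, parts.query, parts.fragment))


def redirect_for(url: str, prefer_www: bool = True) -> Optional[tuple]:
    """Return (301, location) if the URL needs forwarding, else None."""
    target = preferred_url(url, prefer_www)
    return None if target == url else (301, target)


print(redirect_for("https://domain.com/page?x=1"))
print(redirect_for("https://www.domain.com/page?x=1"))
```

The path and query string are carried over unchanged, so every page on the non-preferred host forwards to its exact counterpart rather than to the homepage.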
How to Fix this Duplicate Content Issue: Implementing CNAME
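A CNAME (Canonical Name) record helps with domain variations by letting you alias one domain name to another, ensuring that both www and non-www versions of your domain point to the same website. Keep in mind that a CNAME only controls where a hostname resolves. On its own it does not redirect visitors or tell search engines which version you prefer, so use it together with the domain forwarding described above.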
To create a CNAME record, follow these steps:
- Identify the preferred version of your domain (either with www or without).
- Access your DNS settings through your domain registrar or hosting provider.
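- Add a new CNAME record or update the existing one using the following format:
- Host: [non-preferred version]
- Type: CNAME
- Points to: [preferred version]
- TTL: [time-to-live, usually in seconds or minutes]
- For example, to alias the non-www version to the www version:
- Host: domain.com
- Type: CNAME
- Points to: www.domain.com
- TTL: 3600
- Note that many DNS providers do not allow a CNAME record on the root domain (domain.com) itself. If yours does not, alias www to the root domain instead, or use the provider’s ALIAS or ANAME record type if one is offered.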
- Save your changes and allow some time for the DNS to propagate.
- Test the CNAME record using online tools such as DNSChecker, or command line tools like ‘dig’ or ‘nslookup’ to ensure the record resolves correctly.
- Monitor your website traffic to confirm that the CNAME record is working as expected.
If you require assistance with this process, Google has provided a helpful article on setting up CNAME records for your website.
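As a complement to ‘dig’ or ‘nslookup’, the following Python sketch checks whether two hostnames resolve to a shared IP address. It does not prove that a CNAME record exists, only that both names end up at the same place. The function names are assumptions, and the resolver can be swapped out so the comparison logic can be tried without network access.

```python
import socket


def resolve_ips(hostname: str) -> set[str]:
    """Look up the IP addresses a hostname resolves to (needs network access)."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def same_destination(host_a: str, host_b: str, resolver=resolve_ips) -> bool:
    """True when both hostnames resolve to at least one common IP address."""
    return bool(resolver(host_a) & resolver(host_b))


# Offline demonstration with a hypothetical lookup table instead of real DNS.
fake_dns = {
    "domain.com": {"203.0.113.10"},
    "www.domain.com": {"203.0.113.10"},
    "other-site.com": {"198.51.100.7"},
}
print(same_destination("domain.com", "www.domain.com", resolver=fake_dns.__getitem__))
```

To run it against live DNS, call `same_destination("domain.com", "www.domain.com")` with your own hostnames and leave the `resolver` argument out.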
Duplicate content as (URL with or without (/))
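This duplicate content issue happens when the same page on a domain can be reached both with and without a trailing slash, for example: “https://domain.com/page” and “https://domain.com/page/”. (Directly after the domain name the slash makes no difference: “https://domain.com” and “https://domain.com/” are the same URL.)
The issue arises because search engines see “https://domain.com/page” and “https://domain.com/page/” as two different pages, which may be indexed separately.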
This can significantly affect your site when you have several external links pointing to both pages (with and without the trailing slash). Search engines then have to work out which URL they need to display in the search results pages.
Oftentimes, these two URLs end up competing against each other, effectively splitting the power of each page and hurting your site’s ability to rank organically in competitive niches.
Here is an article from Google itself about this issue.
How to fix this duplicate content issue: (WordPress websites)
Take this step if your website runs on WordPress.
Go to Settings > Permalinks. You can change whether you use a trailing slash if you use a custom structure.
- /%postname%/ would add the trailing slash to URLs
- /%postname% would remove the trailing slash from URLs
How to fix this duplicate content issue: .htaccess
If you prefer to remove the slash:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [L,R=301]
SIDENOTE. !-d looks for a directory; if one exists, the rule won’t remove the slash. If you don’t include this condition, you may break these main directory pages.
If you prefer to add a slash:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*[^/])$ /$1/ [L,R=301]
SIDENOTE. !-f looks for a file; if it exists, it doesn’t add the trailing slash. This keeps images, PDFs, JS, CSS, etc., from breaking.
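If you want to see what the two rules above are meant to do before touching your server, here is a rough Python sketch of the same normalization. One important difference: the .htaccess conditions check whether a real file or directory exists on disk, while this sketch simply assumes that a dot in the last path segment means a file. The function name and the `add_slash` parameter are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_slash(url: str, add_slash: bool = True) -> str:
    """Add or remove the trailing slash on page URLs, leaving files untouched."""
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    # Assumption: a dot in the last segment means a file such as .pdf, .css or .jpg.
    looks_like_file = "." in last_segment
    if path != "/" and not looks_like_file:
        if add_slash and not path.endswith("/"):
            path += "/"
        elif not add_slash and path.endswith("/"):
            path = path.rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(normalize_slash("https://domain.com/blog"))
print(normalize_slash("https://domain.com/blog/", add_slash=False))
print(normalize_slash("https://domain.com/files/guide.pdf"))
```

Running a list of your indexed URLs through a function like this is a quick way to see which ones would be redirected once the rule goes live.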
Duplicate content as pagination (pages 1, 2, 3)
This duplicate content issue often occurs on e-commerce websites. If you have a large amount of content on your website, you may need to split it across multiple pages (for example, a product category spread over pages 1, 2, and 3, or one product listed in several colors), resulting in multiple pages that display almost the same content.
How to fix this duplicate content issue: Implement rel=canonical or Load More JS
When addressing duplicate content issues due to pagination, implementing rel=canonical can be an effective strategy for consolidating link equity and avoiding duplicate content.
“Load-more JS” is a JavaScript technique that dynamically loads additional content onto a web page without requiring the user to navigate to a new page or reload the current page.
This technique is commonly used as an alternative to traditional pagination, in which a long list of content is broken into separate pages that the user must navigate by clicking on page numbers or “next” and “previous” links.
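The button itself is JavaScript, but the batching behind it can be sketched in any language. Below is a minimal Python illustration of what a server might return each time “Load more” is clicked. The function name, the batch size of 12, and the product list are all assumptions, not a description of any specific plugin.

```python
def load_more(items: list[str], offset: int = 0, batch_size: int = 12) -> dict:
    """Return the next batch and the offset the page should request next."""
    batch = items[offset:offset + batch_size]
    next_offset = offset + len(batch)
    return {
        "items": batch,
        # None tells the page to hide the "Load more" button.
        "next_offset": next_offset if next_offset < len(items) else None,
    }


# Hypothetical catalogue of 30 products.
products = [f"product-{n}" for n in range(1, 31)]

first = load_more(products)
second = load_more(products, offset=first["next_offset"])
third = load_more(products, offset=second["next_offset"])
print(len(first["items"]), len(second["items"]), len(third["items"]))
```

Because every batch is appended to the same URL, there are no separate page 2 and page 3 URLs showing near-identical content.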
Importance of Spotting and Knowing Duplicate Content Issues
Duplicate content poses a significant challenge for website owners/marketers, as it can result in adverse outcomes such as lower search engine rankings and decreased user engagement. However, by identifying the type of duplicate content you are dealing with and implementing the proper solutions, you can enhance your website’s user experience and attract more organic traffic.
Therefore, it is crucial to take action promptly and ensure that your website stands out from the competition. Don’t let duplicate content hold you back from achieving your online goals.
Having issues with duplicate content or technical SEO? Let’s discuss how we can help you.