Canonical Tags Implementation Guide for Developers

What is a Canonical Tag?

A canonical tag is a way of telling search engines that a specific URL represents the master copy of a page. Use it where duplicate or near-identical content exists on multiple URLs and cannot be resolved by another method, such as rewriting the content or implementing a redirect. Essentially, it is a signal that tells search engines which version of a URL you want to appear in search results.

Content that search engines class as duplicate or extremely similar can have negative effects on rankings and user experience. Duplicate URLs split ranking signals between them, so each version may rank lower than a single consolidated page would, and this is unlikely to improve until the duplication is addressed. If you don’t tell Google which version of your content should be indexed, Google will make that choice for you and crawl the duplicate versions less often. Sometimes this leads to the wrong URL being indexed. A canonical tag can either be self-referencing (pointing to the page’s own URL) or reference another page’s URL to consolidate signals.

Remember – even if you specify a canonical URL, Google may choose a different page as canonical, for various reasons. A canonical tag is a hint, not a directive.

Canonical Tags vs 301 Redirect vs Noindex

You should 301 redirect rather than canonicalise the following:

  • HTTP to HTTPS
  • Non-www to www (or vice versa)
  • Non-trailing slash to trailing slash (or vice versa)
  • A page deleted or moved from URL A to URL B
  • A move to a new domain name
  • A change to your URL structure
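
The protocol, host and trailing-slash cases listed above can be sketched as a single normalisation function that returns the URL a request should be 301-redirected to. This is a minimal sketch, not a definitive implementation: the host names and the chosen policy (HTTPS, www, no trailing slash) are assumptions for illustration, and redirect_target is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration: the preferred form is HTTPS, the www host,
# and no trailing slash. Swap these for your own site's policy.
BARE_HOST = "example.com"
PREFERRED_HOST = "www.example.com"


def redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if url is already preferred."""
    parts = urlsplit(url)

    # Non-www to www.
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = PREFERRED_HOST

    # Trailing slash to non-trailing slash (the site root stays as "/").
    path = parts.path.rstrip("/") or "/"

    # HTTP to HTTPS: the scheme is always forced to https.
    target = urlunsplit(("https", host, path, parts.query, parts.fragment))
    return None if target == url else target
```

A request for http://example.com/dresses/ would be redirected to https://www.example.com/dresses, while a URL already in the preferred form returns None and is served as normal.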

Google doesn’t recommend blocking duplicate pages via robots.txt or noindexing them. Instead, it is better to let Google crawl the pages and mark them with a canonical tag.

Valid Reasons for Duplicate Content

There are a few valid reasons why you might have different URLs pointing to the same page:

To support different device types where the site serves different content:

https://example.com/news/koala-rampage
https://m.example.com/news/koala-rampage
https://amp.example.com/news/koala-rampage

To enable dynamic URLs for search parameters, tracking parameters or session IDs:

https://www.example.com/products?category=dresses&color=green
https://example.com/dresses/cocktail?gclid=ABCD
https://www.example.com/dresses/green/greendress.html

If your site saves multiple URLs for the same post or content, e.g. Shopify:

https://www.website.com/products/product-a
https://www.website.com/featured-collection/products/product-a
https://www.website.com/sales-collection/products/product-a

All of these can cause duplicate content if they are not canonicalised to the main URL.
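
For the tracking-parameter case, one possible approach is to derive the canonical URL by dropping parameters that do not change the page content. The sketch below is illustrative only: the parameter names in TRACKING_PARAMS and TRACKING_PREFIXES are assumptions, strip_tracking is a hypothetical helper, and whether filter parameters such as category or color should also be dropped is a decision for each site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that never change page content.
TRACKING_PARAMS = {"gclid", "fbclid", "sessionid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url: str) -> str:
    """Return url without tracking parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # An empty query string produces a URL with no trailing "?".
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

The result can then be used as the href of the canonical tag on every parameterised variant of the page.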

Canonical Tag Example

So for example, say you have these 3 duplicate pages:

https://example.com/dresses/green-dresses
https://example.com/dresses?colour=green
https://example.com/dresses/women/green

You would (via Analytics and other methods) determine which is the “strongest” of these 3 pages – this would be your canonical page (not the one with parameters!)

You need to mark all 3 pages with the same rel="canonical" link tag, like this:

<link rel="canonical" href="https://example.com/dresses/green-dresses" />

This points Google to the strong version of the content and tells it that it should only index that one.

So in this case, the strongest page has a self-canonical tag and the other two URLs are canonicalised to the main one:

https://example.com/dresses/green-dresses has a self-canonical of https://example.com/dresses/green-dresses
https://example.com/dresses?colour=green is canonicalised to https://example.com/dresses/green-dresses
https://example.com/dresses/women/green is canonicalised to https://example.com/dresses/green-dresses

This means that https://example.com/dresses/green-dresses should be indexed by Google.
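
In a template, the mapping above could be expressed as a lookup that falls back to a self-referencing canonical. This is a sketch under assumptions: CANONICAL_FOR and canonical_link_tag are hypothetical names, and a real site would more likely store the mapping in its CMS or database.

```python
from html import escape

# The three example URLs from this section, each mapped to the chosen canonical.
CANONICAL_FOR = {
    "https://example.com/dresses/green-dresses": "https://example.com/dresses/green-dresses",
    "https://example.com/dresses?colour=green": "https://example.com/dresses/green-dresses",
    "https://example.com/dresses/women/green": "https://example.com/dresses/green-dresses",
}


def canonical_link_tag(page_url: str) -> str:
    """Build the canonical link tag for page_url."""
    # Pages without an explicit mapping get a self-referencing canonical.
    canonical = CANONICAL_FOR.get(page_url, page_url)
    # Escape the URL so characters such as & and " are safe in an attribute.
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}" />'
```

All three example URLs produce the same tag, which is the behaviour described above.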

How to Implement Canonical Tags

There are 2 main ways to implement canonical tags, the most common of which is to use a link tag in the <head> part of the page.

rel=canonical tag

Add a <link> tag in the code for all duplicate pages, pointing to the canonical page.

Pros:

– Can map an infinite number of duplicate pages.

Cons:

– Can be complex to maintain the mapping on larger sites, or sites where the URLs change often.
– Only works for HTML pages, not for files such as PDF. In such cases, you can use the rel=canonical HTTP header.

The tag looks like this:

<link rel="canonical" href="https://example.com/dresses/green-dresses" />

Canonical tags only go in the <head> part of the page; they cannot go in the <body> or anywhere else.
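
The placement rule can be checked automatically. The sketch below uses Python's standard html.parser to list canonical tags found inside and outside the head; the CanonicalFinder class and find_canonicals function are hypothetical names, and the parser only tracks explicit head tags, so it is a rough check rather than a full HTML5 parse.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect canonical link hrefs, split by whether they sit inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.head_canonicals = []
        self.misplaced_canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attributes = dict(attrs)
            rels = (attributes.get("rel") or "").lower().split()
            href = attributes.get("href")
            if "canonical" in rels and href:
                bucket = self.head_canonicals if self.in_head else self.misplaced_canonicals
                bucket.append(href)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonicals(html: str):
    """Return (canonicals in head, canonicals found elsewhere)."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.head_canonicals, finder.misplaced_canonicals
```

A page in good shape should return exactly one URL in the first list and nothing in the second.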

Canonical HTTP header

If you have access to your server and can configure it, you can send a rel=canonical header in the HTTP response rather than an HTML tag. This is especially useful for non-HTML documents such as PDF files.

Date: Tue, 09 Aug 2022 17:18:12 GMT
Server: Apache/2.2.21 (Win64) PHP/5.3.8
X-Powered-By: PHP/5.3.8
Link: <https://www.mywebsite.co.uk/page.html>; rel="canonical"
Content-Length: 8
Keep-Alive: timeout=5 max=99
Connection: Keep-Alive
Content-Type: text/html

Pros:

– Can map an infinite number of duplicate pages.

Cons:

– Can be complex to maintain the mapping on larger sites, or sites where the URLs change often.
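
To confirm that a response carries the header, the Link value can be parsed for its canonical target. The function below is a simplified sketch: canonical_from_link_header is a hypothetical helper, and the regular expressions handle the common header shapes shown above rather than every form the Link header syntax allows.

```python
import re
from typing import Optional


def canonical_from_link_header(value: str) -> Optional[str]:
    """Return the first rel="canonical" target in a Link header value, or None."""
    # Each link is "<URL>; param=value"; several links may be comma-separated.
    for match in re.finditer(r"<([^>]*)>([^<]*)", value):
        url, params = match.group(1), match.group(2)
        rel = re.search(r'\brel\s*=\s*"?([^";,]*)"?', params, re.IGNORECASE)
        if rel and "canonical" in rel.group(1).lower().split():
            return url
    return None
```

Passing the Link value from the example response above returns https://www.mywebsite.co.uk/page.html.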

Using PHP

Calling the header() function before any HTML is output adds a Link rel="canonical" HTTP header to the response headers before they are sent.

header('Link: <https://www.mywebsite.com/page.html>; rel="canonical"');

Using .htaccess

The HTTP header can be modified relatively easily using .htaccess for all content types, such as PDF files (this requires Apache’s mod_headers module). This solution works well for sites that have a relatively small number of files that need the header.

<Files "file.pdf">
Header add Link "<https://www.mywebsite.com/page.html>; rel=\"canonical\""
</Files>

Simply replace file.pdf with the name of your PDF file and the page.html URL with the canonical URL for the PDF, like this:

<Files "file-to-canonicalize.pdf">
Header add Link "<http://www.website.com/canonical-page/>; rel=\"canonical\""
</Files>
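
When more than a handful of files need the header, the blocks could be generated rather than written by hand. The following sketch assumes a simple filename-to-URL mapping; htaccess_block is a hypothetical helper that only formats text in the same shape as the examples above and does not touch the server.

```python
def htaccess_block(filename: str, canonical_url: str) -> str:
    """Format one <Files> block that adds a canonical Link header."""
    # Double quotes would break the directive's quoting, so reject them.
    if '"' in filename or '"' in canonical_url:
        raise ValueError("filename and canonical_url must not contain double quotes")
    return (
        f'<Files "{filename}">\n'
        f'Header add Link "<{canonical_url}>; rel=\\"canonical\\""\n'
        f"</Files>\n"
    )


# Hypothetical mapping of PDF file names to their canonical pages.
PDF_CANONICALS = {
    "file.pdf": "https://www.mywebsite.com/page.html",
}

if __name__ == "__main__":
    for name, url in PDF_CANONICALS.items():
        print(htaccess_block(name, url))
```

The printed output can be pasted into .htaccess after checking each block by eye.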
