This is one of the most difficult topics to understand in on-page SEO.
Also, while we are on the subject, canonical is pronounced “kuh-non-i-kuhl.”
When thinking about rel=canonical, the best way to think about it is that the page a rel=canonical tag points to is the “master” or “original” version.
Many on-page SEO guides attempt to explain canonicalization, but few actually do the topic justice. To understand why you need to implement this tag, you first need to understand how URLs and permalinks work.
Content management systems and eCommerce frameworks such as WordPress and Shopify make life easy, but they can be a nightmare when the same content shows up at multiple URLs. For instance, several different URLs might all display the same products on an eCommerce store.
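To make the duplication problem concrete, here is a minimal Python sketch. The URLs are hypothetical (example.com is a placeholder), and the normalization policy (force https, strip "www.", drop the trailing slash and the whole query string) is an assumption for illustration only. On a real store some query parameters, such as pagination, do change the content and should not be dropped.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical URL variants that might all render the same product listing.
VARIANTS = [
    "https://example.com/shoes",
    "https://example.com/shoes/",
    "https://example.com/shoes?sort=price",
    "https://example.com/shoes?utm_source=newsletter",
    "http://www.example.com/shoes",
]


def canonical_url(url):
    """Collapse a URL variant to one canonical form (assumed policy)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Drop a trailing slash, but keep "/" for the site root.
    path = parts.path.rstrip("/") or "/"
    # Assumed policy: always https, no query string, no fragment.
    return urlunsplit(("https", host, path, "", ""))


unique = {canonical_url(u) for u in VARIANTS}
print(unique)
```

All five variants collapse to a single URL, which is the one you would want every variant to name as its canonical.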
Now imagine a conversation between your website and Googlebot:
Your website: My website sometimes creates multiple versions of the same page.
Googlebot: OK, but how am I supposed to tell which one is the original?
Your website: I’ll insert a “rel=canonical” tag in the head of each version that points to the original.
Googlebot: Sounds good to me.
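So how do we go about doing this? For a normal website, you implement the rel=canonical tag by adding a link element with rel="canonical" to the head section of the page, with its href set to the URL of the original version.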
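As a rough illustration, the tag itself is a single link element. The Python helper below is only a sketch: canonical_link_tag is a made-up name and the URL is a placeholder. It builds the markup you would paste into a page's head section.

```python
from html import escape


def canonical_link_tag(original_url):
    """Return the link element that belongs inside the page's <head>."""
    # escape(..., quote=True) keeps the href attribute well-formed even if
    # the URL contains characters such as & or double quotes.
    return '<link rel="canonical" href="{}" />'.format(
        escape(original_url, quote=True)
    )


tag = canonical_link_tag("https://example.com/shoes")
print(tag)
```

Every duplicate version of the page would carry the same tag, each pointing to the one original URL.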
For websites that run on a CMS or an eCommerce platform, you most likely already have a way to implement this, such as a plugin or add-on, which makes life much easier. Some plugins will even make bulk determinations based on known issues within certain frameworks.
For instance, in WordPress, category, tag, and archive pages tend to produce duplicate pages, so a lot of canonical plugins will ask whether you want these pages canonicalized.
Cross Domain Canonicalization
Just like there is on-page and off-page SEO, there is on-site and off-site canonicalization as well.
This topic is actually a tad easier to grasp. Let’s say you have two versions of the same blog post: the first is on your website, and the other is published on the New York Times.
Since the New York Times version of the post would technically be considered duplicate content, we would ask them to add the rel=canonical tag, pointing back to our website. In essence, this tells Googlebot, “Hey, the real version is actually on this site, ignore the New York Times version. Thanks!”
In short, the rel=canonical tag can help you with duplicate content or syndicated content on other websites. There is one catch: you have to have control of those websites. So if you decide to publish content on LinkedIn, you are out of luck, because you can’t edit LinkedIn’s HTML head.
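If a partner does agree to add the tag, you can check that it is really there. The sketch below uses Python's standard html.parser to pull the canonical URL out of a page and report whether it points to a different host. The HTML snippet and both domain names are invented for the example, and a real check would first download the partner's page.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel_values = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attrs.get("href")


def find_canonical(html):
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def is_cross_domain(page_url, canonical_url):
    """True when the canonical URL lives on a different host than the page."""
    return urlsplit(page_url).hostname != urlsplit(canonical_url).hostname


# Hypothetical syndicated copy that points back to the original post.
SYNDICATED_HTML = (
    '<html><head><link rel="canonical" '
    'href="https://yourblog.example/post" /></head><body>...</body></html>'
)
page_url = "https://publisher.example/syndicated/post"
canonical = find_canonical(SYNDICATED_HTML)
print(canonical, is_cross_domain(page_url, canonical))
```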
Rel=Canonical HTTP Headers
Another way to send the rel=canonical signal to Google is through your web server, as an HTTP response header. This approach is a little more difficult to implement, and it has a few pros and a lot of cons.
On the pro side, the rel=canonical HTTP header is great because you can canonicalize PDFs and other resources that aren’t HTML and so have no head section to edit.
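For reference, the header in question is a standard Link response header with rel="canonical". The Python sketch below only builds the header name and value. The function name and the PDF URL are placeholders, and how you attach the header to a response depends entirely on your server or framework.

```python
def canonical_link_header(original_url):
    """Build the (name, value) pair for a rel=canonical HTTP response header."""
    # The target URL goes inside angle brackets, followed by the rel parameter.
    return ("Link", '<{}>; rel="canonical"'.format(original_url))


# Hypothetical PDF whose canonical copy lives at this URL.
name, value = canonical_link_header("https://example.com/whitepaper.pdf")
print("{}: {}".format(name, value))
```

The server would send this header along with every duplicate copy of the PDF, naming the one URL you want indexed.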
There are a few obvious cons:
- it is much more difficult to implement than adding the rel=canonical tag to a web page
- it may be difficult to get access to your web server
- if you don’t implement it correctly, you could cause site-wide errors
- it is much more module
- it may not be available at all on your current setup
But fear not, chances are this isn’t that big of a deal. Unless your website is really PDF-heavy, with a lot of PDFs scattered across your website and off-site, this shouldn’t be a problem.