The ‘canonical’ URL of a document is its true URL. Although a document could theoretically be returned on a range of valid URLs, the canonical URL should always be returned alongside the content.
There are many cases where a document is legitimately returned on an alternative URL, such as when tracking or session variables are added onto the end of the URL, or when the content is simply rearranged because a sorting variable has been added. In these cases, you can set a canonical signal to tell the search robot where the original version of the document lives. This ensures that all authority, link, trust and relevance signals are consolidated into the canonical version of the document rather than being diluted across a range of URLs.
|Request URL||Canonical URL|
It is best practice to include a canonical signal on every page as a fail-safe in case the page is linked to using an alternative URL (common causes are differences in capitalisation, an added trailing slash, or tracking or session variables appended to the URL).
The canonical can be implemented either in the HTTP header or in the HTML head.
Note: If more than one canonical element is present on a page, all are likely to be ignored by search indexers.
Link: <http://www.example.com/product.php?item=swedish-fish>; rel="canonical"
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish"/>
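Both forms can be generated from the same canonical URL string. The following is a minimal Python sketch rather than a definitive implementation; the helper names `canonical_link_header` and `canonical_link_tag` are hypothetical and do not belong to any library.

```python
from html import escape


def canonical_link_header(canonical_url: str) -> str:
    """Build the value of an HTTP `Link` response header."""
    return f'<{canonical_url}>; rel="canonical"'


def canonical_link_tag(canonical_url: str) -> str:
    """Build the `<link>` element for the HTML head.

    The URL is escaped so characters such as `&` are safe inside the attribute.
    """
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}"/>'


url = "http://www.example.com/product.php?item=swedish-fish"
print("Link:", canonical_link_header(url))
print(canonical_link_tag(url))
```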
Common Canonical Mistakes
The canonical does not always equal the request URL.
A common mistake is to set the canonical URL to the client’s request URL. This is easy to implement and looks correct when the page is requested on a valid URL. However, it makes every request URL its own canonical URL, which can quickly bloat the website to millions of valid pages and opens it up to potential negative SEO attacks.
A developer may populate the canonical tag from the request URI. If a search engine then follows a link with an extra variable (such as a tracking variable), that variant is declared canonical in its own right, and the result is duplication caused by these variables.
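The effect can be illustrated with a short Python sketch. The parameter names `utm_source` and `sessionid` are assumptions chosen for illustration, and `naive_canonical` is a hypothetical stand-in for the mistaken implementation.

```python
def naive_canonical(request_url: str) -> str:
    # Anti-pattern: echo back whatever URL the client asked for.
    return request_url


# Three requests that all serve the same product page.
request_urls = [
    "http://www.example.com/product.php?item=swedish-fish",
    "http://www.example.com/product.php?item=swedish-fish&utm_source=newsletter",
    "http://www.example.com/product.php?item=swedish-fish&sessionid=abc123",
]

canonicals = {naive_canonical(url) for url in request_urls}
for canonical in sorted(canonicals):
    print(canonical)
```

Each variation declares itself canonical, so one document ends up with three distinct canonical URLs instead of one.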
The canonical does not equal the filename.
The polar opposite of using the request URL is to use only the filename of the document, or to cut out all variables, meaning that some legitimate pages canonicalise back to their parent.
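The sketch below contrasts this mistake with one possible middle ground: keeping only the variables that select different content. This allowlist approach is an illustration, not something prescribed above. `CONTENT_PARAMS`, the function names, and the `utm_source` and `gummy-bears` values are all assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allowlist: variables that change which document is served.
CONTENT_PARAMS = {"item"}


def filename_only_canonical(request_url: str) -> str:
    # Anti-pattern: drop every variable, keeping only the path.
    parts = urlsplit(request_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def allowlist_canonical(request_url: str) -> str:
    # Keep content-selecting variables; drop tracking, session and the rest.
    parts = urlsplit(request_url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in CONTENT_PARAMS)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


fish = "http://www.example.com/product.php?item=swedish-fish&utm_source=newsletter"
bears = "http://www.example.com/product.php?item=gummy-bears"

# Both products collapse onto the parent page.
print(filename_only_canonical(fish))
print(filename_only_canonical(bears))

# Each product keeps its own canonical; the tracking variable is dropped.
print(allowlist_canonical(fish))
print(allowlist_canonical(bears))
```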