Canonical URLs: Telling Search Engines Which to Use
Any given web page can usually be reached through more than one URL. While this is extremely useful in certain situations, such as when you want to view your page before the domain propagates, it can sometimes cause undesired effects, such as search engines treating the different URLs as duplicate pages. Luckily, most search engines, like Google and the Bing Network, understand that a page can have multiple working URLs, so they are usually able to determine correctly which URL is most likely the canonical URL. Simply put, the canonical URL is the preferred URL for a page.
Examples of URLs for a Single Page
Depending on the setup of a particular website, it may be possible to visit the same page by using the following example URLs:
If the domain is an addon domain, it may be possible to view the same page through these hypothetical URLs:
How Search Engines Guess the Canonical (Preferred) URL
It is important to note that even though several URLs may exist for the same file, most search engines (as well as your visitors) will never encounter them or even know that they exist. Search engines are only able to find URLs in a few ways:
- They found a link to your page on a web page they already knew existed.
- They found the link or URL in a site map and/or an RSS feed.
- The link or URL was submitted to them directly, usually via their website.
- Someone visited your page while using that search engine's browser toolbar.
Once search engines find out about the page, they compare it with other pages that appear to be identical or almost exact matches in order to spot duplicates. If they spot a duplicate page, they then try to figure out which URL should be the canonical URL. Although they keep their exact algorithms secret, there are some things that they are known to check:
- Which URL is most commonly used to link to the page
- Which URL is used in the site map and RSS feeds
- Whether a canonical URL is specified in the page's head, using a link element with rel="canonical"
- Whether the URL redirects to another URL
- For Google specifically, whether a canonical URL is specified in Google Webmaster Tools (now Google Search Console)
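A canonical URL declared in the page itself is normally written as a link element with rel="canonical" inside the page's head. As a rough illustration of how a crawler might read that declaration, here is a minimal Python sketch using only the standard library. The class name CanonicalLinkParser, the function find_canonical, and the sample markup with the placeholder domain example.com are all assumptions made for this example. They do not describe how any particular search engine works.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalLinkParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> elements matter, and only the first declaration is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # rel may hold several space-separated values, in any letter case.
        rel_values = attr_map.get("rel", "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonical = attr_map["href"]


def find_canonical(html: str) -> Optional[str]:
    """Return the declared canonical URL, or None if the page has none."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonical


# Hypothetical page markup; example.com is a placeholder domain.
PAGE = """<html><head>
<title>Sample</title>
<link rel="canonical" href="http://www.example.com/page.html">
</head><body>Hello</body></html>"""

print(find_canonical(PAGE))
```

If the page declares no canonical URL, find_canonical returns None, and a search engine would have to fall back on the other signals in the list above.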
Here are some useful resources from Google regarding canonical URLs: