Lots of people think they know something about SEO, which is cool. Every webmaster on the planet should know something about it, just so they’re doing the basic things, such as performing keyword research, filling in the metadata for every page on their sites, and getting backlinks. Simple 101 stuff. But then there are the slightly more advanced techniques, and lots of webmasters get them wrong. Take the rel="canonical" tag, for example, which is recognized by all the major search engines.
This tag goes in your page’s header information (the <head> section) and tells Google which URL is the right one when there are duplicate pages on your website. Asking search engines to index two identical pages without indicating which is the “control” or dominant page could spell trouble for your site. Any time two pages share the same content with only a small variation (such as a different headline when you’re running a split test), rel="canonical" should be involved.
But… There are times when you might use rel=”canonical” in the wrong way.
I just read a great article over at the Google Webmaster Central blog about this, and you should read it, too, if you’re the least bit techy. If not, let me put some of that into terms anyone can understand.
First, here’s what a rel="canonical" tag should look like. Let’s say you’re running an A/B split test and you want your control page to be the dominant one, at least until you see which page wins the test. Here’s the tag that goes on both the dominant page and the test page:
<link rel="canonical" href="http://www.domain.com/page.html/">
On the control page, the href is that page’s own URL. On the test page, the href is the control page’s URL, so both pages carry the same tag.
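If it helps to see that spelled out, here’s a minimal Python sketch that builds the tag for both pages of a split test. The helper name `canonical_tag` and the test page URL are made up for illustration; only the control URL comes from the example above.

```python
from html import escape


def canonical_tag(canonical_url: str) -> str:
    """Return the <link> element naming canonical_url as the preferred page."""
    # Escape the URL so characters like & are safe inside the href attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


# The control URL is the one from the example; the test URL is hypothetical.
CONTROL_URL = "http://www.domain.com/page.html/"
TEST_URL = "http://www.domain.com/page-b.html/"

# Both pages name the control page as canonical, so the tag is identical.
tags_by_page = {
    CONTROL_URL: canonical_tag(CONTROL_URL),
    TEST_URL: canonical_tag(CONTROL_URL),
}

for page, tag in tags_by_page.items():
    print(page, "->", tag)
```

If the test page wins, you’d swap the argument so both entries call `canonical_tag(TEST_URL)` instead.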
Simple enough, right?
Google also says:
“The rel=canonical link consolidates indexing properties from the duplicates, like their inbound links, as well as specifies which URL you’d like displayed in search results.”
Well… There are some situations where webmasters can get confused. Here are six points drawn from the Google article:
- If you have an article that spans three web pages, you shouldn’t declare that page 1 is the canonical, except on that page. Each of those article pages is unique and if you make page 1 canonical and repeat that same link on page 2 and page 3, pages 2 and 3 may never be indexed. You want all of the pages indexed!
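- The canonical version of a page should be very similar to the duplicate page that you’re telling search engines to pass over. If most of the content is different, the two pages aren’t really duplicates and the tag doesn’t belong there.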
- Be sure that the canonical version of the page actually exists. Say you’re running a test with the control page as the canonical page, and the test page wins. Shift the canonical to the winning page, which then becomes the control, and point the canonical tag on the next test page to the new control. Otherwise, the losing control page will continue to be the indexed version in search or, if you deleted it, the canonical link will be broken. This is VERY important for getting the proper search results.
- Make sure that the page that rel=canonical points to doesn’t have a noindex, nofollow robots meta tag and that it’s not excluded in robots.txt. Robots.txt is a file that tells search spiders where they aren’t allowed to go on your site. Having a canonical page be one that you’re telling search spiders to stay away from isn’t a good idea.
- You should include the rel=canonical tag in the <head> section of your HTML page, or in the case of WordPress, you can set a canonical URL using WordPress SEO by Yoast. It’s in the “Advanced” tab of the section on your page where that plugin allows you to write titles and descriptions for each post you make.
- If you specify more than one rel=canonical, Google’s article says all of them will be ignored, so you lose the benefit entirely. Only ONE tag should be set.
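Several of the points above can be checked with a short script. Here’s a rough Python sketch, using only the standard library, that looks at one page you intend to be the canonical version and flags a tag outside the <head>, more than one tag, a noindex robots meta tag, and a robots.txt block. The names `CanonicalAudit` and `find_problems` and the sample data are made up for illustration. It doesn’t fetch anything, so it can’t tell you whether the canonical URL actually exists or how similar two pages are.

```python
from html.parser import HTMLParser
from urllib import robotparser


class CanonicalAudit(HTMLParser):
    """Collect canonical links and robots meta directives from one HTML page."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.head_canonicals = []  # hrefs found inside <head>
        self.body_canonicals = []  # hrefs found anywhere else
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            self.in_head = True
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            found = self.head_canonicals if self.in_head else self.body_canonicals
            found.append(attrs.get("href"))
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            if "noindex" in (attrs.get("content") or "").lower():
                self.noindex = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_problems(page_html, page_url, robots_txt):
    """Return a list of problems for a page meant to be the canonical version."""
    audit = CanonicalAudit()
    audit.feed(page_html)
    problems = []
    if audit.body_canonicals:
        problems.append("rel=canonical found outside <head>")
    if len(audit.head_canonicals) + len(audit.body_canonicals) > 1:
        problems.append("more than one rel=canonical tag")
    if audit.noindex:
        problems.append("page carries a noindex robots meta tag")
    rules = robotparser.RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch("*", page_url):
        problems.append("page is blocked by robots.txt")
    return problems


# Hypothetical page and robots.txt that break every rule checked above.
SAMPLE_HTML = """<html><head>
<link rel="canonical" href="http://www.domain.com/private/page.html/">
<meta name="robots" content="noindex, nofollow">
</head><body>
<link rel="canonical" href="http://www.domain.com/other.html/">
</body></html>"""
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/"

for problem in find_problems(
    SAMPLE_HTML, "http://www.domain.com/private/page.html/", SAMPLE_ROBOTS
):
    print(problem)
```

For a paginated article, you could run the same parser over each page and confirm that the single href it finds matches that page’s own URL rather than page 1.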
Ignoring any of the above is tantamount to shooting yourself in the foot. If you’re adventurous, I encourage you to read the article. Or, if you have questions, just let me know what they are below. Happy to help!