Consolidating Page Requests to a Single Preferred Domain for SEO

by Chris Maresco
President, CAM Development

The Problem

Consider what happens when someone requests the default page on your website. In most cases it can be reached through any of the following URLs:

  • http://www.yourdomain.com
  • http://www.yourdomain.com/index.htm
  • http://www.yourdomain.com/index.html
  • http://yourdomain.com
  • http://yourdomain.com/index.htm
  • http://yourdomain.com/index.html
  • … and maybe even other default filenames (such as default.html) or similar URLs ending in .php or other extensions.

    When someone visits your website using any one of these URLs, Google and other search engines can treat each one as a distinct page, even though they all lead to exactly the same content. Not only does this make tracking your visitors more complicated, it can also lead to a duplicate content penalty and/or reduce your page rank.

    The Solution

    There are several ways to handle this problem, but the one I present below is a good general-purpose solution that is widely used and helps any search engine consolidate your content. More information on other methods can be found at https://support.google.com/webmasters/answer/139066. The solution I chose is to pick a preferred domain format (http://www.yourdomain.com) and redirect all requests to it. You can do this on Apache using .htaccess and mod_rewrite as follows (replace yourdomain.com with your actual domain name):

    [code]
    RewriteEngine On
    # Redirect the bare domain to the www domain, keeping the requested path
    RewriteCond %{HTTP_HOST} ^yourdomain\.com [NC]
    RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [L,R=301]
    # Redirect default filenames (index.* or default.*) to the directory URL
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(default|index)\.(html|php|htm)\ HTTP/
    RewriteRule ^(([^/]+/)*)(default|index)\.(html|php|htm)$ http://www.yourdomain.com/$1 [L,R=301]
    [/code]

    An in-depth explanation of how the redirection works is beyond the scope of this article, but simply put, it redirects the bare domain to the www domain and redirects the common default filenames (index.htm, default.htm, index.html, default.html, index.php and default.php) to a URL without a page name at all. All other page requests on the bare domain (even in sub-directories) are redirected to the www domain plus the page name.

    For example:

  • http://yourdomain.com -> http://www.yourdomain.com
  • http://www.yourdomain.com/index.htm -> http://www.yourdomain.com
  • http://yourdomain.com/sub/page.htm -> http://www.yourdomain.com/sub/page.htm

    In addition to redirecting, you should tell Google which domain you prefer. This helps Google resolve the links it finds while crawling to the proper domain and defines the format for your search listing. For more information see https://support.google.com/webmasters/answer/44231

    Taking it Further

    Recent news from Google suggests that converting your entire site to https will give you an SEO bump, plus it gives your visitors a better sense of security. To do this you will need to obtain a certificate from a certificate authority such as Verisign/Symantec or Comodo, and install it to enable SSL on your website. You then need to redirect requests to the new https links and tell Google that you have moved your domain (see https://support.google.com/webmasters/answer/83106). While I won’t go into all this here, the same problem described above with www and non-www domains arises when distinguishing between http and https links.

    With slight modification, the technique outlined above can also be used to redirect existing traffic to https links. The following code will redirect all http links to https:

    [code]
    RewriteCond %{HTTPS} off
    RewriteRule ^(.*)$ https://www.yourdomain.com/$1 [L,R=301]
    [/code]

    This can be worked into the redirection above or added as an additional redirect condition.
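
    As a minimal sketch of working the two together, the block below folds the https redirect into the www redirect, so that a request made over plain http or for the bare domain is sent to https://www.yourdomain.com in a single hop. It assumes that SSL is handled by Apache itself; if your site sits behind a proxy or load balancer that handles SSL for you, %{HTTPS} may always read off and the first rule could loop, so test it before relying on it. As before, yourdomain.com is a placeholder for your actual domain name.

    [code]
    RewriteEngine On
    # Redirect when the request is plain http OR uses the bare domain
    RewriteCond %{HTTPS} off [OR]
    RewriteCond %{HTTP_HOST} ^yourdomain\.com [NC]
    RewriteRule ^(.*)$ https://www.yourdomain.com/$1 [L,R=301]
    # Redirect default filenames (index.* or default.*) to the https directory URL
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(default|index)\.(html|php|htm)\ HTTP/
    RewriteRule ^(([^/]+/)*)(default|index)\.(html|php|htm)$ https://www.yourdomain.com/$1 [L,R=301]
    [/code]

    The [OR] flag makes the first rule fire when either condition is true; without it, both conditions would have to match before the redirect happened.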

    Summary

    Consolidating all variations of page requests on your website to a single preferred format is fairly simple to do, makes tracking easier, helps SEO and reduces the possibility of a duplicate content penalty. Additionally, redirecting all requests to https can further benefit SEO and provide your visitors with more security.

    Chris Maresco is the President of CAM Development, which develops business card and label printing software at www.camdevelopment.com
