'Eye' Focus: Web Support Tutorials

Dreaded 'NOT FOUND' Errors

Dilemma:

Q: Concerned about losing visitors to your site? Want some help tracking down those pesky 404 errors? Need to improve your site's effectiveness? Do you need to relocate files even though numerous web sites and search engines already link to them?


A: You might need to use a simple web server trick to trap those unwanted errors. At the same time you can enlist the aid of your visitors to keep your site clean and trim.


Solution:

The method is simple. You need two files: err404.html and .htaccess.

err404.html

This is your custom error page. Depending on your abilities as a webmaster, this page can be a simple site map page or a complex means of getting your visitor the information they need.

Basically, this page tells your visitor that the information they were looking for could not be found. It should also provide links and/or a search form where they might actually find the information they wanted. It is especially useful to offer them a way to inform you of the error. What you need is the 'what' and 'where' of the error, and that is very easy to get with a simple HTML form. Then your error page becomes as much an aid to you as it is to your visitor.

.htaccess

This file contains one line per error type that the server should trap. The most common error you would want to trap is the 404 error. Here are some error numbers you might wish to trap:

401 - Unauthorized. The visitor asked for a document in a section that requires web authentication but did not provide a valid username or password. A web access login for an instructional or copyrighted section of your site requires special arrangements. If you have one of these sections in your site, you can provide a friendly error message explaining what is required to gain access. This is much better than telling visitors that they are 'Unauthorized' to see your information (as though they were trying to break in); they may simply have missed some crucial steps required to gain access.

403 - Forbidden. Access is explicitly denied to this document. This might happen because the web server doesn't have read permission for the file being requested. It can also occur when a request is made to a directory in which none of the default files can be located:

  1. index.htm
  2. index.html
  3. index.cfm
  4. index.php
  5. etc.

Directory browsing (browsing a directory's contents without the aid of a web page) is generally turned off on a web server. If none of these files is present in a directory, a 403 error would normally be sent to the visitor. You can soften this error with a custom 403 error page that directs your visitor to a more informative page.

404 - Not found. In some cases, the file may actually exist but the server is protecting it by telling an unauthorized user that the file does not exist, e.g. the file has been locked by the owner. More often than not, the requested page does not actually exist. Whether the visitor followed a link that was wrong or used an invalid URL is beside the point. We want the visitor to ALWAYS find our web site, not an arbitrary error message.

Example: (this all goes on one line)
ErrorDocument 404 /data/notfound.htm

The error trapping applies to the contents of the directory, and sub-directories, in which this .htaccess file appears. Each line in this file contains three parts: the error trap statement, the error number to trap, and the URL to send your visitor to when the error is encountered. This line (as is) would send your visitor to a 404 error page located at /data/notfound.htm on the web server. To use your own error page, simply change the address to that of your own page.
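The three parts of a trap line can be made concrete with a short script. The following is only a sketch in Python: the names ErrorTrap and parse_error_document are invented for this illustration and are not part of the web server, and the script simply splits the example line into its parts.

```python
import shlex
from typing import NamedTuple


class ErrorTrap(NamedTuple):
    directive: str  # the error trap statement, e.g. "ErrorDocument"
    status: int     # the error number to trap, e.g. 404
    target: str     # the address the visitor is sent to


def parse_error_document(line: str) -> ErrorTrap:
    """Split one .htaccess trap line into its three parts."""
    # shlex keeps a quoted third part together and drops '#' comments.
    parts = shlex.split(line, comments=True)
    if len(parts) != 3 or parts[0].lower() != "errordocument":
        raise ValueError(f"not an ErrorDocument line: {line!r}")
    return ErrorTrap(parts[0], int(parts[1]), parts[2])


if __name__ == "__main__":
    print(parse_error_document("ErrorDocument 404 /data/notfound.htm"))
```

Running it on a line for a different error number (401 or 403, for instance) returns the same three fields with that number and its own target address.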

[Remember to use the full address path and test to see that it works! If you refer to a non-existent page within your site, your visitor will NEVER get ANY useful information, and the web server may be tied up following your invalid reference.]
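One way to test the references before visitors find the problem is to check that every error page named in the .htaccess file really exists. This is a rough sketch under stated assumptions: missing_error_pages is a hypothetical helper, doc_root stands for wherever your site's files live on disk, and only addresses beginning with "/" are checked.

```python
import shlex
from pathlib import Path
from typing import List


def missing_error_pages(htaccess_text: str, doc_root: str) -> List[str]:
    """Return the trap targets that do not exist as files under doc_root.

    Only local targets (those starting with "/") are checked; full URLs
    and inline text messages are skipped.
    """
    missing = []
    for line in htaccess_text.splitlines():
        try:
            parts = shlex.split(line, comments=True)
        except ValueError:
            continue  # unbalanced quotes: not a line this sketch can check
        if len(parts) == 3 and parts[0].lower() == "errordocument":
            target = parts[2]
            if target.startswith("/"):
                if not (Path(doc_root) / target.lstrip("/")).is_file():
                    missing.append(target)
    return missing
```

An empty list means every local error page was found. It is still worth requesting a bad address in a browser afterwards, since this check looks only at the files and not at the server's behavior.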

To help search engines index your site properly after site changes or moves, do not forget to use the correct META tag in the HEAD of your new error page. The tag should look like this:
<meta name="robots" content="noindex,follow">

This tag tells a search engine to ignore the content on this page but continue to index the pages it finds from the links listed here.
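If you want to confirm that an error page actually carries this tag, a small check can read the page and report what the robots tag says. This is a minimal sketch using Python's standard HTML parser; RobotsMetaFinder and robots_directives are names made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # "noindex,follow" becomes {"noindex", "follow"}
            self.directives.update(
                part.strip().lower()
                for part in content.split(",")
                if part.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in the given HTML text."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return finder.directives
```

For the tag shown above, the result contains both "noindex" and "follow"; an empty set means the page has no robots tag at all.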

A Real Example

To see a working example of the method outlined above, try visiting the Food-Resource web site. After you arrive at the site, add anything you desire to the web address (URL). Unless you happen to type in a valid URL, you will always see the site's custom error page but will remain at the address you entered.

Don't forget to check out the source of the error page. (Viewing the original source directly via FTP is best since some of the code is interpreted by the web server before it is sent to the visitor's browser.)

This error page collects information about the visitor's browser when the form is submitted. It also collects the address of the referring page to help locate the error. This lets you determine whether the error was related to your site or to the visitor's browser. These are just some additional ways of tracing errors in your site.
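The kind of information described here can be gathered from the variables a web server passes to a CGI-style script. The sketch below is not the code used on the example site. It assumes an Apache server, which sets REDIRECT_URL and REDIRECT_STATUS when it hands a request to a local error page, and the function names and site_host parameter are hypothetical.

```python
from typing import Mapping
from urllib.parse import urlparse


def build_error_report(environ: Mapping[str, str]) -> dict:
    """Gather the 'what' and 'where' of an error from CGI-style variables."""
    return {
        # What was requested (assumes Apache's REDIRECT_* variables).
        "requested_url": environ.get("REDIRECT_URL", environ.get("REQUEST_URI", "")),
        "status": environ.get("REDIRECT_STATUS", ""),
        # Where the visitor came from; empty if the browser sent no referrer.
        "referrer": environ.get("HTTP_REFERER", ""),
        "browser": environ.get("HTTP_USER_AGENT", ""),
    }


def link_origin(referrer: str, site_host: str) -> str:
    """Classify the referring page as internal, external, or absent."""
    if not referrer:
        return "no referrer"
    host = urlparse(referrer).hostname or ""
    return "internal link" if host == site_host.lower() else "external link"
```

In a real CGI script the mapping passed in would be os.environ. An "internal link" result points to a broken link on your own site, while "no referrer" often means a mistyped address or an old bookmark.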

Other HTACCESS Information Sources

Note: If you are looking to password protect your site using HTACCESS you will need to contact our office to have that set up. Generally this functionality is not available for ONID and personal web sites, but it is available for departmental and course sites. We can be contacted via the contact information at the bottom of the page.


[Updated: Sunday, November 18, 2007]