Did you know that you get redirected on a very regular basis? Many people, including most web server admins it seems, don’t completely understand the makeup of the URLs we use so often. Let’s look at a URL and break down its components. We will be using one from this site.
First, of course, we start with the protocol (HTTP). The browser adds it automatically if you leave it off. It tells the browser which type of communication should take place. Internet Explorer can communicate over several different protocols (FTP, LDAP, etc.) and at times needs a hint as to how you want to connect to the remote computer.
Next is the DNS name of the computer (www.cmagic.biz). You could also put in the IP address, but who remembers those? What is the IP address of Google? The name makes it easy for people to remember and find your site again.
Next is the “/” character. Notice that I haven’t gotten to the folder yet (wordpress). The “/” character is an important part of the URL: it not only separates one part from the next, it also indicates that we are requesting a folder rather than an actual page. Remember, I could actually have a file called (wordpress) on my site. Should we show what is in the folder (wordpress) or the contents of the file named (wordpress)? We don’t very often have files and folders with the same name. We also generally add an extension after the file name (.html, .php, .asp, etc.). The web server itself doesn’t really care whether a file has an extension or not (other than to know how to do any special parsing).
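To make the breakdown concrete, here is a small Python sketch that splits a URL into the pieces described so far. The full URL is an assumption on my part, pieced together from the protocol, DNS name and folder mentioned above, and looks_like_folder is just a helper name made up for the illustration.

```python
from urllib.parse import urlsplit

# Assumed example URL, assembled from the parts discussed above.
url = "http://www.cmagic.biz/wordpress/"

parts = urlsplit(url)
print(parts.scheme)    # the protocol
print(parts.hostname)  # the DNS name of the computer
print(parts.path)      # everything from the first "/" onward


def looks_like_folder(path):
    """Return True when the path ends in "/", meaning a folder is requested."""
    return path.endswith("/")


print(looks_like_folder(parts.path))    # trailing "/" present
print(looks_like_folder("/wordpress"))  # no trailing "/": could be a file
```

The second check is the ambiguous case: without the trailing “/”, nothing in the URL itself says whether (wordpress) is a file or a folder.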
When you go to (www.cmagic.biz), you are asking for the web root directory on that server. The web root is the root of the web file system, but it will almost never be the real root of the web server’s file system, which is why we refer to it as the web root. When you request (www.cmagic.biz), any properly configured server will actually redirect you. Asking for the server (www.cmagic.biz) isn’t enough. It needs to know what page you want displayed (or to show you a directory listing).
To keep from confusing you, the web server will redirect to the following address (www.cmagic.biz/). Notice the trailing “/” character. This means “show the root directory on this site”. Without the “/” character, there is no indication which folder the content should actually come from. We can make the assumption that if you come to (www.cmagic.biz), what you really want is the “/” directory.
Once the web server knows which directory to serve, it has to either show a list of the files and folders it contains (if directory browsing is on) or show a default document. Thus, when we go to (www.cmagic.biz), we are actually requesting (www.cmagic.biz/index.php). We just may not know it. It all happens automatically.
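The decision process described above can be sketched in a few lines of Python. This is a simplified model, not how any particular web server is implemented: the resolve function, its parameters and the action names are all made up for illustration, and the choice to let a file win over a folder of the same name is an assumption.

```python
def resolve(path, folders, files, default_document="index.php", browsing=False):
    """Decide what a server might do with a request path.

    folders and files are sets of paths that exist under the web root.
    Returns an (action, target) tuple.
    """
    if path == "":
        # No path at all: assume the root directory was meant.
        return ("redirect", "/")
    if not path.endswith("/"):
        if path in files:
            # Assumption: a file wins over a folder with the same name.
            return ("serve", path)
        if path + "/" in folders:
            # It is a folder, so send the browser to the "/" version.
            return ("redirect", path + "/")
        return ("not found", path)
    if path not in folders:
        return ("not found", path)
    candidate = path + default_document
    if candidate in files:
        return ("serve", candidate)
    if browsing:
        return ("list", path)
    return ("forbidden", path)


# A tiny pretend web root.
folders = {"/", "/wordpress/"}
files = {"/index.php", "/wordpress/index.php"}

print(resolve("", folders, files))
print(resolve("/", folders, files))
print(resolve("/wordpress", folders, files))
print(resolve("/wordpress/", folders, files))
```

Note that the request for (/wordpress) costs an extra round trip: the browser is redirected first and only then gets the default document.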
I had an issue the other day on a site where the redirection was not configured correctly. The site (http://edumed.mine.nu) worked OK until you clicked certain links. The best thing you can do when you design a site is to always add trailing “/” characters to your links when you are linking to a folder (e.g. /test/ instead of /test). With the trailing “/” already in place, your link will work regardless of the server configuration and may save you some headaches.
My problem was that when I built the site (a number of years ago), it was built on a server that was configured correctly. When I recently moved it, the new server would redirect OK (it added the trailing “/”), but it would also add a (www) at the beginning of the URL. When that happens, some browsers see the redirect as a move to a different site (www.edumed.mine.nu could be a different box than edumed.mine.nu). This move would actually make some browsers think that the user had not logged in and show them a page error (you do not have permission to view this page).
My possible solutions were to go through the whole site and update any and all links that didn’t have the trailing “/” in them (can you say tedious and time consuming?) to avoid the redirect completely, or to have the configuration in Apache changed so that the (www) was not added to the beginning.
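For the first option, a small helper can take some of the tedium out of fixing links. The Python sketch below is only a heuristic, and the function name is made up: it assumes that a link whose last segment contains no “.” points to a folder, which is not true on every site (remember that files don’t need an extension), so the results would still need a quick review.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def add_trailing_slash(link):
    """Append "/" to links that appear to point at a folder.

    Heuristic (an assumption): a last path segment without a "." is a folder.
    """
    parts = urlsplit(link)
    path = parts.path
    if not path or path.endswith("/"):
        return link  # nothing to fix
    last_segment = posixpath.basename(path)
    if "." in last_segment:
        return link  # looks like a file such as page.html
    return urlunsplit(parts._replace(path=path + "/"))


print(add_trailing_slash("/test"))
print(add_trailing_slash("/test/"))
print(add_trailing_slash("/page.html"))
print(add_trailing_slash("http://edumed.mine.nu/test?x=1"))
```

Query strings and fragments are kept in place, so only the path part of the link changes.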
Why couldn’t I just send all my users to the (www) address instead? Too many users had never used that address in the past. It would be too confusing and too time consuming to educate all the users. Besides, the server was misconfigured, and it should be set up correctly!
It took some time and effort to convey the problem to the guy managing the server. I am not trying to criticize the service (they are great), but I had to actually send them the fix in the httpd.conf file before they could understand that it was an Apache configuration issue and not a user error. The problem was that while this was a 100% reproducible issue, it was obscure enough, and the understanding of the URL was incomplete on the tech’s end. The Apache documentation describes this process fairly clearly (and also proves that it is a configuration issue).
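When you need to show someone that a redirect is changing the host name, it helps to have a check you can point to. Here is a minimal Python sketch that compares the requested URL with the Location value a server sends back. The function name is made up, the (/test) path is borrowed from the link example earlier, and the Location values are typed in by hand rather than fetched from a live server.

```python
from urllib.parse import urljoin, urlsplit


def redirect_changes_host(requested_url, location):
    """Return True when a redirect sends the browser to a different host name."""
    # A Location value may be relative, so resolve it against the request first.
    target = urljoin(requested_url, location)
    return urlsplit(requested_url).hostname != urlsplit(target).hostname


requested = "http://edumed.mine.nu/test"

# What the misbehaving redirect looked like: "/" added, but also "www".
print(redirect_changes_host(requested, "http://www.edumed.mine.nu/test/"))

# What it should look like: only the trailing "/" is added.
print(redirect_changes_host(requested, "http://edumed.mine.nu/test/"))
```

A redirect that only adds the trailing “/” keeps the host name the same, so the browser still treats it as the same site.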
Am I ranting about poor service? Definitely not! It is important to have a firm understanding of the software you are working with. A lack of it will only make your life more difficult. Hopefully this will spread the word on a poorly understood subject and save many hours of troubleshooting.
Want the Apache docs? Here are the links.
Computer Magic And Software Design