Computer Magic
Software Design Just For You
 
 



Case insensitive URLs with Apache

On a recent project, the request came in to allow for case-insensitive URLs. The site is housed on a *nix box running Apache. Since *nix file systems treat file and folder names as case sensitive, it won’t work natively. To accommodate this request, we turned to our most complicated and obscure friend, ModRewrite (Apache’s mod_rewrite module).

ModRewrite allows you to mess with URLs. You can redirect the requests internally in all sorts of interesting and creative ways. You can even do an external redirect (where the browser actually is instructed to make the request for the new URL). You can even take advantage of regular expressions for pattern matching.

Our project was in a folder (e.g. www.mysite.com/test). The client wanted to be able to put in (www.mysite.com/TEST) or (www.mysite.com/Test) or any other combination. To set up ModRewrite rules, you can either put them in the main httpd.conf file, or you can put them in an .htaccess file. One thing to consider, though, is that in an .htaccess file ModRewrite works from the current directory on. You can’t change the part of the path that comes before your current location. This means that if you put the .htaccess file inside the test directory, you can rewrite the index.html part of (www.mysite.com/test/index.html) but not the test part. To modify the test part, you need to put the .htaccess file in its parent folder. This is important. If you don’t, your rule won’t work or it will cause a 500 error. I fought with this for about 15 minutes before I banged my head against the screen and said doh in my most Homer-like impression.

Placing the .htaccess file in the root of the web server, I can put in this rule…



RewriteEngine On
RewriteRule ^test/(.+)$ test/$1 [nc]

The first item (RewriteEngine On) makes sure that the ModRewrite engine is running. If it isn’t, the rule won’t work.

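The next item is the actual rule. RewriteRule just says “the rule follows”. The ^ character says this has to match the beginning of the URL. Notice that I didn’t use the beginning slash (/). In an .htaccess file, Apache strips the path of the directory holding the file (leading slash included) before it tests the pattern, so the pattern starts with test. In the main server config the path keeps its leading slash, so the pattern would start with ^/test instead.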

The rule says, “All URLs beginning with test, followed by a slash (/), and then at least one more character will match”. The $ sign here means the end of the URL.

Once we establish that something matches, we use the next part of the rule (after the space). This is where to send the request. The default is an internal rewrite, so the browser doesn’t know any better.

If we have a match, we send the request to the test folder (notice all lowercase; this needs to match the actual case of your folder). After the slash, there is the $1. For every section wrapped in parentheses in the previous part of the rule, you get a $ variable. They are assigned in order. So, since we have (.+), $1 becomes whatever that matched. This is a wild card and allows us to shoot requests for (/TEST/about.html) to (/test/about.html).

The last part is options that are applied when matching. By putting nc in the brackets, we are saying “use a non case sensitive match” so TEST and Test will match test when we are checking for matches. This is what gives us our case insensitive matching.
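To illustrate how the $ variables line up with the parentheses, here is a hypothetical variation with two parenthesized sections. The subfolder layout is an assumption made up for the example and wasn't part of the actual project.

```apache
# Hypothetical example: assumes a layout like /test/<subfolder>/<file>
RewriteEngine On

# ([^/]+) captures the subfolder name into $1
# (.+)    captures everything after it into $2
# [nc]    lets TEST, Test, tEsT, etc. all match the literal "test"
RewriteRule ^test/([^/]+)/(.+)$ test/$1/$2 [nc]
```

With this sketch, a request for (/TEST/docs/about.html) would be sent to (/test/docs/about.html). Keep in mind that $1 and $2 are copied exactly as the visitor typed them, so only the literal test part is forced to lowercase. A request for (/TEST/ABOUT.HTML) still ends up looking for ABOUT.HTML on disk.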

One hangup with this rule is that it doesn’t deal with requests where a file isn’t specified (e.g. www.mysite.com/test/index.html vs. www.mysite.com/test/). If you leave the file name off, the rule doesn’t match at all, since (.+) needs at least one character after the slash. I added a couple of rules to account for this. I am sure you could make it work right by tweaking the original rule some more, but this was easier for me and it works just fine. Here are the additional rules to handle the (/) issue.



RewriteRule ^test$ http://www.mysite.com/test/index.html [r=301,nc]
RewriteRule ^test/$ http://www.mysite.com/test/index.html [r=301,nc]

These rules are basically the same as the first one, except we aren’t looking for any old file name after the (/) character. The second rule specifically looks for no file name after the (/) character (www.mysite.com/test/), and the first rule handles no (/) character at all (www.mysite.com/test). Our default page is index.html, so we will redirect to there. In this case, we do a full redirect in which we send the browser a 301 Moved Permanently response (the r=301) telling it to request the specified URL. This is an external redirect: the browser requests the original URL, then is told to request a new one. The previous rule does the translation in the server and the browser never knows the difference. Again, the nc option specifies non case sensitive matches.
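For anyone who would rather tweak the original rule than add two more, here is one possible sketch. It is untested on the original site and assumes that Apache's DirectoryIndex is set to serve index.html for a directory request, which is an assumption about the server setup.

```apache
# Sketch only: assumes this .htaccess sits in the web root and that
# DirectoryIndex serves index.html for /test/
RewriteEngine On

# No trailing slash: externally redirect to the lowercase folder
RewriteRule ^test$ /test/ [r=301,nc,l]

# (.*) instead of (.+) also matches an empty file name, so /TEST/ is
# rewritten internally to /test/ and Apache picks the index page itself
RewriteRule ^test/(.*)$ test/$1 [nc]
```

The l flag stops rule processing once the redirect rule matches. The trade-off is that the browser's address bar shows /test/ rather than /test/index.html.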

If you have trouble making this work, make sure that your .htaccess file is in the right location. You can also try adding the (/) character in front of your rules (e.g. ^/test instead of ^test). Which one you need depends on where the rules live: rules in an .htaccess file are matched without the leading slash, while rules in the main server config are matched with it.
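As a rough sketch of the leading slash version, this is how the first rule might look if it were placed in the main server config instead of an .htaccess file. The placement and the pt flag are assumptions for this example and were not part of the original setup.

```apache
# Assumption: these lines go in httpd.conf or a <VirtualHost> block,
# not in an .htaccess file
RewriteEngine On

# In server context the pattern sees the full URL path, slash included.
# pt (passthrough) hands the result back to Apache as a URL so the
# normal URL-to-file mapping still applies.
RewriteRule ^/test/(.+)$ /test/$1 [nc,pt]
```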

This could be modified to work for all URLs instead of just a test directory. I am sure that someone with more time or experience could find a better way to do this, but I had trouble finding good examples online for this and hope that this will help others avoid a little frustration.
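One possible way to cover every URL is to lowercase the whole path with ModRewrite's built-in tolower map. This is only a sketch of that technique, not what was used on this project. It assumes every file and folder on disk is lowercase, and it has to live in the main server config because RewriteMap isn't allowed in an .htaccess file. The map name lowercase is arbitrary.

```apache
# Assumption: placed in httpd.conf or a <VirtualHost> block, and every
# file and folder name on disk is already lowercase
RewriteEngine On

# Define a map named "lowercase" backed by the internal tolower function
RewriteMap lowercase int:tolower

# Only act when the requested path contains an uppercase letter
RewriteCond %{REQUEST_URI} [A-Z]

# Externally redirect to the all-lowercase version of the path
RewriteRule ^(.*)$ ${lowercase:$1} [r=301,l]
```

Since this redirects any path that contains a capital letter, it would break a site that has mixed-case file names on disk.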

Ray Pulsipher

Owner

Computer Magic And Software Design




Copyright © 2005 Computer Magic And Software Design
(360) 417-6844
computermagic@hotmail.com