mod_rewrite module for Apache

Along with the ability to password protect folders on the web server, another major use of the .htaccess file is the ability to 'rewrite' the URL. This lets you create more structured and easy-to-remember URLs. The module makes the server deliver one page while the address bar keeps showing a different URL.

Application of mod_rewrite

Used in del.icio.us

This method is used to great effect on sites like Wikipedia and del.icio.us. For example, let us take a del.icio.us URL...

http://del.icio.us/binblog/javascript+ajax

This does not mean that there is a folder called binblog in the root of the del.icio.us site. Nor does it mean that there is a file with the name 'javascript+ajax'. This trick is done using URL manipulation. In this example, the URL can be split into three parts...

http://del.icio.us/binblog/javascript+ajax
       ^^^^^^^^^^^ ^^^^^^^ ^^^^^^^^^^^^^^^
        Site URL   User ID      Tags
       del.icio.us binblog javascript and ajax

The actual URL being called may be something like...

http://del.icio.us/show_bookmarks.php?format=html&user=binblog&tags=javascript+ajax

So how does the user see one URL while the server uses another? That is the subtle art of URL manipulation. Before we see the details of this method, a small warning. You may not get the concept at first glance - it may be some time before you understand mod_rewrite completely. So - don't give up, grasshopper. As one person puts it...

Despite the tons of examples and docs, mod_rewrite is voodoo. Damned cool voodoo, but still voodoo.
Brian Moore

Before going any further, let me also warn you that you must know regular expressions to understand how this works. OK, now we can go further.

Using mod_rewrite

First we need a regular expression to extract the necessary elements from the URL. We will ignore the site URL (http://del.icio.us/) part, so the pattern only has to match 'binblog/javascript+ajax'. The URL is

http://del.icio.us/binblog/javascript+ajax

The Regular Expression is...

^([^\/]+)\/([^\/]*)$

The extracted strings will be...

$1 = binblog (First Match)
$2 = javascript+ajax (Second Match)
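If you want to check the pattern outside Apache first, here is a minimal Python sketch. It assumes Python's `re` module behaves like Apache's regular expressions for a pattern this simple. The function name `split_path` is made up for this illustration, and the second group uses `*` as in the RewriteRule in this post, so the tags part may be empty.

```python
import re

# The two-group pattern from this post. Slashes need no escaping in Python.
PATTERN = re.compile(r"^([^/]+)/([^/]*)$")


def split_path(path):
    """Return (user, tags) for a path like 'binblog/javascript+ajax', else None."""
    match = PATTERN.match(path)
    return match.groups() if match else None


if __name__ == "__main__":
    # The site URL part (http://del.icio.us/) is already stripped off.
    print(split_path("binblog/javascript+ajax"))
    print(split_path("no-slash-here"))
```

The first call gives the two captured strings and the second gives None, because a path without a slash does not match the pattern.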

To achieve this effect using the mod_rewrite module, open the .htaccess file in the document root of your site in your favorite editor and type in the following lines...

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^([^\/]+)\/([^\/]*)$ show_bookmarks.php?format=html&user=$1&tags=$2
</IfModule>

Now to test this, create a file called 'show_bookmarks.php' in the document root of your web server. You will need at least Apache and PHP for this - a LAMP setup will do.

After creating the 'show_bookmarks.php' file, enter the following code into it...

<pre><?php print_r($_GET); ?></pre>

Since this is just a test, I did not bother with all the HTML tags like <html>, <body> etc. But if you are a stickler for the rules, go ahead and make the full document.

Next, open this URL in the browser...

http://localhost/show_bookmarks.php?format=html&user=binblog&tags=javascript+ajax

You should get this result - the contents of the GET request...

Array
(
    [format] => html
    [user] => binblog
    [tags] => javascript ajax
)

Now try it with the URL...

http://localhost/binblog/javascript+ajax

If everything went well, this also should have the same output - that is...

Array
(
    [format] => html
    [user] => binblog
    [tags] => javascript ajax
)
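To see why both URLs end up with the same array, here is a rough Python simulation of the two steps: the rule turns the short path into the long one, and then the query string is parsed. This is only a sketch. The function names `rewrite` and `get_params` are invented for it, `urllib.parse.parse_qs` stands in for the way PHP fills $_GET, and Apache details such as flags and escaping are ignored.

```python
import re
from urllib.parse import parse_qs

# The pattern and substitution from the RewriteRule, with $1/$2 written as \1/\2.
RULE = re.compile(r"^([^/]+)/([^/]*)$")
TEMPLATE = r"show_bookmarks.php?format=html&user=\1&tags=\2"


def rewrite(path):
    """Return the substituted target if the rule matches, else the path unchanged."""
    return RULE.sub(TEMPLATE, path)


def get_params(target):
    """Parse the query string into a dict, keeping only the first value per key."""
    query = target.partition("?")[2]
    parsed = parse_qs(query, keep_blank_values=True)
    return {key: values[0] for key, values in parsed.items()}


if __name__ == "__main__":
    target = rewrite("binblog/javascript+ajax")
    print(target)
    print(get_params(target))
```

Note that the '+' in the tags only becomes a space when the query string is decoded, which is why the array shows 'javascript ajax'.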

Explanation

IfModule

<IfModule mod_rewrite.c>

This is an if condition - the directives inside these tags will only be applied if the mod_rewrite module is loaded in Apache.

RewriteEngine

RewriteEngine On

The 'RewriteEngine' directive enables or disables the runtime rewriting engine. Here we are turning the rewrite engine on. Use the value 'Off' if you want to turn off all rewriting.

RewriteRule

RewriteRule ^([^\/]+)\/([^\/]*)$ show_bookmarks.php?format=html&user=$1&tags=$2

This is the important statement. The proper syntax for RewriteRule directive is given below...

RewriteRule Pattern Substitution

The Pattern is a Perl-compatible regular expression.

The Substitution part of the rewriting rule is the string which replaces the original URL that the Pattern has matched. You can use $n to insert the strings captured by the regular expression - $1 is the first capture, $2 is the second and so on.

More than one RewriteRule line can be used. The order of the lines is important, as the second line will use the result of the first substitution as its input.
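The effect of rule order can be sketched in Python. This is an illustration under assumptions, not how Apache is implemented: the first rule (the short form 'u/<user>') is hypothetical and not part of this post, the helper name `apply_rules` is made up, and rule flags are left out entirely.

```python
import re

# (pattern, substitution) pairs, applied from top to bottom.
RULES = [
    # Hypothetical extra rule: expand the short form "u/<user>" to "<user>/".
    (re.compile(r"^u/([^/]+)$"), r"\1/"),
    # The rule from this post, with $1/$2 written as \1/\2.
    (re.compile(r"^([^/]+)/([^/]*)$"),
     r"show_bookmarks.php?format=html&user=\1&tags=\2"),
]


def apply_rules(path, rules=RULES):
    """Feed the output of each rule into the next one, in order."""
    for pattern, substitution in rules:
        path = pattern.sub(substitution, path)
    return path


if __name__ == "__main__":
    print(apply_rules("u/binblog"))
    # Same rules in the opposite order give a different result.
    print(apply_rules("u/binblog", list(reversed(RULES))))
```

In the first call the second rule sees 'binblog/' and produces user=binblog. With the order reversed, the main rule sees 'u/binblog' first and treats 'u' as the user.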

</IfModule>

End of the if condition we started earlier.

Conditions

Some of you may have already spotted a big problem in this approach. To see this problem, create a folder called, say, 'data' in your document root. Now create a file called 'something.txt' in this folder. Then try to access this folder from a browser using the URL...

http://localhost/data/

Now you see the problem, don't you? The above URL will result in the output...

Array
(
    [format] => html
    [user] => data
    [tags] => 
)

Our mod_rewrite rule has captured the URL of a valid folder along with the other URLs. The same happens to the file at http://localhost/data/something.txt. To solve this problem, we will use a feature of mod_rewrite called Conditions. Replace the earlier lines in the .htaccess file with these lines.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\/]+)\/([^\/]*)$ show_bookmarks.php?format=html&user=$1&tags=$2
</IfModule>

See the line RewriteCond %{REQUEST_FILENAME} !-d? This makes sure that the requested path is not an existing directory. The line RewriteCond %{REQUEST_FILENAME} !-f prevents existing files from being caught by the rewrite rule. Both conditions must hold before the RewriteRule that follows them is applied. The logic of these statements looks something like this...

if( 'Requested Filename' IS NOT Directory ) {
 if( 'Requested Filename' IS NOT File ) {
  Rewrite the URL.
 }
}
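The same check can be tried out in Python. This is a sketch of the logic only: the function names `should_rewrite` and `demo` are invented here, `os.path.isdir` and `os.path.isfile` stand in for the -d and -f tests, and a temporary folder plays the role of the document root.

```python
import os
import tempfile


def should_rewrite(document_root, url_path):
    """True when url_path is neither an existing directory nor an existing file."""
    filename = os.path.join(document_root, url_path.lstrip("/"))
    if os.path.isdir(filename):   # the "!-d" condition fails
        return False
    if os.path.isfile(filename):  # the "!-f" condition fails
        return False
    return True


def demo():
    """Recreate the 'data/something.txt' layout from this post in a temp folder."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "data"))
        with open(os.path.join(root, "data", "something.txt"), "w") as handle:
            handle.write("hello")
        paths = ("/data/", "/data/something.txt", "/binblog/javascript+ajax")
        return {path: should_rewrite(root, path) for path in paths}


if __name__ == "__main__":
    for path, rewritten in demo().items():
        print(path, "->", "rewrite" if rewritten else "serve as is")
```

Only the last path, which does not exist on disk, is left for the rewrite rule.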

I hope you got the logic behind this - it took me a while to understand. Anyway, as I said earlier, don't be disappointed if you don't get it on the first try - there is a lot of black magic involved.

Now try to access the file we created a little while back...

http://localhost/data/something.txt

You will see that it works perfectly (hopefully). Now try...

http://localhost/data/

This time the folder itself is accessed. Now try a URL that must be rewritten...

http://localhost/binblog/javascript+ajax

If all goes well, this URL will be caught by our rule and handed to the show_bookmarks.php file.

More about mod_rewrite in the next post (mod_rewrite Directives - RewriteCond and RewriteRule).

4 Comments:

Anonymous said...

More examples please :)

Binny V A said...

For more examples, see the post 'Practical Uses for mod_rewrite'

Anonymous said...

Hi Binny, nice tutorial, I like it.
But when I try to run this I am not getting any output. Can you tell me where to put the .htaccess file?

Binny V A said...

Put the .htaccess file in the same folder your web pages are in.