Removing “.php” from URLs in Apache

There is plenty of advice on the Web for how to map URLs to .php scripts. However, my specific use case seems to be a more obscure one: how to remove the .php extension from existing URLs. That is: pages on the site are still PHP scripts as before, but the “.php” extension is to be removed from URLs for cleanliness. All old URLs with “.php” must redirect to the new URLs without the extension, to avoid creating duplicate pages in search engines. The solution to this problem is a little bit more complex than it seems. If you are using .htaccess files under shared hosting, the following note from the mod_rewrite flags documentation is important:

If you are using RewriteRule in either .htaccess files or in <Directory> sections, it is important to have some understanding of how the rules are processed. The simplified form of this is that once the rules have been processed, the rewritten request is handed back to the URL parsing engine to do what it may with it. It is possible that as the rewritten request is handled, the .htaccess file or <Directory> section may be encountered again, and thus the ruleset may be run again from the start. Most commonly this will happen if one of the rules causes a redirect - either internal or external - causing the request process to start over.

The simplest approach, which avoids the complexity of mod_rewrite, would appear to be RedirectMatch to redirect the old URLs plus MultiViews to automatically find the matching PHP script. For the existing URL /foo.php, the following would apply:

  1. User-agent asks for /foo.php: RedirectMatch issues a redirect to /foo
  2. User-agent asks for /foo: MultiViews internally maps foo back to foo.php

On their own, each step will work. However, when combined, MultiViews is processed first, and this causes the internally-mapped URL /foo.php—that the user-agent did not request on the second pass—to be redirected back to /foo. This creates a redirect loop where, for the user-agent, /foo redirects to /foo endlessly.
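For reference, the naive configuration described above might look roughly like the sketch below. It assumes a root-level .htaccess and the example URL /foo.php, and it is shown only to illustrate the failing combination, not as something to deploy:

```apache
# Naive attempt, shown for illustration only: this is the combination that loops

# Let content negotiation map /foo to foo.php internally
Options +MultiViews

# Redirect any URL ending in .php to the same URL without the extension
RedirectMatch 301 ^/(.*)\.php$ /$1
```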

The only way to stop the internal look-up being picked up by the redirect is to use mod_rewrite, where you have more control. The [last]/[L] flag sounds like it should suffice for this, but it does not: it only ends the current pass through the ruleset. When the rewrite results in an internal redirect, the new URL is itself subjected to rewriting, creating the extremely confusing scenario where %{REQUEST_URI} now contains the internally-resolved URL and not the actual URL that the user-agent submitted! The closest thing Apache offers to the original is %{THE_REQUEST}, which holds the raw request line (e.g. “GET /foo HTTP/1.1”) and has to be pattern-matched to get at the path. The cleaner fix is the [END] flag, available since Apache 2.3.9: an industrial-strength version of [last] that also prevents any further rewrite processing in per-directory (.htaccess) context. Since MultiViews has no such option, it is ruled out.

Likewise, RedirectMatch cannot be used, because it does not allow [END] to be triggered. Consequently, mod_rewrite must be used for the entire process. The complete set of Apache instructions is as follows:

# Ensure MultiViews is disabled, because it takes precedence and causes problems
Options -MultiViews

# Use mod_rewrite exclusively because it is the only way to get enough control over Apache
RewriteEngine On
RewriteBase /

# This part replaces MultiViews and maps the external URL /foo to the internal URL /foo.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.php -f
# Here, we MUST use “END”, otherwise the request is still passed back into mod_rewrite, with REQUEST_URI
# updated to contain the very “.php” we removed with the original redirect, creating a redirect loop;
# [L] does not work here: it only stops rules in the current pass, but the internal redirect triggers a whole
# new pass in which REQUEST_URI no longer shows what the user-agent asked for
RewriteRule (.*) $1.php [END]

# Here, we redirect old URLs to the new URLs; this cannot be RedirectMatch because the [END] on the rule
# above only halts mod_rewrite processing and has no effect on RedirectMatch
RewriteCond %{REQUEST_URI} \.php$
RewriteRule ^(.*)\.php $1 [R=301]
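
To make the difference between [L] and [END] concrete, here is a sketch of the same ruleset with [L] swapped in, annotated with what happens to a request for /foo. It assumes the same root-level .htaccess and an existing foo.php; it is the looping variant, shown for contrast only:

```apache
# Looping variant, for contrast only: identical to the rules above except for [L]
Options -MultiViews
RewriteEngine On
RewriteBase /

# Pass 1, request /foo: not a directory, not a file, but foo.php exists,
# so the request is rewritten to foo.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.php -f
# [L] ends pass 1 only; the rewritten /foo.php re-enters this ruleset
RewriteRule (.*) $1.php [L]

# Pass 2, internal URL /foo.php: the rule above is skipped because foo.php is a
# regular file, but REQUEST_URI now ends in .php, so this rule sends a 301 back
# to /foo and the user-agent starts over at pass 1
RewriteCond %{REQUEST_URI} \.php$
RewriteRule ^(.*)\.php $1 [R=301]
```

With [END] on the first rule, pass 2 never happens, so the redirect rule only ever sees URLs that the user-agent actually requested.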