I would have thought this would be better documented somewhere, but I cannot find much information on the subject.
Basically, I'm using .htaccess to apply three rules on the site I'm working on:
- Redirect / rewrite non-www to www
- Remove the extensions from each of the site pages - they're php files. Doing this means that the site index becomes www.example.co.uk/index instead of www.example.co.uk/index.php, so...
- Redirect / rewrite the www.example.co.uk/index to www.example.co.uk/
This is the script I've compiled from various sources. It does work, but Google doesn't seem to be crawling the site when I point to the extensionless URLs in the sitemap. Any idea why? Thanks in advance.
Options +FollowSymLinks
RewriteEngine On
# Externally redirect client requests for index.php to /
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/#?\ ]+/)*index\.php\ HTTP/
RewriteCond %{HTTP_HOST} ^(www\.example\.co\.uk) [OR]
RewriteCond www.%{HTTP_HOST} ^(www\.example\.co\.uk)
RewriteRule ^(([^/]+/)*)index\.php$ http://%1/$1 [R=301,L]
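# Sketch, not part of the original script: the rule above only catches index.php,
# so a request for the extensionless /index (the third goal listed above) is not
# redirected to /. Assuming the same host name, a rule along these lines would
# externally redirect /index (with or without a query string) to the directory root.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/#?\ ]+/)*index(\?[^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)index$ http://www.example.co.uk/$1 [R=301,L]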
# Redirect example.co.uk to www.example.co.uk for canonicalization; this rule is paired with the index.php rule above
RewriteCond %{HTTP_HOST} ^example\.co\.uk [NC]
RewriteRule ^(.*)$ http://www.example.co.uk/$1 [R=301,L]
# Remove the .php extension from URLs
# If the requested URI does not contain a period in the final path-part
RewriteCond %{REQUEST_URI} !(\.[^./]+)$
# and if it does not exist as a directory
RewriteCond %{REQUEST_FILENAME} !-d
# and if it does not exist as a file
RewriteCond %{REQUEST_FILENAME} !-f
# then add .php to get the actual filename
RewriteRule (.*) /$1.php [L]
# If client request header contains php file extension
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^.]+\.)+php\ HTTP
# externally redirect to extensionless URI
RewriteRule ^(.+)\.php$ http://www.example.co.uk/$1 [R=301,L]