In order to remove index.html or index.htm from URLs I use the following in my .htaccess:
RewriteCond %{REQUEST_URI} /index\.html?$ [NC]
RewriteRule ^(.*)index\.html?$ "/$1" [NC,R=301,NE,L]
This works! (More info about flags at the end of this question *)
Then, in order to add www to URLs, I use the following in my .htaccess:
RewriteCond %{HTTP_HOST} !^www\.mydomain\.com$ [NC]
RewriteRule ^(.*)$ "http://www.mydomain.com/$1" [R=301,NE,L]
This works too!
The question is how to avoid the double redirection that the rules above create in cases like this one:
- the browser asks for http://mydomain.com/path/index.html
- the server sends a 301 header to redirect the browser to http://mydomain.com/path/
- the browser then requests http://mydomain.com/path/
- now the server sends a 301 header to redirect the browser to http://www.mydomain.com/path/
This is obviously not very smart, because a user who asks for http://mydomain.com/path/index.html is redirected twice and will feel that the page loads slowly. Moreover, Googlebot might stop following the link because of the double redirection (I'm not sure about this last point and I don't want to get into a discussion about it; it's just another possible issue).
Thanks!
* For anyone interested:
NC is used to also redirect uppercase file names, e.g. INDEX.HTML or /InDeX.HtM
NE is used to avoid double URL encoding, so that http://.../index.html?hello=ba%20be is not redirected to http://.../index.html?hello=ba%2520be
QSA (not needed, thanks to anubhava's answer) is used to also redirect queries, e.g. http://.../index.html?hello=babe to http://.../?hello=babe
To avoid the double redirection, add another rule to the .htaccess file that matches both conditions, like this:
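A minimal sketch of such a combined rule, assembled from the two rules in the question. The exact rule is an assumption, www.mydomain.com is the question's placeholder host, and `RewriteEngine On` is assumed to be set already. The combined rule goes first, followed by the two original rules for requests that need only one of the fixes:

```apache
# Non-www host AND index.html/index.htm: fix both in one 301
RewriteCond %{HTTP_HOST} !^www\.mydomain\.com$ [NC]
RewriteCond %{REQUEST_URI} /index\.html?$ [NC]
RewriteRule ^(.*)index\.html?$ "http://www.mydomain.com/$1" [NC,R=301,NE,L]

# Host is already www: only strip index.html/index.htm
RewriteCond %{REQUEST_URI} /index\.html?$ [NC]
RewriteRule ^(.*)index\.html?$ "/$1" [NC,R=301,NE,L]

# No index file in the URL: only add www
RewriteCond %{HTTP_HOST} !^www\.mydomain\.com$ [NC]
RewriteRule ^(.*)$ "http://www.mydomain.com/$1" [R=301,NE,L]
```

Because every rule carries the L flag, a request is answered by the first rule it matches, so each request gets at most one redirect.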
So if the input URL is http://mydomain.com/path/index.html then both conditions are satisfied in the first rule here, and there will be a single redirect (301) to http://www.mydomain.com/path/.
Also, I believe the QSA flag is not really needed above since you are NOT manipulating the query string.

Remove the L flag from the prior rule? L forces rule processing to stop (when the rule matches) and thus sends the first rewritten URL without applying the second rule. The rules are applied sequentially from top to bottom, each rewriting the URL again if it matches the rule's conditions and pattern.
Hence the above will first add the www and then remove the index.html?, before sending the new URL: a single redirect for all the rules.

A better solution would be to place the index.html rule ahead of the www rule and, inside the index.html rule, add the www prefix to the destination URL. This way someone requesting http://domain.com/index.html would be sent to http://www.domain.com/ by the first rule. The second (www) rule would then only apply when the www prefix is missing and there is no index.html to remove, which is again only one redirect.
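As a rough sketch of that last suggestion, reusing the patterns and flags from the question (the host www.mydomain.com is the question's placeholder, and `RewriteEngine On` is assumed to be set already), the two rules could be ordered like this:

```apache
# 1) index.html/index.htm rule first, always redirecting to the www host
RewriteCond %{REQUEST_URI} /index\.html?$ [NC]
RewriteRule ^(.*)index\.html?$ "http://www.mydomain.com/$1" [NC,R=301,NE,L]

# 2) www rule second: only reached when the URL had no index file to strip
RewriteCond %{HTTP_HOST} !^www\.mydomain\.com$ [NC]
RewriteRule ^(.*)$ "http://www.mydomain.com/$1" [R=301,NE,L]
```

With this ordering a request for http://mydomain.com/path/index.html matches rule 1 and goes straight to http://www.mydomain.com/path/, while a request for http://mydomain.com/path/ skips rule 1 and is handled by rule 2.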