Tips for debugging .htaccess rewrite rules

2018-12-31 00:33发布

Many posters have problems debugging their RewriteRule and RewriteCond statements within their .htaccess files. Most of these are using a shared hosting service and therefore don't have access to the root server configuration. They cannot avoid using .htaccess files for rewriting and cannot enable a RewriteLogLevel" as many respondents suggest. Also there are many .htaccess-specific pitfalls and constraints are aren't covered well. Setting up a local test LAMP stack involves too much of a learning curve for most.

So my Q here is how would we recommend that they debug their rules themselves. I provide a few suggestions below. Other suggestions would be appreciated.

  1. Understand that the mod_rewrite engine cycles through .htaccess files. The engine runs this loop:

    do
      execute server and vhost rewrites (in the Apache Virtual Host Config)
      find the lowest "Per Dir" .htaccess file on the file path with rewrites enabled
      if found(.htaccess)
         execute .htaccess rewrites (in the user's directory)
    while rewrite occurred
    

    So your rules will get executed repeatedly and if you change the URI path then it may end up executing other .htaccessfiles if they exist. So make sure that you terminate this loop, if necessary by adding extra RewriteCond to stop rules firing. Also delete any lower level .htaccess rewrite rulesets unless explicitly intent to use multi-level rulesets.

  2. Make sure that the syntax of each Regexp is correct by testing against a set of test patterns to make sure that is a valid syntax and does what you intend with a fully range of test URIs. See answer below for more details.

  3. Build up your rules incrementally in a test directory. You can make use of the "execute the deepest .htaccess file on the path feature" to set up a separate test directory (tree) and debug rulesets here without screwing up your main rules and stopping your site working. You have to add them one at a time because this is the only way to localise failures to individual rules.

  4. Use a dummy script stub to dump out server and environment variables. (See Listing 2)If your app uses, say, blog/index.php then you can copy this into test/blog/index.php and use it to test out your blog rules in the test subdirectory. You can also use environment variables to make sure that the rewrite engine in interpreting substitution strings correctly, e.g.

    RewriteRule ^(.*) - [E=TEST0:%{DOCUMENT_ROOT}/blog/html_cache/$1.html]
    

    and look for these REDIRECT_* variables in the phpinfo dump. BTW, I used this one and discovered on my site that I had to use %{ENV:DOCUMENT_ROOT_REAL} instead. In the case of redirector looping REDIRECT_REDIRECT_* variables list the previous pass. Etc..

  5. Make sure that you don't get bitten by your browser caching incorrect 301 redirects. See answer below. My thanks to Ulrich Palha for this.

  6. The rewrite engine seems sensitive to cascaded rules within an .htaccess context, (that is where a RewriteRule results in a substitution and this falls though to further rules), as I found bugs with internal sub-requests (1), and incorrect PATH_INFO processing which can often be prevents by use of the [NS], [L] and [PT] flags.

Any more comment or suggestions?

Listing 1 -- phpinfo

<?php phpinfo(INFO_ENVIRONMENT|INFO_VARIABLES);

14条回答
梦醉为红颜
2楼-- · 2018-12-31 01:04

Some mistakes I observed happens when writing .htaccess

Using of ^(.*)$ repetitively in multiple rules, using ^(.*)$ causes other rules to be impotent in most cases, because it matches all of the url in single hit.

So, if we are using rule for this url sapmle/url it will also consume this url sapmle/url/string.


[L] flag should be used to ensure our rule has done processing.


Should know about:

Difference in %n and $n

%n is matched during %{RewriteCond} part and $n is matches on %{RewriteRule} part.

Working of RewriteBase

The RewriteBase directive specifies the URL prefix to be used for per-directory (htaccess) RewriteRule directives that substitute a relative path.

This directive is required when you use a relative path in a substitution in per-directory (htaccess) context unless any of the following conditions are true:

The original request, and the substitution, are underneath the DocumentRoot (as opposed to reachable by other means, such as Alias). The filesystem path to the directory containing the RewriteRule, suffixed by the relative substitution is also valid as a URL path on the server (this is rare). In Apache HTTP Server 2.4.16 and later, this directive may be omitted when the request is mapped via Alias or mod_userdir.

查看更多
何处买醉
3楼-- · 2018-12-31 01:07

I found this question while trying to debug my mod_rewrite issues, and it definitely has some helpful advice. But in the end the most important thing is to make sure you have your regex syntax correct. Due to problems with my own RE syntax, installing the regexpCheck.php script was not a viable option.

But since Apache uses Perl-Compatible Regular Expressions (PCRE)s, any tool which helps writing PCREs should help. I've used RegexPlanet's tool with Java and Javascript REs in the past, and was happy to find that they support Perl as well.

Just type in your regular expression and one or more example URLs, and it will tell you if the regex matches (a "1" in the "~=" column) and if applicable, any matching groups (the numbers in the "split" column will correspond to the numbers Apache expects, e.g. $1, $2 etc.) for each URL. They claim PCRE support is "in beta", but it was just what I needed to solve my syntax problems.

http://www.regexplanet.com/advanced/perl/index.html

I'd have simply added a comment to an existing answer but my reputation isn't yet at that level. Hope this helps someone.

查看更多
登录 后发表回答