Portable and safe way to get PATH_INFO

Posted 2020-03-02 04:05

Question:

I'm looking for a portable way to obtain the (handy) $_SERVER['PATH_INFO'] variable.

After reading for a while, it turns out that PATH_INFO originates from CGI/1.1 and may not be present in every configuration.

What is the best (mostly security-wise) way to get that variable, apart from extracting it manually (which is a security concern)?

Answer 1:

Well, I'm (almost) sure that figuring out PATH_INFO in an alternative way is simply impossible without using some of the $_SERVER superglobal keys. That being said, let's first list all of the $_SERVER keys that we could possibly use:

  • 'PHP_SELF'
  • 'QUERY_STRING'
  • 'SCRIPT_FILENAME'
  • 'PATH_TRANSLATED'
  • 'SCRIPT_NAME'
  • 'REQUEST_URI'
  • 'PATH_INFO'
  • 'ORIG_PATH_INFO'

We obviously need to ignore the last two. Now we should (I don't know this for a fact; I'm just assuming because you said so) filter out all the keys that appear in the link you provided (which, by the way, is offline at the moment). That leaves us with the following keys:

  • 'PHP_SELF'
  • 'SCRIPT_FILENAME'
  • 'REQUEST_URI'

Regarding your comment on Anthony's answer:

You are just juggling variables now. SCRIPT_FILENAME is a part of the CGI spec. It will not be available if PATH_INFO is unavailable. As for REQUEST_URI, it's apache's mod_rewrite specific. – LiraNuna

I'm running LightTPD/1.4.20-1 (Win32) with PHP 5.3.0 as CGI and cgi.fix_pathinfo = 1, and $_SERVER['REQUEST_URI'] is definitely available to me. I also remember using that same variable back in the days when no one used mod_rewrite, so my honest, humble guess is that you're plain wrong on this point. As for the SCRIPT_FILENAME key, I'm unable to test that one at the moment. Still, if we close our eyes really hard and believe that you're right, that leaves us with only one variable:

  • 'PHP_SELF'

I'm not trying to be harsh here (and I still believe that there are more solutions), but if PHP_SELF is the only key you want us to work with (assuming there are no restrictions on PHP_SELF itself), there is only one solution left:

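function PATH_INFO()
{
 if (array_key_exists('PATH_INFO', $_SERVER) === true)
 {
  return $_SERVER['PATH_INFO'];
 }

 $whatToUse = basename(__FILE__); // see below

 $position = strpos($_SERVER['PHP_SELF'], $whatToUse);

 if ($position === false)
 {
  return ''; // script name not found in PHP_SELF, so there is nothing to extract
 }

 // everything after the script name is the path info
 return substr($_SERVER['PHP_SELF'], $position + strlen($whatToUse));
}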

This function should work. However, there may be problems with the __FILE__ constant, since it returns the path of the file in which __FILE__ is written and not the path of the requested PHP script. That is what $whatToUse is there for: you can base it on 'SCRIPT_FILENAME' instead or, if you really believe in what you are saying, just use '.php'.

You should also read this regarding why not to use PHP_SELF.
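
To illustrate the concern: PHP_SELF contains whatever path the client appended after the script name, so echoing it unescaped into HTML can inject markup. Here is a minimal sketch, assuming the value ends up in HTML output. The helper name safeSelf() is hypothetical, and it takes the server array as a parameter only to keep the example testable:

```php
<?php
// Hypothetical helper: escape PHP_SELF before placing it in HTML.
function safeSelf(array $server): string
{
    $self = isset($server['PHP_SELF']) ? $server['PHP_SELF'] : '';

    // ENT_QUOTES escapes single and double quotes as well,
    // so the result can also be used inside an attribute value.
    return htmlspecialchars($self, ENT_QUOTES, 'UTF-8');
}
```

In a page you would call safeSelf($_SERVER), for instance when filling a form's action attribute.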

If this doesn't work for you, I'm sorry, but I can't think of anything else.

EDIT - Some more reading for you:

  • Drupal request_uri() (why do they keep saying REQUEST_URI is Apache specific?)
  • PHP_SELF vs PATH_INFO vs SCRIPT_NAME vs REQUEST_URI


Answer 2:

I think there is a trick to get the path info another way:

$path_info = str_replace($_SERVER['SCRIPT_NAME'], '', $_SERVER['PHP_SELF']);

For example, when accessing a URL like http://somehost.com/index.php/some/path/here, the value of $path_info would be "/some/path/here".

It has worked for me on various Apache servers running on Windows and Linux, but I'm not 100% sure it's "safe" and "portable". Obviously I haven't tested it on "ALL" server configurations, but it appears to work...



Answer 3:

function getPathInfo() {
    if (isset($_SERVER['PATH_INFO'])) {
        return $_SERVER['PATH_INFO'];
    }  
    $scriptname = preg_quote($_SERVER["SCRIPT_NAME"], '/');
    $pathinfo = preg_replace("/^$scriptname/", "", $_SERVER["PHP_SELF"]);
    return $pathinfo;
}

Edit: without SCRIPT_NAME, and assuming you have DOCUMENT_ROOT (or can define/discover it yourself) and assuming you have SCRIPT_FILENAME, then:

function getPathInfo() {
    if (isset($_SERVER['PATH_INFO'])) {
        return $_SERVER['PATH_INFO'];
    }  
    $docroot = preg_quote($_SERVER["DOCUMENT_ROOT"], "/");
    $scriptname = preg_replace("/^$docroot/", "", $_SERVER["SCRIPT_FILENAME"]);
    $scriptname = preg_quote($scriptname, "/");
    $pathinfo = preg_replace("/^$scriptname/", "", $_SERVER["PHP_SELF"]);
    return $pathinfo;
}

Also @ Anthony (not enough rep to comment, sorry): str_replace() will match anywhere in the string, so it's not guaranteed to work; you want to match only at the start. Also, your method of going back just one slash (via strrpos) to determine SCRIPT_NAME will only work if the script is directly under the root, which is why you're better off diffing SCRIPT_FILENAME against DOCUMENT_ROOT.
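
To make the str_replace() point concrete, here is a small sketch comparing the two behaviours. The helper stripPrefix() and the example values are made up for illustration:

```php
<?php
// Hypothetical helper: remove $prefix from $subject only when it
// appears at the very start of the string.
function stripPrefix(string $subject, string $prefix): string
{
    if ($prefix !== '' && strncmp($subject, $prefix, strlen($prefix)) === 0) {
        // The cast covers older PHP versions, where substr() could return false.
        return (string) substr($subject, strlen($prefix));
    }

    return $subject;
}

// Assumed example values: the path info happens to contain the script name.
$scriptName = '/index.php';
$phpSelf    = '/index.php/files/index.php';

$anywhere   = str_replace($scriptName, '', $phpSelf); // removes every occurrence
$prefixOnly = stripPrefix($phpSelf, $scriptName);     // removes only the leading one
```

With these values, str_replace() yields "/files" while the prefix-only version keeps "/files/index.php".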



Answer 4:

It depends on the definitions for "portable" and "safe".

Let me see if I understood:

1) You are not interested in CLI:

  • you mentioned PHP/CGI
  • PATH_INFO is a piece of a URL, so it only makes sense to discuss PATH_INFO when the script is accessed through a URL (i.e. over an HTTP connection, usually requested by a browser)

2) You want to have PATH_INFO in every OS + HTTP server + PHP combination:

  • OS may be Windows, Linux, etc
  • HTTP server may be Apache 1, Apache 2, NginX, Lighttpd, etc.
  • PHP may be version 4, 5, 6 or any version

Hmmm... PATH_INFO, in the $_SERVER array, is provided by PHP to a running script only under certain conditions, depending on the software mentioned above. It is not always available. The same is true for the entire $_SERVER array!

In short: "$_SERVER depends on the server"... so a portable solution can't rely on $_SERVER... (just to give one example: there is a tutorial on setting up the PHP/CGI $_SERVER variables on the NginX HTTP server at kbeezie.com/view/php-self-path-nginx/)

3) Despite what was said above, it is worth mentioning that if we somehow have the full requested URL available as a string, it is possible to obtain PATH_INFO from it safely by applying regular expressions and other PHP string functions (and by validating the input string as a valid URI).

So, provided that we have the URL string... then YES, WE HAVE a portable and safe way to determine PATH_INFO from it.


Now, we have two clear and focused implementation issues:

  1. How to obtain the URL?
  2. How to obtain the PATH_INFO from the URL?

Among several possibilities, here is a possible approach:

How to obtain the URL?

1) With your deep and comprehensive knowledge about each HTTP server + OS + PHP version combination, check and try each possibility to obtain the URL from the $_SERVER array (verify 'PHP_SELF', 'QUERY_STRING', 'SCRIPT_FILENAME', 'PATH_TRANSLATED', 'SCRIPT_NAME', 'REQUEST_URI', 'PATH_INFO', 'ORIG_PATH_INFO', 'HTTP_HOST', 'DOCUMENT_ROOT' or whatever)

2) If the previous step failed, make the PHP script return JavaScript code that sends the "document.URL" information back. (The portability issue is transferred to the client side.)
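
Step 1 could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name guessRequestUrl() is hypothetical, the 'HTTPS' key is not in the list above and is assumed to be set by the server for TLS requests, and HTTP_HOST comes from the client, so the result should be validated before it is trusted:

```php
<?php
// Hypothetical helper: rebuild the requested URL from whichever keys
// are present. Returns null when there is not enough information.
function guessRequestUrl(array $server): ?string
{
    if (empty($server['HTTP_HOST'])) {
        return null;
    }

    // Assumption: the server sets HTTPS to a non-empty value other than "off".
    $https  = !empty($server['HTTPS']) && strtolower($server['HTTPS']) !== 'off';
    $scheme = $https ? 'https' : 'http';

    if (isset($server['REQUEST_URI'])) {
        // REQUEST_URI already carries the path and the query string.
        $pathAndQuery = $server['REQUEST_URI'];
    } elseif (isset($server['PHP_SELF'])) {
        $pathAndQuery = $server['PHP_SELF'];
        if (isset($server['QUERY_STRING']) && $server['QUERY_STRING'] !== '') {
            $pathAndQuery .= '?' . $server['QUERY_STRING'];
        }
    } else {
        return null;
    }

    return $scheme . '://' . $server['HTTP_HOST'] . $pathAndQuery;
}
```

In a real script you would pass $_SERVER as the argument.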

How to obtain the PATH_INFO from the URL?

This code linked here does this.
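
For readers without access to that link, here is a minimal sketch of the idea. It is not the linked code: the function name pathInfoFromUrl() is hypothetical, and it assumes you already know the script's URL path (for example "/index.php"):

```php
<?php
// Hypothetical helper: derive the path info from a full URL, given the
// URL path of the script. Returns null if the URL cannot be parsed or
// its path does not begin with the script path.
function pathInfoFromUrl(string $url, string $scriptPath): ?string
{
    // parse_url() drops the scheme, host and query string for us.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $scriptPath === '') {
        return null;
    }

    $length = strlen($scriptPath);
    if (strncmp($path, $scriptPath, $length) !== 0) {
        return null; // the script path is not a prefix of the URL path
    }

    $rest = (string) substr($path, $length);

    // The remainder must be empty or start with "/"; otherwise only part
    // of a longer name matched (e.g. "/index.php5" for "/index.php").
    if ($rest !== '' && $rest[0] !== '/') {
        return null;
    }

    return $rest;
}
```

Note that parse_url() does not decode percent-encoded characters, so apply rawurldecode() to the result if you need the decoded form.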

This is my humble opinion and approach to the problem.

What do you think?



Answer 5:

I didn't see the comments or the link before posting. Here is something that might work, based on what the page referenced above gives as CGI-derived variables:

function getPathInfo() {
    if (isset($_SERVER['PATH_INFO'])) {
        return $_SERVER['PATH_INFO'];
    }  

    $script_filename = $_SERVER["SCRIPT_FILENAME"];
    $script_name_start = strrpos($script_filename, "/");
    $script_name = substr($script_filename, $script_name_start);

    //With the above you should have the plain file name of script without path        

    $script_uri = $_SERVER["REQUEST_URI"];
    $script_name_length = strlen($script_name);
    $path_start = $script_name_length + strpos($script_uri, $script_name);

    //You now have the position of where the script name ends in REQUEST_URI

    $pathinfo = substr($script_uri, $path_start);
    return $pathinfo;
}


Answer 6:

You could try

$_ENV['PATH_INFO']; or
getenv('PATH_INFO');
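
A small sketch of how these lookups might be combined. The function name is hypothetical. Keep in mind that $_ENV is only populated when the variables_order setting includes "E", and that getenv() returns false when the variable is not set:

```php
<?php
// Hypothetical helper: look for PATH_INFO in $_SERVER, then $_ENV,
// then the process environment. Returns null if none of them has it.
function pathInfoFromEnvironment(): ?string
{
    if (isset($_SERVER['PATH_INFO'])) {
        return $_SERVER['PATH_INFO'];
    }

    if (isset($_ENV['PATH_INFO'])) {
        return $_ENV['PATH_INFO'];
    }

    $value = getenv('PATH_INFO');

    return $value === false ? null : $value;
}
```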


Tags: php pathinfo