I'm trying to enforce a root directory in a filesystem abstraction. The problem I'm encountering is the following:
The API lets you read and write files, not only to local but also to remote storage, so there is all kinds of normalisation going on under the hood. At the moment it doesn't support relative paths, so something like this isn't possible:
$filesystem->write('path/to/some/../relative/file.txt', 'file contents');
I want to be able to securely resolve the path so that the output would be: path/to/relative/file.txt.
As stated in a GitHub issue that was created for this bug/enhancement (https://github.com/FrenkyNet/Flysystem/issues/36#issuecomment-30319406), it needs to do more than just split the path into segments and remove them accordingly.
Also, since the package handles remote filesystems and non-existing files, realpath is out of the question.
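To illustrate the point about realpath: it resolves against the local disk and returns false when a component of the path is missing, so it cannot normalise paths for files that do not exist yet or that live on a remote storage. A quick check, using a made-up temporary path (the `missing-` prefix is only an illustration):

```php
<?php
// A made-up path whose directory should not exist on the local disk.
$name = uniqid('missing-');
$missing = sys_get_temp_dir() . "/$name/../$name.txt";

// realpath() consults the real filesystem and returns false on failure,
// e.g. when a component of the path does not exist.
var_dump(realpath($missing)); // bool(false)
```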
So, how should one go about dealing with these paths?
To quote Jamie Zawinski:
Some people, when confronted with a problem, think "I know, I'll use regular expressions."
Now they have two problems.
protected function getAbsoluteFilename($filename) {
    $path = [];
    foreach (explode('/', $filename) as $part) {
        // ignore empty parts and "." (strict comparison, so a directory named "0" is kept)
        if ($part === '' || $part === '.') continue;
        if ($part !== '..') {
            // cool, we found a new part
            array_push($path, $part);
        }
        else if (count($path) > 0) {
            // going back up? sure
            array_pop($path);
        } else {
            // now, here we don't like
            throw new \Exception('Climbing above the root is not permitted.');
        }
    }
    // prepend my root directory
    array_unshift($path, $this->getPath());
    return join('/', $path);
}
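For trying this approach outside a class, here is a minimal standalone sketch. The function name `resolve_under_root` and its `$root` parameter are hypothetical stand-ins for the method above and `$this->getPath()`, and the example root `/var/data` is made up; the logic is the same segment stack.

```php
<?php
// Hypothetical standalone variant: $root stands in for $this->getPath().
function resolve_under_root(string $root, string $filename): string
{
    $path = [];
    foreach (explode('/', $filename) as $part) {
        if ($part === '' || $part === '.') {
            continue;                 // skip empty segments and "current directory"
        }
        if ($part !== '..') {
            $path[] = $part;          // descend into a new segment
        } elseif (count($path) > 0) {
            array_pop($path);         // ".." cancels the previous segment
        } else {
            throw new \Exception('Climbing above the root is not permitted.');
        }
    }
    // rtrim avoids a doubled slash if $root ends in "/"
    array_unshift($path, rtrim($root, '/'));
    return join('/', $path);
}

echo resolve_under_root('/var/data', 'path/to/some/../relative/file.txt'), "\n";
// /var/data/path/to/relative/file.txt

try {
    resolve_under_root('/var/data', 'a/../../etc/passwd');
} catch (\Exception $e) {
    echo $e->getMessage(), "\n"; // Climbing above the root is not permitted.
}
```

Because the stack only ever holds segments below the root, the exception fires exactly when a ".." has nothing left to cancel.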
I've worked out how to do this; here is my solution:
/**
 * Normalize path
 *
 * @param string $path
 * @param string $separator
 * @return string normalized path
 */
public function normalizePath($path, $separator = '\\/')
{
    // Remove invisible Unicode characters (category \p{C}: control, format, etc.) and a leading "./"
    $normalized = preg_replace('#\p{C}+|^\./#u', '', $path);
    // Remove self-referring path segments ("/./")
    $normalized = preg_replace('#/\.(?=/)|^\./|\./$#', '', $normalized);
    // Regex for resolving relative paths ("segment/..").
    // Caveat: [^/\.]+ excludes dots, so a dotted segment such as "v1.2/.." is not removed cleanly.
    $regex = '#\/*[^/\.]+/\.\.#Uu';
    while (preg_match($regex, $normalized)) {
        $normalized = preg_replace($regex, '', $normalized);
    }
    // Any remaining "/.." or "../" means the path climbs above the root
    if (preg_match('#/\.{2}|\.{2}/#', $normalized)) {
        throw new LogicException('Path is outside of the defined root, path: [' . $path . '], resolved: [' . $normalized . ']');
    }
    return trim($normalized, $separator);
}
./ current location
../ one level up
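function normalize_path($str) {
    $N = 0; // number of ".." segments still waiting to cancel a segment
    $A = explode("/", $str);
    $B = [];
    // walk the segments from right to left
    for ($i = sizeof($A) - 1; $i >= 0; --$i) {
        $part = trim($A[$i]);
        if ($part === ".") {
            continue; // current location: skip it (also covers "a/././b")
        }
        if ($part === "..") {
            $N++;
        } else {
            if ($N > 0) {
                $N--; // this segment is cancelled by a ".." to its right
            } else {
                $B[] = $A[$i];
            }
        }
    }
    return implode("/", array_reverse($B));
}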
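Note that the function above silently discards any ".." that would climb above the start of the path; "../a", for instance, comes back as "a". If the goal is to enforce a root, one possible variation is to check the leftover counter after the loop and fail instead. The following is only a sketch under some assumptions: the name `normalize_path_strict` is hypothetical, the input is treated as relative to the root, and empty segments (doubled or leading slashes) are ignored.

```php
<?php
// Hypothetical strict variant of the right-to-left scan: unmatched ".." raise an error.
function normalize_path_strict(string $str): string
{
    $pending = 0;   // ".." segments waiting to cancel a segment to their left
    $kept = [];
    $parts = explode('/', $str);
    for ($i = count($parts) - 1; $i >= 0; --$i) {
        $part = $parts[$i];
        if ($part === '' || $part === '.') {
            continue;               // empty segment or current location: nothing to do
        }
        if ($part === '..') {
            $pending++;
        } elseif ($pending > 0) {
            $pending--;             // cancelled by a ".." further right
        } else {
            $kept[] = $part;
        }
    }
    if ($pending > 0) {
        // some ".." had no segment left to cancel, so the path leaves the root
        throw new \LogicException("Path climbs above its starting point: $str");
    }
    return implode('/', array_reverse($kept));
}

echo normalize_path_strict('a/b/c/../../d'), "\n"; // a/d
```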
so:
"a/b/c/../../d" -> "a/d"
"a/./b" -> "a/b"
/**
 * Remove '.' and '..' path parts and make path absolute without
 * resolving symlinks.
 *
 * Examples:
 *
 *   resolvePath("test/./me/../now/", false);
 *   => test/now
 *
 *   resolvePath("test///.///me///../now/", true);
 *   => /home/example/test/now
 *
 *   resolvePath("test/./me/../now/", "/www/example.com");
 *   => /www/example.com/test/now
 *
 *   resolvePath("/test/./me/../now/", "/www/example.com");
 *   => /test/now
 *
 * @access public
 * @param string $path
 * @param mixed $basePath resolve paths relatively to this path. Params:
 *                        STRING: prefix with this path;
 *                        TRUE: use current dir;
 *                        FALSE: keep relative (default)
 * @return string resolved path
 */
function resolvePath($path, $basePath = false) {
    // Make absolute path
    if (substr($path, 0, 1) !== DIRECTORY_SEPARATOR) {
        if ($basePath === true) {
            // Get PWD first to avoid getcwd() resolving symlinks if in symlinked folder
            $path = (getenv('PWD') ?: getcwd()) . DIRECTORY_SEPARATOR . $path;
        } elseif (strlen($basePath)) {
            $path = $basePath . DIRECTORY_SEPARATOR . $path;
        }
    }
    // Resolve '.' and '..'
    $components = array();
    foreach (explode(DIRECTORY_SEPARATOR, rtrim($path, DIRECTORY_SEPARATOR)) as $name) {
        if ($name === '..') {
            array_pop($components);
        } elseif ($name !== '.' && !(count($components) && $name === '')) {
            // … && !(count($components) && $name === '') - we want to keep initial '/' for abs paths
            $components[] = $name;
        }
    }
    return implode(DIRECTORY_SEPARATOR, $components);
}