File Crawler PHP

Posted 2019-05-21 06:55

Question:

Just wondering how it would be possible to recursively search through a website's folder directory (the same one the script is uploaded to), open and read every file, and search for a specific string?

For example, I might have this:

search.php?string=hello%20world

This would run a process and then output something like:

"hello world found inside"

httpdocs
/index.php
/contact.php

httpdocs/private/
../prviate.php
../morestuff.php
../tastey.php

httpdocs/private/love
../../goodness.php

I don't want it to crawl links, as there are private and unlinked files around, but I'd like every other non-binary file to be accessed.

many thanks

Owen

Answer 1:

Two immediate solutions come to mind.

1) Using grep with the exec command (only if the server supports it):

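$query = $_GET['string'];
$found = array();
// escapeshellarg() adds its own single quotes, so don't wrap the result in more quotes.
// "--" stops grep from treating a query that starts with "-" as an option.
exec('grep -Ril -- ' . escapeshellarg($query) . ' ' . escapeshellarg($_SERVER['DOCUMENT_ROOT']), $found);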

Once finished, every file-path that contains the query will be placed in $found. You can iterate through this array and process/display it as needed.
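As a rough sketch of the "process/display" step, the paths in $found could be made relative to the document root and escaped before printing. The helper name format_results() and the sample paths are made up for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper: make each path relative to $root and escape it
// so it is safe to print inside an HTML page.
function format_results(array $found, string $root): array
{
    $root = rtrim($root, '/');
    $lines = array();
    foreach ($found as $path) {
        // Strip the root prefix only when the path really starts with it.
        if (strpos($path, $root . '/') === 0) {
            $path = substr($path, strlen($root));
        }
        $lines[] = htmlspecialchars($path, ENT_QUOTES, 'UTF-8');
    }
    return $lines;
}

// Sample data standing in for the array filled by exec().
$found = array(
    '/var/www/httpdocs/index.php',
    '/var/www/httpdocs/private/love/goodness.php',
);
foreach (format_results($found, '/var/www/httpdocs') as $line) {
    echo $line, "\n";
}
```

In the real script the second argument would be $_SERVER['DOCUMENT_ROOT'] rather than a hard-coded path.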

2) Recursively loop through the folder and open each file, search for the string, and save it if found:

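function search($file, $query, &$found) {
    if (is_file($file)) {
        $contents = file_get_contents($file);
        // strpos() returns 0 for a match at the very start, so compare with !== false
        if ($contents !== false && strpos($contents, $query) !== false) {
            // file contains the query string
            $found[] = $file;
        }
    } elseif (is_dir($file)) {
        // file is a directory
        $dh = opendir($file);
        if ($dh === false) {
            return; // directory is not readable
        }
        // compare with false explicitly so an entry named "0" doesn't end the loop
        while (($entry = readdir($dh)) !== false) {
            if (($entry != '.') && ($entry != '..')) {
                // call search() on the found file/directory
                search($file . '/' . $entry, $query, $found);
            }
        }
        closedir($dh);
    }
}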

$query = $_GET['string'];
$found = array();
search($_SERVER['DOCUMENT_ROOT'], $query, $found);

This should (untested) recursively search into each subfolder/file for the requested string. If it's found, it will be in the variable $found.
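The question also asks to look only at non-binary files, which the code above does not do. One possible way, sketched here with PHP's SPL directory iterators, is to skip any file whose first few kilobytes contain a NUL byte. That check is only a heuristic, and the names looks_binary() and search_text_files() are invented for this example.

```php
<?php
// Hypothetical helper: guess whether a file is binary by looking for a
// NUL byte in its first $sample bytes. A heuristic, not a guarantee.
function looks_binary(string $path, int $sample = 8192): bool
{
    $fh = @fopen($path, 'rb');
    if ($fh === false) {
        return true; // unreadable: treat it as something to skip
    }
    $chunk = fread($fh, $sample);
    fclose($fh);
    return $chunk !== false && strpos($chunk, "\0") !== false;
}

// Hypothetical helper: return the sorted paths of all text files under
// $root whose contents include $query.
function search_text_files(string $root, string $query): array
{
    $found = array();
    if ($query === '') {
        return $found; // nothing sensible to search for
    }
    $it = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($it as $info) {
        $path = $info->getPathname();
        if (!$info->isFile() || looks_binary($path)) {
            continue;
        }
        $contents = file_get_contents($path);
        if ($contents !== false && strpos($contents, $query) !== false) {
            $found[] = $path;
        }
    }
    sort($found);
    return $found;
}
```

It would be called the same way as search(), for instance search_text_files($_SERVER['DOCUMENT_ROOT'], $_GET['string']).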



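Answer 2:

If directory listing is turned on, you can try the code below. Note that scandir() and opendir() only work with stream wrappers that support directories; as far as I know PHP's http:// wrapper does not, so $dir may need to be a local path or an ftp:// URL instead.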

<?php
$dir = "http://www.blah.com/";
foreach(scandir($dir) as $file){
  print '<a href="'.$dir.$file.'">'.$file.'</a><br>';
}
?>

or

<?php
$dir = "http://www.blah.com/";
$dh  = opendir($dir);
while (false !== ($file = readdir($dh))) {
  print '<a href="'.$dir.$file.'">'.$file.'</a><br>';
}
closedir($dh);
?>


Answer 3:

If you cannot use any of the methods mentioned, you could use a recursive directory walk with a callback, where the callback is a function that checks a given file for a given string.
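A minimal sketch of that idea might look like the following. The function names walk_files() and make_matcher() are assumptions made for this example, and symbolic links are skipped to avoid walking in circles.

```php
<?php
// Hypothetical helper: call $callback($path) for every regular file
// found under $dir, descending into subdirectories.
function walk_files(string $dir, callable $callback): void
{
    $entries = @scandir($dir);
    if ($entries === false) {
        return; // directory is not readable
    }
    foreach ($entries as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $path = rtrim($dir, '/') . '/' . $entry;
        if (is_link($path)) {
            continue; // don't follow symlinks
        }
        if (is_dir($path)) {
            walk_files($path, $callback);
        } elseif (is_file($path)) {
            $callback($path);
        }
    }
}

// Hypothetical helper: build a callback that appends to $found every
// file whose contents include $query.
function make_matcher(string $query, array &$found): callable
{
    return function (string $path) use ($query, &$found): void {
        if ($query === '') {
            return;
        }
        $contents = file_get_contents($path);
        if ($contents !== false && strpos($contents, $query) !== false) {
            $found[] = $path;
        }
    };
}
```

Usage would be along the lines of: $found = array(); walk_files($_SERVER['DOCUMENT_ROOT'], make_matcher($_GET['string'], $found)); Keeping the walk separate from the check means the same walker can be reused with a different callback, for example one that only lists file names.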