Question:
Any ideas on an async directory search using fs.readdir? I realise that we could introduce recursion and call the read directory function with the next directory to read, but I am a little worried about it not being async...
Any ideas? I've looked at node-walk, which is great, but it doesn't give me just the files in an array, like readdir does.
Looking for output like...
['file1.txt', 'file2.txt', 'dir/file3.txt']
Answer 1:
There are basically two ways of accomplishing this. In an async environment you'll notice that there are two kinds of loops: serial and parallel. A serial loop waits for one iteration to complete before it moves on to the next, which guarantees that the iterations complete in order. In a parallel loop, all the iterations are started at the same time, and one may complete before another; however, it is usually much faster than a serial loop. So in this case it's probably better to use a parallel loop, because it doesn't matter what order the walk completes in, just as long as it completes and returns the results (unless you want them in order).
A parallel loop would look like this:
var fs = require('fs');
var path = require('path');
var walk = function(dir, done) {
  var results = [];
  fs.readdir(dir, function(err, list) {
    if (err) return done(err);
    // Number of entries still waiting on their stat (or sub-walk) callback.
    var pending = list.length;
    if (!pending) return done(null, results);
    list.forEach(function(file) {
      file = path.resolve(dir, file);
      fs.stat(file, function(err, stat) {
        if (stat && stat.isDirectory()) {
          walk(file, function(err, res) {
            // res is undefined when the sub-walk failed, so skip it then.
            if (res) results = results.concat(res);
            if (!--pending) done(null, results);
          });
        } else {
          results.push(file);
          if (!--pending) done(null, results);
        }
      });
    });
  });
};
A serial loop would look like this:
var fs = require('fs');
var walk = function(dir, done) {
  var results = [];
  fs.readdir(dir, function(err, list) {
    if (err) return done(err);
    var i = 0;
    // next() handles one entry and calls itself only after that entry is done.
    (function next() {
      var file = list[i++];
      if (!file) return done(null, results);
      file = dir + '/' + file;
      fs.stat(file, function(err, stat) {
        if (stat && stat.isDirectory()) {
          walk(file, function(err, res) {
            // res is undefined when the sub-walk failed, so skip it then.
            if (res) results = results.concat(res);
            next();
          });
        } else {
          results.push(file);
          next();
        }
      });
    })();
  });
};
And to test it out on your home directory (WARNING: the results list will be huge if you have a lot of stuff in your home directory):
walk(process.env.HOME, function(err, results) {
  if (err) throw err;
  console.log(results);
});
EDIT: Improved examples.
Answer 2:
A. Have a look at the file module. It has a function called walk:
file.walk(start, callback)
Navigates a file tree, calling callback for each directory, passing in
(null, dirPath, dirs, files).
This may be for you! And yes, it is async. However, I think you would have to aggregate the full paths yourself, if you needed them.
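Aggregating the full paths could look roughly like the sketch below. It assumes the callback receives bare file names in `files`; if the module already passes full paths, the join step is unnecessary. The names `toFullPaths`, `onDirectory` and `all` are made up for illustration.

```javascript
const path = require('path');

// Hypothetical helper: turn one (dirPath, names) pair into full paths.
function toFullPaths(dirPath, names) {
  return names.map((name) => path.join(dirPath, name));
}

// Hypothetical callback in the (err, dirPath, dirs, files) shape described above.
// It appends the files of every visited directory to one shared array.
const all = [];
function onDirectory(err, dirPath, dirs, files) {
  if (err) return;
  all.push(...toFullPaths(dirPath, files));
}
```

Passing `onDirectory` as the callback to `file.walk` would then leave the aggregated list in `all` once the walk has finished.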
B. An alternative, and even one of my favourites: use the Unix find for that. Why do something again that has already been programmed? Maybe not exactly what you need, but still worth checking out:
var execFile = require('child_process').execFile;
execFile('find', [ 'somepath/' ], function(err, stdout, stderr) {
  var file_list = stdout.split('\n');
  /* now you've got a list with full path file names */
});
Find has a nice built-in caching mechanism that makes subsequent searches very fast, as long as only a few folders have changed.
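One detail worth handling: find ends its output with a newline, so a plain split leaves an empty string as the last element, and without `-type f` directories are listed too. A minimal sketch that deals with both follows; `parseFindOutput` and `findFiles` are hypothetical names, and the `maxBuffer` value is an arbitrary choice for large trees.

```javascript
const { execFile } = require('child_process');

// Split find's stdout into paths, dropping the empty string after the final newline.
function parseFindOutput(stdout) {
  return stdout.split('\n').filter((line) => line.length > 0);
}

// Hypothetical wrapper: list regular files under `dir` via `find dir -type f`.
function findFiles(dir, callback) {
  // Raise maxBuffer so a large listing does not overflow the default output limit.
  execFile('find', [dir, '-type', 'f'], { maxBuffer: 64 * 1024 * 1024 }, (err, stdout) => {
    if (err) return callback(err);
    callback(null, parseFindOutput(stdout));
  });
}
```

This only works where a find binary is on the PATH, so it is not portable to a stock Windows install.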
Answer 3:
Just in case anyone finds it useful, I also put together a synchronous version.
var fs = require('fs');
var walk = function(dir) {
  var results = [];
  var list = fs.readdirSync(dir);
  list.forEach(function(file) {
    file = dir + '/' + file;
    var stat = fs.statSync(file);
    if (stat && stat.isDirectory()) {
      /* Recurse into a subdirectory */
      results = results.concat(walk(file));
    } else {
      /* Is a file */
      results.push(file);
    }
  });
  return results;
};
Tip: To use fewer resources when filtering, filter within this function itself. For example, replace results.push(file); with the code below, adjusting as required:
var file_type = file.split(".").pop();
var file_name = file.split(/(\\|\/)/g).pop();
if (file_type == "json") results.push(file);
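The same idea can be made reusable by passing the filter in as a function. The following is only a sketch: `walkFiltered`, `keep` and `isJson` are illustrative names, and it uses `path.extname` rather than splitting on dots.

```javascript
const fs = require('fs');
const path = require('path');

// Synchronous walk that only keeps files accepted by `keep` (a hypothetical parameter).
function walkFiltered(dir, keep) {
  let results = [];
  for (const name of fs.readdirSync(dir)) {
    const full = path.join(dir, name);
    if (fs.statSync(full).isDirectory()) {
      // Recurse first, so rejected files never enter the results array.
      results = results.concat(walkFiltered(full, keep));
    } else if (keep(full)) {
      results.push(full);
    }
  }
  return results;
}

// Example predicate: keep only files whose extension is exactly ".json".
const isJson = (file) => path.extname(file) === '.json';
```

Calling `walkFiltered('some/dir', isJson)` would then return only the JSON files below that directory.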
Answer 4:
Another nice npm package is glob.
npm install glob
It is very powerful and should cover all your recursing needs.
Edit:
I actually wasn't perfectly happy with glob, so I created readdirp.
I'm very confident that its API makes finding files and directories recursively and applying specific filters very easy.
Read through its documentation to get a better idea of what it does and install via:
npm install readdirp
Answer 5:
This one uses the maximum amount of new, buzzwordy features available in node 8, including Promises, util/promisify, destructuring, async-await, map+reduce and more, making your co-workers scratch their heads as they try to figure out what is going on.
Node 8+
No external dependencies.
const { promisify } = require('util');
const { resolve } = require('path');
const fs = require('fs');
const readdir = promisify(fs.readdir);
const stat = promisify(fs.stat);
async function getFiles(dir) {
  const subdirs = await readdir(dir);
  const files = await Promise.all(subdirs.map(async (subdir) => {
    const res = resolve(dir, subdir);
    return (await stat(res)).isDirectory() ? getFiles(res) : res;
  }));
  return files.reduce((a, f) => a.concat(f), []);
}
Usage:
getFiles(__dirname)
  .then(files => console.log(files))
  .catch(e => console.error(e));
Node 10+
Updated for node 10+ with even more whizbang:
const { resolve } = require('path');
const { readdir, stat } = require('fs').promises;
async function getFiles(dir) {
  const subdirs = await readdir(dir);
  const files = await Promise.all(subdirs.map(async (subdir) => {
    const res = resolve(dir, subdir);
    return (await stat(res)).isDirectory() ? getFiles(res) : res;
  }));
  return Array.prototype.concat(...files);
}
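A possible variation, not part of the original answer: since Node 10.10, readdir accepts `{ withFileTypes: true }` and returns `fs.Dirent` objects, so the separate stat call per entry can be dropped. The function name `getFilesWithTypes` is made up for this sketch.

```javascript
const { resolve } = require('path');
const { readdir } = require('fs').promises;

// Variant that asks readdir for Dirent objects, so no separate stat() call is needed.
async function getFilesWithTypes(dir) {
  const entries = await readdir(dir, { withFileTypes: true });
  const nested = await Promise.all(entries.map((entry) => {
    const res = resolve(dir, entry.name);
    // Directories resolve to an array of paths, everything else to a single path.
    return entry.isDirectory() ? getFilesWithTypes(res) : res;
  }));
  return Array.prototype.concat(...nested);
}
```

One behavioural difference to be aware of: a Dirent describes the entry itself, so a symlink that points at a directory is reported as a symlink and is not descended into, whereas the stat-based version follows it.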
Node 11+
If you want to blow everybody's mind completely, you can use the following version using async iterators. In addition to being really cool, it also allows consumers to pull out results one at a time, making it better suited for really large directories.
const { resolve } = require('path');
const { readdir, stat } = require('fs').promises;
async function* getFiles(dir) {
  const subdirs = await readdir(dir);
  for (const subdir of subdirs) {
    const res = resolve(dir, subdir);
    if ((await stat(res)).isDirectory()) {
      yield* getFiles(res);
    } else {
      yield res;
    }
  }
}
Usage has changed because the return type is now an async iterator instead of a promise:
(async () => {
  for await (const f of getFiles('.')) {
    console.log(f);
  }
})()
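To illustrate the pull-based aspect, here is a small sketch of a helper that stops consuming after a fixed number of results. `take` and `numbers` are hypothetical names; `numbers` only stands in for a generator such as getFiles so the example runs on its own.

```javascript
// Collect at most `limit` items from any async iterable, then stop pulling.
async function take(iterable, limit) {
  const items = [];
  if (limit <= 0) return items;
  for await (const item of iterable) {
    items.push(item);
    // Leaving the loop early also finishes the generator, so no further work is done.
    if (items.length >= limit) break;
  }
  return items;
}

// Stand-in endless async generator, used here instead of a directory walk.
async function* numbers() {
  for (let i = 1; ; i++) yield i;
}
```

With the generator version of getFiles, something like `take(getFiles('.'), 10)` would read only as many directories as needed to produce ten paths.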
Answer 6:
I recommend using node-glob to accomplish that task.
var glob = require( 'glob' );
glob( 'dirname/**/*.js', function( err, files ) {
  console.log( files );
});
Answer 7:
If you want to use an npm package, wrench is pretty good.
var wrench = require("wrench");
var files = wrench.readdirSyncRecursive("directory");
wrench.readdirRecursive("directory", function (error, files) {
  // live your dreams
});
EDIT (2018):
For anyone reading this recently: the author deprecated this package in 2015:
wrench.js is deprecated, and hasn't been updated in quite some time. I heavily recommend using fs-extra to do any extra filesystem operations.
Answer 8:
I loved the answer from chjj above and would not have been able to create my version of the parallel loop without that start.
var fs = require("fs");
var tree = function(dir, done) {
  var results = {
    "path": dir,
    "children": []
  };
  fs.readdir(dir, function(err, list) {
    if (err) { return done(err); }
    var pending = list.length;
    if (!pending) { return done(null, results); }
    list.forEach(function(file) {
      fs.stat(dir + '/' + file, function(err, stat) {
        if (stat && stat.isDirectory()) {
          tree(dir + '/' + file, function(err, res) {
            // res is undefined when the subtree could not be read.
            if (res) { results.children.push(res); }
            if (!--pending) { done(null, results); }
          });
        } else {
          results.children.push({"path": dir + "/" + file});
          if (!--pending) { done(null, results); }
        }
      });
    });
  });
};
module.exports = tree;
I created a Gist as well. Comments welcome. I am still starting out in the NodeJS realm, so that is one way I hope to learn more.
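If a flat list is needed after all, a tree of this shape can be collapsed again. The sketch below assumes the `{ path, children }` layout produced above, where file nodes have no `children` property; `flattenTree` is a made-up name.

```javascript
// Flatten a { path, children } tree into a flat array of file paths.
// Nodes without a `children` array are treated as files.
function flattenTree(node) {
  if (!Array.isArray(node.children)) return [node.path];
  return node.children.reduce((acc, child) => acc.concat(flattenTree(child)), []);
}

// Sample tree in the same shape, written by hand for illustration.
const sample = {
  path: 'a',
  children: [
    { path: 'a/x.txt' },
    { path: 'a/b', children: [{ path: 'a/b/y.txt' }] },
    { path: 'a/empty', children: [] }
  ]
};
console.log(flattenTree(sample)); // [ 'a/x.txt', 'a/b/y.txt' ]
```

Empty directories contribute nothing to the result, which matches what the flat walkers in the other answers return.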
Answer 9:
Use node-dir to produce exactly the output you like:
var dir = require('node-dir');
dir.files(__dirname, function(err, files) {
  if (err) throw err;
  console.log(files);
  // we have an array of files now, so now we can iterate that array
  files.forEach(function(path) {
    // `action` is a placeholder for whatever you want to do with each path
    action(null, path);
  });
});
Answer 10:
With Recursion
var fs = require('fs')
var path = process.cwd()
var files = []
var getFiles = function(path, files){
  fs.readdirSync(path).forEach(function(file){
    var subpath = path + '/' + file;
    if(fs.lstatSync(subpath).isDirectory()){
      getFiles(subpath, files);
    } else {
      files.push(path + '/' + file);
    }
  });
}
Calling
getFiles(path, files)
console.log(files) // will log all files in directory
Answer 11:
I've coded this recently, and thought it would make sense to share it here. The code makes use of the async library.
var fs = require('fs');
var async = require('async');
var scan = function(dir, suffix, callback) {
  fs.readdir(dir, function(err, files) {
    if (err) {
      return callback(err);
    }
    var returnFiles = [];
    async.each(files, function(file, next) {
      var filePath = dir + '/' + file;
      fs.stat(filePath, function(err, stat) {
        if (err) {
          return next(err);
        }
        if (stat.isDirectory()) {
          scan(filePath, suffix, function(err, results) {
            if (err) {
              return next(err);
            }
            returnFiles = returnFiles.concat(results);
            next();
          });
        } else {
          if (stat.isFile() && file.indexOf(suffix, file.length - suffix.length) !== -1) {
            returnFiles.push(filePath);
          }
          // Always continue, even for sockets, FIFOs and other non-regular entries.
          next();
        }
      });
    }, function(err) {
      callback(err, returnFiles);
    });
  });
};
You can use it like this:
scan('/some/dir', '.ext', function(err, files) {
  // Do something with files that end in '.ext'.
  console.log(files);
});
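The suffix test in scan is the pre-ES2015 way of writing "ends with". A short sketch of the idiom next to the built-in equivalent; `endsWithSuffix` is a name invented here.

```javascript
// ES5 idiom: search for the suffix starting at the only position where it could begin.
function endsWithSuffix(name, suffix) {
  return name.indexOf(suffix, name.length - suffix.length) !== -1;
}

// String.prototype.endsWith (ES2015) expresses the same check directly.
console.log(endsWithSuffix('photo.ext', '.ext'));  // true
console.log('photo.ext'.endsWith('.ext'));         // true
console.log(endsWithSuffix('photo.ext.bak', '.ext')); // false
```

On any Node version that supports ES2015, `file.endsWith(suffix)` is probably the more readable choice.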
Answer 12:
Using async/await, this should work:
const FS = require('fs');
const Path = require('path');
const { promisify } = require('util'); // Node 8+; on older versions use one of the alternatives below
const readDir = promisify(FS.readdir);
const fileStat = promisify(FS.stat);
async function getFiles(dir) {
  let files = await readDir(dir);
  let result = files.map(file => {
    let path = Path.join(dir, file);
    return fileStat(path).then(stat => stat.isDirectory() ? getFiles(path) : path);
  });
  return flatten(await Promise.all(result));
}
function flatten(arr) {
  return Array.prototype.concat(...arr);
}
You can use Bluebird's Promise.promisify or this:
/**
* Returns a function that will wrap the given `nodeFunction`. Instead of taking a callback, the returned function will return a promise whose fate is decided by the callback behavior of the given node function. The node function should conform to node.js convention of accepting a callback as last argument and calling that callback with error as the first argument and success value on the second argument.
*
* @param {Function} nodeFunction
* @returns {Function}
*/
module.exports = function promisify(nodeFunction) {
  return function(...args) {
    return new Promise((resolve, reject) => {
      nodeFunction.call(this, ...args, (err, data) => {
        if (err) {
          reject(err);
        } else {
          resolve(data);
        }
      });
    });
  };
};
Node 8+ has promisify built in, as util.promisify.
Answer 13:
Check out the final-fs library. It provides a readdirRecursive function:
var ffs = require('final-fs'); // assuming the package is installed under this name
ffs.readdirRecursive(dirPath, true, 'my/initial/path')
  .then(function (files) {
    // in the `files` variable you've got all the files
  })
  .otherwise(function (err) {
    // something went wrong
  });
Answer 14:
A library called Filehound is another option. It will recursively search a given directory (working directory by default). It supports various filters, callbacks, promises and sync searches.
For example, search the current working directory for all files (using callbacks):
const Filehound = require('filehound');
Filehound.create()
  .find((err, files) => {
    if (err) {
      return console.error(`error: ${err}`);
    }
    console.log(files); // array of files
  });
Or using promises and specifying a directory:
const Filehound = require('filehound');
Filehound.create()
  .paths("/tmp")
  .find()
  .each(console.log);
Consult the docs for further use cases and examples of usage: https://github.com/nspragg/filehound
Disclaimer: I'm the author.
Answer 15:
Standalone promise implementation
I am using the when.js promise library in this example.
var fs = require('fs')
  , path = require('path')
  , when = require('when')
  , nodefn = require('when/node/function');
function walk (directory, includeDir) {
  var results = [];
  return when.map(nodefn.call(fs.readdir, directory), function(file) {
    file = path.join(directory, file);
    return nodefn.call(fs.stat, file).then(function(stat) {
      if (stat.isFile()) { return results.push(file); }
      if (includeDir) { results.push(file + path.sep); }
      return walk(file, includeDir).then(function(filesInDir) {
        results = results.concat(filesInDir);
      });
    });
  }).then(function() {
    return results;
  });
}
walk(__dirname).then(function(files) {
  console.log(files);
}).otherwise(function(error) {
  console.error(error.stack || error);
});
I've included an optional parameter includeDir which will include directories in the file listing if set to true.
Answer 16:
klaw and klaw-sync are worth considering for this sort of thing. These were part of node-fs-extra.
Answer 17:
Here's yet another implementation. None of the above solutions have any limiters, so if your directory structure is large, they're all going to thrash and eventually run out of resources.
var async = require('async');
var fs = require('fs');
var resolve = require('path').resolve;
var scan = function(path, concurrency, callback) {
  var list = [];
  // The queue runs at most `concurrency` stat/readdir workers at a time.
  var walker = async.queue(function(path, callback) {
    fs.stat(path, function(err, stats) {
      if (err) {
        return callback(err);
      } else {
        if (stats.isDirectory()) {
          fs.readdir(path, function(err, files) {
            if (err) {
              callback(err);
            } else {
              for (var i = 0; i < files.length; i++) {
                walker.push(resolve(path, files[i]));
              }
              callback();
            }
          });
        } else {
          list.push(path);
          callback();
        }
      }
    });
  }, concurrency);
  walker.push(path);
  // Assigning drain works in async v2 and earlier; async v3 expects walker.drain(fn).
  // Note that the callback receives only the list, not (err, list).
  walker.drain = function() {
    callback(list);
  };
};
Using a concurrency of 50 works pretty well, and is almost as fast as simpler implementations for small directory structures.
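For the promise-based walkers in this thread, the same kind of limit can be had without a library. What follows is a rough sketch of the idea under that assumption, not the async library's implementation; `createLimiter` is a hypothetical name.

```javascript
// Returns a function that runs promise-returning tasks, at most `max` at a time.
function createLimiter(max) {
  let active = 0;
  const waiting = [];
  const next = () => {
    if (active >= max || waiting.length === 0) return;
    active++;
    const { task, resolve, reject } = waiting.shift();
    // Promise.resolve().then(task) also turns a synchronous throw into a rejection.
    Promise.resolve().then(task).then(resolve, reject).finally(() => {
      active--;
      next();
    });
  };
  return (task) => new Promise((resolve, reject) => {
    waiting.push({ task, resolve, reject });
    next();
  });
}
```

The limiter should wrap the individual fs calls, for example `limit(() => stat(file))`, rather than the recursive walk function itself: a parent call that holds a slot while waiting for its children could otherwise leave no slots for those children.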
Answer 18:
The recursive-readdir module has this functionality.
Answer 19:
I modified Trevor Senior's Promise-based answer to work with Bluebird:
var fs = require('fs'),
    path = require('path'),
    Promise = require('bluebird');
var readdirAsync = Promise.promisify(fs.readdir);
var statAsync = Promise.promisify(fs.stat);
function walkFiles (directory) {
  var results = [];
  return readdirAsync(directory).map(function(file) {
    file = path.join(directory, file);
    return statAsync(file).then(function(stat) {
      if (stat.isFile()) {
        return results.push(file);
      }
      return walkFiles(file).then(function(filesInDir) {
        results = results.concat(filesInDir);
      });
    });
  }).then(function() {
    return results;
  });
}
// use
walkFiles(__dirname).then(function(files) {
  console.log(files);
}).catch(function(e) {
  console.error(e);
});
Answer 20:
For fun, here is a flow-based version that works with the highland.js streams library. It was co-authored by Victor Vu.
###
directory >---m------> dirFilesStream >---------o----> out
              |                                 |
              |                                 |
              +--------< returnPipe <-----------+
legend: (m)erge (o)bserve
+ directory has the initial file
+ dirFilesStream does a directory listing
+ out prints out the full path of the file
+ returnPipe runs stat and filters on directories
###
_ = require('highland')
fs = require('fs')
fsPath = require('path')
directory = _(['someDirectory'])
mergePoint = _()
dirFilesStream = mergePoint.merge().flatMap((parentPath) ->
  _.wrapCallback(fs.readdir)(parentPath).sequence().map (path) ->
    fsPath.join parentPath, path
)
out = dirFilesStream
# Create the return pipe
returnPipe = dirFilesStream.observe().flatFilter((path) ->
  _.wrapCallback(fs.stat)(path).map (v) ->
    v.isDirectory()
)
# Connect up the merge point now that we have all of our streams.
mergePoint.write directory
mergePoint.write returnPipe
mergePoint.end()
# Release backpressure. This will print files as they are discovered
out.each _.log
# Another way would be to queue them all up and then print them all out at once.
# out.toArray((files)-> console.log(files))
Answer 21:
Using Promises (Q) to solve this in a functional style:
var fs = require('fs'),
    fsPath = require('path'),
    Q = require('q');
var walk = function (dir) {
  return Q.ninvoke(fs, 'readdir', dir).then(function (files) {
    return Q.all(files.map(function (file) {
      file = fsPath.join(dir, file);
      return Q.ninvoke(fs, 'lstat', file).then(function (stat) {
        if (stat.isDirectory()) {
          return walk(file);
        } else {
          return [file];
        }
      });
    }));
  }).then(function (files) {
    // The initial value [] keeps reduce from throwing on an empty directory.
    return files.reduce(function (pre, cur) {
      return pre.concat(cur);
    }, []);
  });
};
It returns a promise of an array, so you can use it as:
walk('/home/mypath').then(function (files) { console.log(files); });
Answer 22:
I must add the Promise-based sander library to the list.
var sander = require('sander');
sander.lsr(directory).then( filenames => { console.log(filenames) } );
Answer 23:
Using bluebird promise.coroutine:
let promise = require('bluebird'),
    PC = promise.coroutine,
    path = require('path'),
    fs = promise.promisifyAll(require('fs'));
let getFiles = PC(function*(dir){
  let files = [];
  let contents = yield fs.readdirAsync(dir);
  // Drop dot (hidden) files, e.g. on macOS. Filtering into a new array avoids
  // the skipped entries that splicing inside an index loop would cause.
  contents = contents.filter(name => !/^\./.test(name));
  for (let i = 0, l = contents.length; i < l; i++) {
    let content = path.resolve(dir, contents[i]);
    let contentStat = yield fs.statAsync(content);
    if (contentStat && contentStat.isDirectory()) {
      let subFiles = yield getFiles(content);
      files = files.concat(subFiles);
    } else {
      files.push(content);
    }
  }
  return files;
});
// how to use
// easy error handling in one place
getFiles(your_dir).then(console.log).catch(err => console.log(err));
Answer 24:
Because everyone should write his own, I made one.
walk(dir, cb, endCb)
cb(file)
endCb(err | null)
DIRTY
module.exports = walk;
function walk(dir, cb, endCb) {
var fs = require('fs');
var path = require('path');
fs.readdir(dir, function(err, files) {
if (err) {
return endCb(err);
}
var pending = files.length;
if (pending === 0) {
endCb(null);
}
files.forEach(function(file) {
fs.stat(path.join(dir, file), function(err, stats) {
if (err) {
return endCb(err)
}
if (stats.isDirectory()) {
walk(path.join(dir, file), cb, function() {
pending--;
if (pending === 0) {
endCb(null);
}
});
} else {
cb(path.join(dir, file));
pending--;
if (pending === 0) {
endCb(null);
}
}
})
});
});
}
Answer 25:
Check out loaddir:
https://npmjs.org/package/loaddir
npm install loaddir
loaddir = require('loaddir')
allJavascripts = []
loaddir({
  path: __dirname + '/public/javascripts',
  callback: function(){ allJavascripts.push(this.relativePath + this.baseName); }
})
You can use fileName instead of baseName if you need the extension as well.
An added bonus is that it will watch the files as well and call the callback again. There are tons of configuration options to make it extremely flexible.
I just remade the guard gem from Ruby using loaddir in a short while.
Answer 26:
This is my answer. Hope it can help somebody.
My goal was a search routine that can be stopped at any point and that reports, for each file found, its depth relative to the original path.
var _fs = require(\'fs\');
var _path = require(\'path\');
var _defer = process.nextTick;
// next() removes the first element from an array and returns it, together with
// the recursion depth and the array that contained the element. If the first
// element is itself an array, next() digs into it recursively. If the first
// element is an empty array, it is simply removed and ignored.
// e.g. If the original array is [1,[2],3], next() will return [1,0,[[2],3]], and
// the array becomes [[2],3]. If the array is [[[],[1,2],3],4], next() will return
// [1,2,[2]], and the array becomes [[[2],3],4].
// There is an infinite loop `while(true) {...}` because I optimized the code
// into a non-recursive version.
var next = function(c) {
var a = c;
var n = 0;
while (true) {
if (a.length == 0) return null;
var x = a[0];
if (x.constructor == Array) {
if (x.length > 0) {
a = x;
++n;
} else {
a.shift();
a = c;
n = 0;
}
} else {
a.shift();
return [x, n, a];
}
}
}
// cb is the callback function. It takes four arguments:
// 1) an error object, if any exception happens;
// 2) a path name, which may be a directory or a file;
// 3) a flag: `true` means directory, `false` means file;
// 4) a zero-based number indicating the depth relative to the original path.
// cb should return a state value to tell whether the search should
// continue: `true` means continue; `false` means stop here.
// For a directory there is a third state, `null`, which means do not
// descend into the directory but continue with the next file.
var ls = function(path, cb) {
// use `_path.resolve()` to correctly handle '.' and '..'.
var c = [ _path.resolve(path) ];
var f = function() {
var p = next(c);
p && s(p);
};
var s = function(p) {
_fs.stat(p[0], function(err, ss) {
if (err) {
// use `_defer()` to turn a recursive call into a non-recursive call.
cb(err, p[0], null, p[1]) && _defer(f);
} else if (ss.isDirectory()) {
var y = cb(null, p[0], true, p[1]);
if (y) r(p);
else if (y == null) _defer(f);
} else {
cb(null, p[0], false, p[1]) && _defer(f);
}
});
};
var r = function(p) {
_fs.readdir(p[0], function(err, files) {
if (err) {
cb(err, p[0], true, p[1]) && _defer(f);
} else {
// not use `Array.prototype.map()` because we can make each change on site.
for (var i = 0; i < files.length; i++) {
files[i] = _path.join(p[0], files[i]);
}
p[2].unshift(files);
_defer(f);
}
});
}
_defer(f);
};
var printfile = function(err, file, isdir, n) {
  if (err) {
    console.log('--> ' + ('[' + n + '] ') + file + ': ' + err);
    return true;
  } else {
    console.log('... ' + ('[' + n + '] ') + (isdir ? 'D' : 'F') + ' ' + file);
    return true;
  }
};
var path = process.argv[2];
ls(path, printfile);
Answer 27:
Here's a recursive method of getting all files including subdirectories.
const FileSystem = require("fs");
const Path = require("path");
//...
function getFiles(directory) {
  directory = Path.normalize(directory);
  const entries = FileSystem.readdirSync(directory).map((file) => directory + Path.sep + file);
  // Build a new array rather than splicing into the one being iterated, which
  // would shift later entries out of forEach's reach and leave directories unexpanded.
  return entries.reduce((files, entry) =>
    files.concat(FileSystem.statSync(entry).isDirectory() ? getFiles(entry) : entry), []);
}
Answer 28:
Another simple and helpful one:
const fs = require('fs');
function walkDir(root) {
  const stat = fs.statSync(root);
  if (stat.isDirectory()) {
    // Skip hidden entries (names starting with a dot).
    const dirs = fs.readdirSync(root).filter(item => !item.startsWith('.'));
    let results = dirs.map(sub => walkDir(`${root}/${sub}`));
    return [].concat(...results);
  } else {
    return root;
  }
}
Answer 29:
This is how I use the Node.js fs.readdir function to recursively search a directory.
const fs = require('fs');
const mime = require('mime-types');
const readdirRecursivePromise = path => {
return new Promise((resolve, reject) => {
fs.readdir(path, (err, directoriesPaths) => {
if (err) {
reject(err);
} else {
if (directoriesPaths.indexOf('.DS_Store') != -1) {
directoriesPaths.splice(directoriesPaths.indexOf('.DS_Store'), 1);
}
directoriesPaths.forEach((e, i) => {
directoriesPaths[i] = statPromise(`${path}/${e}`);
});
Promise.all(directoriesPaths).then(out => {
resolve(out);
}).catch(err => {
reject(err);
});
}
});
});
};
const statPromise = path => {
return new Promise((resolve, reject) => {
fs.stat(path, (err, stats) => {
if (err) {
reject(err);
} else {
if (stats.isDirectory()) {
readdirRecursivePromise(path).then(out => {
resolve(out);
}).catch(err => {
reject(err);
});
} else if (stats.isFile()) {
resolve({
'path': path,
'type': mime.lookup(path)
});
} else {
reject(`Error parsing path: ${path}`);
}
}
});
});
};
const flatten = (arr, result = []) => {
for (let i = 0, length = arr.length; i < length; i++) {
const value = arr[i];
if (Array.isArray(value)) {
flatten(value, result);
} else {
result.push(value);
}
}
return result;
};
Let's say you have a directory called 'database' in your Node project's root. Once this promise is resolved, it should spit out an array of every file under 'database'.
readdirRecursivePromise('database').then(out => {
  console.log(flatten(out));
}).catch(err => {
  console.log(err);
});
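On Node 11 and later, the hand-written flatten helper could likely be replaced with the built-in `Array.prototype.flat`. A small sketch with made-up sample data in the same nested shape:

```javascript
// Nested result shaped like the output of the recursive promise (sample data only).
const nested = [
  { path: 'database/a.json' },
  [{ path: 'database/sub/b.json' }, []]
];

// Passing Infinity flattens every level of nesting; empty arrays simply disappear.
const files = nested.flat(Infinity);
console.log(files.map((f) => f.path)); // [ 'database/a.json', 'database/sub/b.json' ]
```

With that, the usage above would become `console.log(out.flat(Infinity))`.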
Answer 30:
Yet another answer, but this time using TypeScript:
import * as fs from 'fs';
import * as path from 'path';
/**
 * Recursively walk a directory asynchronously and obtain all file names (with full path).
 *
 * @param dir Folder name you want to recursively process
 * @param done Callback function, returns all files with full path.
 * @param filter Optional filter to specify which files to include,
 *   e.g. for json files: (f: string) => /\.json$/.test(f)
 */
const walk = (
  dir: string,
  done: (err: Error | null, results?: string[]) => void,
  filter?: (f: string) => boolean
) => {
  let results: string[] = [];
  fs.readdir(dir, (err: NodeJS.ErrnoException | null, list: string[]) => {
    if (err) {
      return done(err);
    }
    let pending = list.length;
    if (!pending) {
      return done(null, results);
    }
    list.forEach((file: string) => {
      file = path.resolve(dir, file);
      fs.stat(file, (err2, stat) => {
        if (stat && stat.isDirectory()) {
          walk(file, (err3, res) => {
            if (res) {
              results = results.concat(res);
            }
            if (!--pending) {
              done(null, results);
            }
          }, filter);
        } else {
          if (typeof filter === 'undefined' || (filter && filter(file))) {
            results.push(file);
          }
          if (!--pending) {
            done(null, results);
          }
        }
      });
    });
  });
};
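When the filter is a regular expression on the extension, the dot has to be escaped, otherwise it matches any character. A short illustration in plain JavaScript; `loose` and `strict` are names chosen for this sketch.

```javascript
// An unescaped dot matches any character, so names that merely end in "json" pass too.
const loose = /.json$/;
// Escaping the dot restricts the match to a literal ".json" extension.
const strict = /\.json$/;

console.log(loose.test('notes-json'));  // true
console.log(strict.test('notes-json')); // false
console.log(strict.test('data.json'));  // true
```

A filter such as `(f) => strict.test(f)` can be passed as the third argument to walk.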