How to make a stream pipeline simpler

Posted 2019-04-17 15:24

Question:

I think my code needs improvement. I use the array allSummaryTSTLog in both the filter() and map() stages of the stream, so I have to call File.listFiles twice:

public static List<Test> ParserPath(List<String> allLogPath) {

    FilenameFilter filter = new MyFilter("Summary_TSTLog");

    return allLogPath.parallelStream().filter(path -> {
        File testPath = new File(path);

        if (!testPath.isDirectory()) {
            MyLog.log.info("test path : [" + path + "] is not a directory, skipping");
            return false;
        }

        File[] allSummaryTSTLog = testPath.listFiles(filter);
        if (allSummaryTSTLog == null || allSummaryTSTLog.length == 0) {
            MyLog.log.info("test path : [" + path + "] has no Summary_TSTLog files");
            return false;
        }
        return true;
    }).map(path -> {
        String[] nameTempStr = path.split("\\\\");
        String testName = nameTempStr[nameTempStr.length - 1];

        File[] allSummaryTSTLog = new File(path).listFiles(filter);

        return new Test(testName, Arrays.asList(allSummaryTSTLog));
    }).collect(Collectors.toList());
}

How can I call File.listFiles() to create allSummaryTSTLog only once?

Answer 1:

It's not uncommon to want to filter stream elements based on some computation, and then reuse the result of that computation in a later pipeline stage. You could of course recompute that result in the later pipeline stage, but it's quite reasonable if you don't want to do that. If the stream elements themselves can't store those computed results, and you don't want to recompute them, you have to create a helper class to carry the original elements and computed results down the stream.

I looked at the data that's used in the pipeline, and here's the helper class that I came up with:

static class FileInfo {
    final String fullName;
    final String lastName;
    final File[] allSummaryTSTLog;

    FileInfo(String n, FilenameFilter fnf) {
        fullName = n;
        String[] tmp = n.split("\\\\");
        lastName = tmp[tmp.length - 1];
        allSummaryTSTLog = new File(n).listFiles(fnf);
    }
}

Note that I've cheated a bit here for brevity. First, instead of checking for isDirectory explicitly, I take advantage of the fact that File.listFiles returns null if the File is not a directory. I also haven't bothered to differentiate the case where the given filename is not a directory vs. the case where it is a directory but contains no matching files. (That might be significant for you, however; I'm not sure.)
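If the distinction between those two cases does matter, the array returned by listFiles is enough to tell them apart. Here is a minimal sketch; ListingCheck, Status and classify are hypothetical names that do not appear in the question. Keep in mind that listFiles also returns null when an I/O error occurs, so the null case really means "not a directory, or unreadable".

```java
import java.io.File;

final class ListingCheck {
    enum Status { NOT_A_DIRECTORY, NO_MATCHES, HAS_MATCHES }

    // Classifies the result of File.listFiles(filter) without a separate isDirectory() call.
    static Status classify(File[] listing) {
        if (listing == null) {
            return Status.NOT_A_DIRECTORY; // listFiles returns null here (or on an I/O error)
        }
        return listing.length == 0 ? Status.NO_MATCHES : Status.HAS_MATCHES;
    }

    public static void main(String[] args) {
        // A path that is not a directory yields a null listing.
        File[] listing = new File("no-such-directory-for-this-sketch").listFiles();
        System.out.println(classify(listing));
    }
}
```

A helper like this could back two separate log messages, as in the original filter() stage, while still listing each directory only once.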

Finally, I do the filename path splitting here in the constructor so that the last component of the name is available when necessary. The splitting requires a local variable. If the splitting were done in a later pipeline stage, the local variable would force the use of a statement lambda instead of an expression lambda, which would make things a lot more cumbersome. The tradeoff is that we might end up doing pathname splitting for files that end up being filtered away, but that doesn't seem like an excessive expense.
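As an aside on the splitting step: the last component can also be obtained as a single expression with File.getName(), which needs no local variable. This is a sketch, not what the code above does. LastNameSketch and lastName are hypothetical names, and it assumes the paths use the separator of the platform the code runs on, whereas split("\\\\") always splits on a backslash.

```java
import java.io.File;

final class LastNameSketch {
    // Returns the last component of the path, using the platform's separator.
    static String lastName(String path) {
        return new File(path).getName();
    }

    public static void main(String[] args) {
        String path = "logs" + File.separator + "run1" + File.separator + "test1";
        System.out.println(lastName(path)); // prints "test1"
    }
}
```

Because it is an expression, the same call would also fit in an expression lambda in a later pipeline stage.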

With this helper class in place, the pipeline can be rewritten as follows:

static List<Test> parserPath(List<String> allLogPath) {
    FilenameFilter filter = new MyFilter("Summary_TSTLog");
    return allLogPath.parallelStream()
        .map(str -> new FileInfo(str, filter))
        .filter(fi -> fi.allSummaryTSTLog != null && fi.allSummaryTSTLog.length > 0)
        .map(fi -> new Test(fi.lastName, Arrays.asList(fi.allSummaryTSTLog)))
        .collect(Collectors.toList());
}

In the first pipeline stage, we map the incoming stream element into an instance of our helper class. The subsequent stages can use the data in the helper class without having to recompute it.
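For a quick experiment, the same idea can be tried without a dedicated class by pairing each path with its listing in a Map.Entry. This is a variation on the approach above, not a replacement for it. CarrierSketch and summaryLogsByPath are hypothetical names, the name.contains(...) filter in main only stands in for MyFilter (whose behavior is not shown in the question), and the result is a map rather than a list of Test objects. The input paths are assumed to be distinct, since Collectors.toMap throws IllegalStateException on duplicate keys.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class CarrierSketch {
    // Lists each directory once, carries (path, listing) pairs down the stream,
    // and keeps only the paths that have at least one matching entry.
    static Map<String, List<File>> summaryLogsByPath(List<String> paths, FilenameFilter filter) {
        return paths.stream()
            .map(p -> new SimpleImmutableEntry<>(p, new File(p).listFiles(filter)))
            .filter(e -> e.getValue() != null && e.getValue().length > 0)
            .collect(Collectors.toMap(e -> e.getKey(), e -> Arrays.asList(e.getValue())));
    }

    public static void main(String[] args) {
        // Assumed stand-in for MyFilter: accept names containing "Summary_TSTLog".
        FilenameFilter filter = (dir, name) -> name.contains("Summary_TSTLog");
        System.out.println(summaryLogsByPath(Arrays.asList(args), filter));
    }
}
```

A named helper class such as FileInfo reads better once more than two values have to travel together; the entry-based form is mainly useful for short pipelines.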



Answer 2:

This is 2015. DON'T USE File.

Moreover, there is no clue at all as to what you want to do (what does your MyFilter do? What is Test?).

Use something like this:

final BiPredicate<Path, BasicFileAttributes> predicate = (path, attrs) -> {
    return attrs.isRegularFile()
        && path.getFileName().toString().equals(something);
};

try (
    // Files.find requires a maxDepth argument; 1 (direct children only) is an assumption here
    final Stream<Path> stream = Files.find(baseDir, 1, predicate);
) {
    // work with the stream
}

Since your post contains zero clues as to what you want to do, this is the best that can be done.

As for how to obtain a Path, see Paths.get(), and the documentation of the java.nio.file package in general.
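To make the above concrete, here is a minimal, self-contained sketch that builds a Path with Paths.get() and searches it with Files.find(). FindSketch, findLogs and the marker parameter are hypothetical names, and matching with contains(...) is only an assumption about what the asker's filter does. Note that, unlike File.listFiles, Files.find throws an IOException if the starting directory does not exist.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FindSketch {
    // Returns the regular files directly inside baseDir whose names contain marker.
    static List<Path> findLogs(String baseDir, String marker) throws IOException {
        Path start = Paths.get(baseDir);
        BiPredicate<Path, BasicFileAttributes> matcher = (path, attrs) ->
            attrs.isRegularFile() && path.getFileName().toString().contains(marker);
        // maxDepth 1 limits the search to the direct entries of the start directory.
        try (Stream<Path> stream = Files.find(start, 1, matcher)) {
            return stream.collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Searches the current working directory; the marker is taken from the question.
        System.out.println(findLogs(".", "Summary_TSTLog"));
    }
}
```

The try-with-resources block matters because the stream returned by Files.find holds open directory handles until it is closed.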