Suppose I have a Java 8 Stream<FileReader> and that I use that stream to map and such; how can I control the closing of the FileReaders used in the stream?
Note that I may not have access to the individual FileReaders, for example:
filenames.map(File::new)
         .filter(File::exists)
         .map(f -> {
             BufferedReader br = null;
             try {
                 br = new BufferedReader(new FileReader(f));
             } catch(Exception e) {}
             return Optional.ofNullable(br);
         })
         .filter(Optional::isPresent)
         .map(Optional::get)
         .flatMap(...something that reads the file contents...) // From here on, the Stream no longer contains anything that gives access to the FileReaders
After doing some other mappings, etc., I eventually lose track of the FileReaders further down the pipeline.
I first thought the garbage collector would take care of it when needed, but I've experienced OS file descriptor exhaustion when filenames is a long Stream.
A general note on the use of FileReader: FileReader internally uses a FileInputStream, which overrides finalize() and is therefore discouraged because of the impact it has on garbage collection, especially when dealing with lots of files.
Unless you're using a Java version prior to Java 7, you should use the java.nio.file API instead, creating a BufferedReader with
Path path = Paths.get(filename);
BufferedReader br = Files.newBufferedReader(path);
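For a single file, a minimal self-contained sketch of that approach might look like this (the file name example.txt is just a placeholder); the try-with-resources block guarantees the underlying file descriptor is released:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class NioReadSketch {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get("example.txt"); // example file name
        // try-with-resources closes the reader (and its file descriptor) even if reading fails
        try (BufferedReader br = Files.newBufferedReader(path)) {
            br.lines().forEach(System.out::println);
        }
    }
}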
So the beginning of your stream pipeline should look more like
filenames.map(Paths::get)
         .filter(Files::exists)
         .map(p -> {
             try {
                 return Optional.of(Files.newBufferedReader(p));
             } catch (IOException e) {
                 return Optional.empty();
             }
         })
Now to your problem:
Option 1
One way to preserve the original Reader would be to use a Tuple. A tuple (or any n-ary variation of it) is generally a good way to handle multiple results of a function application, as is done in a stream pipeline:
class ReaderTuple<T> {
    final Reader first;
    final T second;
    ReaderTuple(Reader r, T s) {
        first = r;
        second = s;
    }
}
Now you can map the FileReader to a Tuple with the second item being your current stream item:
filenames.map(Paths::get)
         .filter(Files::exists)
         .map(p -> {
             try {
                 return Optional.of(Files.newBufferedReader(p));
             } catch (IOException e) {
                 return Optional.empty();
             }
         })
         .filter(Optional::isPresent)
         .map(Optional::get)
         .flatMap(r -> Stream.of(new ReaderTuple<>(r, yourOtherItem))) // flatMap must return a Stream
         ....
         .peek(rt -> {
             try {
                 rt.first.close(); //close the reader (or use a try-with-resources)
             } catch(Exception e) {}
         })
         ...
The problem with that approach is that whenever an unchecked exception occurs during stream execution between the flatMap and the peek, the readers might not be closed.
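To make that failure mode concrete, here is a minimal self-contained sketch (the failing check is purely hypothetical, and StringReaders stand in for file readers): the element that triggers the unchecked exception never reaches the peek, so its reader is never closed, and with a parallel stream several in-flight readers could be left open at once.

import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.stream.Stream;

public class ReaderLeakSketch {
    static class ReaderTuple<T> {
        final Reader first;
        final T second;
        ReaderTuple(Reader r, T s) { first = r; second = s; }
    }

    public static void main(String[] args) {
        Stream.of("ok", "boom", "ok")
              .map(s -> new ReaderTuple<>(new BufferedReader(new StringReader(s)), s))
              .map(rt -> {
                  // hypothetical intermediate step that fails with an unchecked exception
                  if ("boom".equals(rt.second)) {
                      throw new IllegalStateException("processing failed");
                  }
                  return rt;
              })
              .peek(rt -> {                 // never reached for the failing element
                  try { rt.first.close(); } catch (Exception e) {}
              })
              .forEach(rt -> System.out.println(rt.second));
    }
}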
Option 2
An alternative to using a tuple is to put the code that requires the reader in a try-with-resources block. This approach has the advantage that you're in control of closing all readers.
Example 1:
filenames.map(Paths::get)
         .filter(Files::exists)
         .map(p -> {
             try (BufferedReader r = Files.newBufferedReader(p)) {
                 return Optional.of(
                     Stream.of(r)
                           .... //put here your stream code that uses the reader; finish it with a terminal operation inside the try
                 );
             } catch (IOException e) {
                 return Optional.empty();
             }
         }) //reader is implicitly closed here
         .... //terminal operation here
Example 2:
filenames.map(Paths::get)
         .filter(Files::exists)
         .map(p -> {
             try {
                 return Optional.of(Files.newBufferedReader(p));
             } catch (IOException e) {
                 return Optional.empty();
             }
         })
         .filter(Optional::isPresent)
         .map(Optional::get)
         .flatMap(reader -> {
             try (Reader r = reader) {
                 //read from your reader here and return the Stream of items to flatten
             } catch (IOException e) { //close() may throw a checked IOException
                 return Stream.empty();
             } //reader is implicitly closed here
         })
Example 1 has the advantage that the reader is guaranteed to be closed. Example 2 is safe unless you put something between the creation of the reader and the try-with-resources block that may fail.
I personally would go for Example 1 and put the code that accesses the reader in a separate function, so the code is more readable.
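A minimal sketch of that refactoring (the method name countLines and the line-counting work are just illustrative assumptions, not part of the answer above):

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.stream.Stream;

public class Example1Refactored {
    public static void main(String[] args) {
        Stream<String> filenames = Stream.of("a.txt", "b.txt"); // example input
        long totalLines = filenames.map(Paths::get)
                                   .filter(Files::exists)
                                   .map(Example1Refactored::countLines) // reader handled entirely in one method
                                   .filter(Optional::isPresent)
                                   .map(Optional::get)
                                   .mapToLong(Long::longValue)
                                   .sum();
        System.out.println(totalLines);
    }

    // Opens, uses and closes the reader in one place; returns empty if the file can't be read.
    static Optional<Long> countLines(Path p) {
        try (BufferedReader r = Files.newBufferedReader(p)) {
            return Optional.of(r.lines().count()); // terminal operation runs before the reader closes
        } catch (IOException e) {
            return Optional.empty();
        }
    }
}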
Perhaps a better solution is to use a Consumer<FileReader> to consume each element in the stream.
Another problem you might run into if there are a lot of files is that they will all be open at the same time. It might be better to close each one as soon as it's done.
Let's say you change the code above into a method that takes a Consumer<BufferedReader>. I probably wouldn't use a stream for this, but we can use one anyway to show how it would work.
public void readAllFiles(Consumer<BufferedReader> consumer) {
    Objects.requireNonNull(consumer);
    filenames.map(File::new)
             .filter(File::exists)
             .forEach(f -> {
                 try (BufferedReader br = new BufferedReader(new FileReader(f))) {
                     consumer.accept(br);
                 } catch(Exception e) {
                     //handle exception
                 }
             });
}
This way we make sure we close each reader and can still support doing whatever the user wants.
For example, this would still work:
readAllFiles( br -> System.out.println( br.lines().count()));
So, if you only have non-binary files you could use something like this:
List<String> fileNames = Arrays.asList(
        "C:\\Users\\wowse\\hallo.txt",
        "C:\\Users\\wowse\\bye.txt");
fileNames.stream()
         .map(Paths::get)
         .filter(Files::exists)
         .flatMap(path -> {
             try {
                 return Files.lines(path);
             } catch (Exception e) {
                 e.printStackTrace();
             }
             return Stream.empty();
         })
         .forEach(System.out::println);
If you have binary files which you can hold in memory, you can try the following approach.
fileNames.stream()
         .map(Paths::get)
         .filter(Files::exists)
         .map(path -> {
             try {
                 return Files.readAllBytes(path);
             } catch (Exception e) {
                 e.printStackTrace();
             }
             return null;
         })
         .filter(Objects::nonNull)
         .map(String::new)
         .forEach(System.out::println);
Other than that, I think you would have to use some wrapper class; I could suggest Map.Entry, or the Pair from JavaFX so you don't have to use any external libraries.
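As a rough sketch of the Map.Entry idea (assuming AbstractMap.SimpleEntry as the concrete implementation and a simple line count as the derived value, neither of which is spelled out above), the entry carries the reader alongside the result so it can be closed later in the pipeline:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class EntryWrapperSketch {
    public static void main(String[] args) {
        List<String> fileNames = Arrays.asList("a.txt", "b.txt"); // example input
        fileNames.stream()
                 .map(Paths::get)
                 .filter(Files::exists)
                 .map(path -> {
                     try {
                         BufferedReader br = Files.newBufferedReader(path);
                         // keep the reader next to the value derived from it
                         return new AbstractMap.SimpleEntry<BufferedReader, Long>(br, br.lines().count());
                     } catch (IOException e) {
                         return new AbstractMap.SimpleEntry<BufferedReader, Long>(null, -1L);
                     }
                 })
                 .peek(entry -> {
                     try {
                         if (entry.getKey() != null) entry.getKey().close(); // close the carried reader
                     } catch (IOException e) {}
                 })
                 .map(Map.Entry::getValue)
                 .forEach(System.out::println);
    }
}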
I know this is an old question, but there is a very nice solution that I found here and I don't think it's on SO.
The java.nio.file package offers methods that let you do this cleanly:
filenames.map(Paths::get)
         // Nicer alternative to File::exists
         .filter(Files::exists)
         // This will automatically close the stream after each file is done reading
         .flatMap(path -> {
             try {
                 return Files.lines(path);
             } catch (IOException e) {
                 // Seamlessly handles error opening file, no need for filtering
                 return Stream.empty();
             }
         })
         .map(/* Do something with each line... */)
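For completeness, a self-contained variant of this pipeline with a terminal operation added (the file names and the toUpperCase step are only examples); flatMap closes each per-file stream, and with it the file handle, once that stream has been drained:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LinesPipelineSketch {
    public static void main(String[] args) {
        Stream<String> filenames = Stream.of("a.txt", "b.txt"); // example input
        List<String> shouted = filenames.map(Paths::get)
                                        .filter(Files::exists)
                                        .flatMap(path -> {
                                            try {
                                                return Files.lines(path); // closed by flatMap once drained
                                            } catch (IOException e) {
                                                return Stream.empty();
                                            }
                                        })
                                        .map(String::toUpperCase) // example per-line operation
                                        .collect(Collectors.toList());
        shouted.forEach(System.out::println);
    }
}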
Just for the sake of argument (though I agree with Louis above):
You can pass around the original Reader/InputStream (or any object, though the case you provided is actually flawed programming, because you could pass the FileReader around instead of encapsulating it in a BufferedReader) using the commons-lang3 Pair class. Jool is also a valid library that provides Tuple* classes.
Example:
filenames.map(File::new)
         .filter(File::exists)
         .map(f -> {
             BufferedReader br = null;
             FileReader fr = null;
             try {
                 fr = new FileReader(f);
                 br = new BufferedReader(fr);
                 return Optional.of(Pair.of(br, fr));
             } catch(Exception e) {}
             return Optional.empty();
         })
         .filter(Optional::isPresent)
         .map(Optional::get)
         .flatMap(pair -> {
             try {
                 // do something with pair.getLeft() (the BufferedReader) and return the Stream of items to flatten
             } finally {
                 try {
                     pair.getRight().close();
                 } catch (IOException x) {
                     throw new RuntimeException(x);
                 }
             }
         })