Say I want to parse a large file like this:
val iStream = MyFile::class.java
    .getResourceAsStream("largeFile.txt")
iStream.bufferedReader(Charsets.UTF_8).useLines { lines ->
    lines.filterNot { it.startsWith("#") }
        // parsing
        .toSet()
}
But if I split the large file into multiple smaller files, how do I chain the sequences?
For example:
val seq1 = MyFile::class.java.getResourceAsStream("file1.txt")
    .use { it.bufferedReader(Charsets.UTF_8).lineSequence() }
val seq2 = MyFile::class.java.getResourceAsStream("file2.txt")
    .use { it.bufferedReader(Charsets.UTF_8).lineSequence() }
sequenceOf(seq1, seq2).flatten()
    .filterNot { it.startsWith("#") }
    // parsing
    .toSet()
It throws java.io.IOException: Stream closed, which is reasonable, because the parsing happens outside the scope of the use block.
How can I solve this?
I know nesting would work (nested useLines calls), but I think that is ugly. Is there a flatter solution?
You could invert your logic. The important point is that everything has to be read and processed inside the use block; otherwise it will not work, as you already know.
One such inversion could look like:
setOf("file1.txt", "file2.txt")
    .map { MyFile::class.java.getResourceAsStream(it) }
    .flatMap {
        it.use {
            it.bufferedReader(Charsets.UTF_8)
                .lineSequence()
                .filterNot { it.startsWith("#") }
                .toSet()
        }
    }
Or, if you want to pass the transformation or filter chain in from outside, maybe something like:
val handleLine : (Sequence<String>) -> Sequence<String> = {
    it.filterNot { it.startsWith("#") }
    // .map { ... whatever }
}
setOf("file1.txt", "file2.txt")
    .map { MyFile::class.java.getResourceAsStream(it) }
    .flatMap {
        it.use {
            handleLine(it.bufferedReader(Charsets.UTF_8).lineSequence())
                .toSet()
        }
    }
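If building a set per file is not wanted, the same "stay inside use" idea can probably be kept lazy with the sequence { } builder. This is only a sketch, not something from the question: linesOfAll is a hypothetical helper name, and temp files stand in for the classpath resources so that it runs on its own.

```kotlin
import java.io.File

// Hypothetical helper: lazily yields the lines of each file in turn.
// A file is opened only when iteration reaches it, and useLines closes it
// once its lines have been consumed.
fun linesOfAll(files: List<File>): Sequence<String> = sequence {
    for (file in files) {
        file.useLines(Charsets.UTF_8) { lines ->
            yieldAll(lines)
        }
    }
}

fun main() {
    // Assumption: temp files replace file1.txt / file2.txt from the question.
    val file1 = File.createTempFile("file1", ".txt").apply {
        writeText("# comment\nalpha\nbeta\n")
        deleteOnExit()
    }
    val file2 = File.createTempFile("file2", ".txt").apply {
        writeText("beta\n# another comment\ngamma\n")
        deleteOnExit()
    }

    val result = linesOfAll(listOf(file1, file2))
        .filterNot { it.startsWith("#") }
        // parsing
        .toSet()

    println(result) // [alpha, beta, gamma]
}
```

One caveat with this variant: the current file is only closed when its lines are fully consumed, so a consumer that stops early (for example with first()) leaves that file open.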
The other alternative is to open the streams without use and close them yourself at the end, as @MarkoTopolnik also pointed out in the comments:
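// Use a List here, not a Sequence: a lazy sequence would re-run map (and open
// fresh streams) on every iteration, so the close calls below would miss the originals.
val inputStreams = listOf("file1.txt", "file2.txt")
    .map { MyFile::class.java.getResourceAsStream(it) }
inputStreams.asSequence()
    .flatMap { it.bufferedReader(Charsets.UTF_8).lineSequence() }
    .filterNot { it.startsWith("#") }
    .toSet()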
Then either use:
inputStreams.forEach(InputStream::close) // but this will fail on the first error...
or the "safe" way:
inputStreams.forEach { try { it.close() } catch (e: Exception) { e.printStackTrace() } }
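Note that neither variant closes anything if the parsing itself throws. A minimal sketch of wrapping the manual close in try/finally could look like the following; useAll is a hypothetical helper name, and in-memory streams stand in for the getResourceAsStream results so the example is self-contained.

```kotlin
import java.io.ByteArrayInputStream
import java.io.InputStream

// Hypothetical helper: runs block over the already opened streams and then
// tries to close every one of them, even if block or an earlier close() throws.
fun <R> useAll(streams: List<InputStream>, block: (List<InputStream>) -> R): R {
    try {
        return block(streams)
    } finally {
        streams.forEach { stream ->
            try {
                stream.close()
            } catch (e: Exception) {
                e.printStackTrace()
            }
        }
    }
}

fun main() {
    // Assumption: in-memory streams replace the classpath resources.
    val streams = listOf(
        "# comment\nalpha\nbeta\n",
        "beta\ngamma\n"
    ).map { ByteArrayInputStream(it.toByteArray(Charsets.UTF_8)) }

    val result = useAll(streams) { opened ->
        opened.asSequence()
            .flatMap { it.bufferedReader(Charsets.UTF_8).lineSequence() }
            .filterNot { it.startsWith("#") }
            .toSet() // materialize before the streams are closed
    }

    println(result) // [alpha, beta, gamma]
}
```

The terminal toSet() has to stay inside the block; returning the lazy sequence from it would bring back the Stream closed problem from the question.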