Kotlin to chain multiple sequences from different

Posted 2019-08-14 01:52

Question:

Say I want to parse a large file like this:

val iStream = MyFile::class.java
    .getResourceAsStream("largeFile.txt")

iStream.bufferedReader(Charsets.UTF_8).useLines { lines ->
    lines.filterNot { it.startsWith("#") }
    // parsing
    .toSet()
}

But if I split the large file into multiple smaller files, how do I chain the sequences?

For example:

val seq1 = MyFile::class.java.getResourceAsStream("file1.txt")
    .use { it.bufferedReader(Charsets.UTF_8).lineSequence() }
val seq2 = MyFile::class.java.getResourceAsStream("file2.txt")
    .use { it.bufferedReader(Charsets.UTF_8).lineSequence() }

sequenceOf(seq1, seq2).flatten()
  .filterNot { it.startsWith("#") }
  // parsing
  .toSet()

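This throws java.io.IOException: Stream closed, which is reasonable: lineSequence() is lazy, so the lines are only read when toSet() runs, and by then the parsing is outside the scope of the use block that has already closed the stream.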
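To make the failure mode concrete, here is a minimal sketch that reproduces it without any files. It uses an in-memory StringReader as a stand-in for the resource streams (an assumption made for a self-contained run), and `consumeAfterClose` is a hypothetical name, not something from the question.

```kotlin
import java.io.IOException
import java.io.StringReader

// Returns the exception raised when the lazy sequence is consumed
// after `use` has already closed the underlying reader.
fun consumeAfterClose(): Throwable? {
    // `use` returns the (still unread) lazy sequence, then closes the reader.
    val lines = StringReader("a\nb").buffered().use { it.lineSequence() }
    // Only now is the first line requested, from a closed reader.
    return runCatching { lines.toList() }.exceptionOrNull()
}

fun main() {
    println(consumeAfterClose() is IOException) // true
}
```

Any terminal operation (toSet(), toList(), forEach) has to run before the use block ends, or the resource has to stay open until that operation has finished.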

How can I solve this problem?

I know there may be a nested solution (nesting useLines ...), but I think that is ugly. Is there a flatter solution?

Answer 1:

You could invert your logic. The important part is that everything is read and processed inside the use block; otherwise it will not work, as you already know.

One such inversion could look like this:

setOf("file1.txt", "file2.txt")
  .map { MyFile::class.java.getResourceAsStream(it) }
  .flatMap { stream ->
    stream.use {
      it.bufferedReader(Charsets.UTF_8)
        .lineSequence()
        .filterNot { line -> line.startsWith("#") }
        .toSet() // materialize the lines before use closes the stream
    }
  }

Or, if you want to pass the transformation or filter chain in from outside, maybe something like:

val handleLine : (Sequence<String>) -> Sequence<String> = {
  it.filterNot { it.startsWith("#") }
  // .map { ... whatever }
}
setOf("file1.txt", "file2.txt")
  .map { MyFile::class.java.getResourceAsStream(it) }
  .flatMap {
    it.use {
      handleLine(it.bufferedReader(Charsets.UTF_8).lineSequence())
        .toSet()
    }
  }
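The same idea can be wrapped in a small reusable function. The following is only a sketch: `parseAll` is a hypothetical helper name, and the sources are in-memory strings rather than classpath resources so that the example runs on its own. Each source is opened, transformed, and materialized inside useLines, so nothing lazy escapes the block.

```kotlin
import java.io.InputStream

// Hypothetical helper: opens each source in turn, applies `handleLine`
// while the reader is still open, and collects the results into one set.
fun parseAll(
    sources: List<() -> InputStream>,
    handleLine: (Sequence<String>) -> Sequence<String>
): Set<String> =
    sources.flatMapTo(LinkedHashSet()) { open ->
        open().bufferedReader(Charsets.UTF_8).useLines { lines ->
            handleLine(lines).toList() // materialize before the reader is closed
        }
    }

fun main() {
    // In-memory stand-ins for file1.txt and file2.txt (assumption).
    val sources: List<() -> InputStream> = listOf(
        { "# comment\na\nb".byteInputStream() },
        { "b\n# other\nc".byteInputStream() }
    )
    val result = parseAll(sources) { it.filterNot { line -> line.startsWith("#") } }
    println(result) // [a, b, c]
}
```

Passing suppliers instead of already opened streams means each stream is opened only when its turn comes and is closed before the next one is opened.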

The other alternative is to open the streams, omit use, and close them yourself at the end, as @MarkoTopolnik also pointed out in the comments:

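// A List, not a Sequence: a lazy Sequence would reopen the resources each time it is
// iterated, so the close calls below would act on fresh streams, not the ones read here.
val inputStreams = listOf("file1.txt", "file2.txt")
  .map { MyFile::class.java.getResourceAsStream(it) }

inputStreams.asSequence()
  .flatMap { it.bufferedReader(Charsets.UTF_8).lineSequence() }
  .filterNot { it.startsWith("#") }
  .toSet()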

Then close the streams with either:

inputStreams.forEach(InputStream::close) // but this will fail on the first error...

or the "safe" way:

inputStreams.forEach { try { it.close() } catch (e: Exception) { e.printStackTrace() } }
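Putting the second alternative together, a sketch could look like the following. It assumes in-memory streams in place of the resource files, keeps the streams in a List so that the very same instances are closed afterwards, and moves the closing into a finally block so it also runs when parsing fails. `closeAllQuietly` is a hypothetical helper name.

```kotlin
import java.io.Closeable
import java.io.InputStream

// Hypothetical helper: closes every resource and logs failures
// instead of stopping at the first one.
fun closeAllQuietly(resources: Iterable<Closeable>) {
    resources.forEach { resource ->
        try {
            resource.close()
        } catch (e: Exception) {
            e.printStackTrace()
        }
    }
}

fun main() {
    // In-memory stand-ins for file1.txt and file2.txt (assumption).
    val inputStreams: List<InputStream> = listOf(
        "# header\nx\ny".byteInputStream(),
        "y\nz".byteInputStream()
    )
    val parsed = try {
        inputStreams.asSequence()
            .flatMap { it.bufferedReader(Charsets.UTF_8).lineSequence() }
            .filterNot { it.startsWith("#") }
            .toSet()
    } finally {
        closeAllQuietly(inputStreams) // runs whether or not parsing succeeded
    }
    println(parsed) // [x, y, z]
}
```

The trade-off compared with the use-based variants is that all streams stay open at the same time until the whole chain has been consumed.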