How to unzip a zip file using scala?

2020-02-05 03:56发布

问题:

Basically I need to unzip a .zip file which contains a folder called modeled which in turn contains a number of excel files.

I have had some luck in finding code that was already written (ZipArchive) which is meant to unzip the zip file, but I cannot figure out why it throws an error message when I use it. The code for ZipArchive and the error message are listed below:

import java.io.{OutputStream, InputStream, File, FileOutputStream}
import java.util.zip.{ZipEntry, ZipFile}
import scala.collection.JavaConversions._

object ZipArchive {

  val BUFSIZE = 4096
  val buffer = new Array[Byte](BUFSIZE)

  def unZip(source: String, targetFolder: String) = {
    val zipFile = new ZipFile(source)

    unzipAllFile(zipFile.entries.toList, getZipEntryInputStream(zipFile)_, new File(targetFolder))
  }

  def getZipEntryInputStream(zipFile: ZipFile)(entry: ZipEntry) = zipFile.getInputStream(entry)

  def unzipAllFile(entryList: List[ZipEntry], inputGetter: (ZipEntry) => InputStream, targetFolder: File): Boolean = {

    entryList match {
      case entry :: entries =>

        if (entry.isDirectory)
          new File(targetFolder, entry.getName).mkdirs
        else
          saveFile(inputGetter(entry), new FileOutputStream(new File(targetFolder, entry.getName)))

        unzipAllFile(entries, inputGetter, targetFolder)
      case _ =>
        true
    }
  }

  def saveFile(fis: InputStream, fos: OutputStream) = {
    writeToFile(bufferReader(fis)_, fos)
    fis.close
    fos.close
  }

  def bufferReader(fis: InputStream)(buffer: Array[Byte]) = (fis.read(buffer), buffer)

  def writeToFile(reader: (Array[Byte]) => Tuple2[Int, Array[Byte]], fos: OutputStream): Boolean = {
    val (length, data) = reader(buffer)
    if (length >= 0) {
      fos.write(data, 0, length)
      writeToFile(reader, fos)
    } else
      true
  }
}

Error Message:

java.io.FileNotFoundException: src/test/resources/oepTemp/modeled/EQ_US_2_NULL_('CA')_ALL_ELT_IL_EQ_US.xlsx (No such file or directory), took 6.406 sec
[error]     at java.io.FileOutputStream.open(Native Method)
[error]     at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
[error]     at java.io.FileOutputStream.<init>(FileOutputStream.java:171)
[error]     at com.contract.testing.ZipArchive$.unzipAllFile(ZipArchive.scala:28)
[error]     at com.contract.testing.ZipArchive$.unZip(ZipArchive.scala:15)
[error]     at com.contract.testing.OepStepDefinitions$$anonfun$1.apply$mcZ$sp(OepStepDefinitions.scala:175)
[error]     at com.contract.testing.OepStepDefinitions$$anonfun$1.apply(OepStepDefinitions.scala:150)
[error]     at com.contract.testing.OepStepDefinitions$$anonfun$1.apply(OepStepDefinitions.scala:150)
[error]     at cucumber.api.scala.ScalaDsl$StepBody$$anonfun$apply$1.applyOrElse(ScalaDsl.scala:61)
[error]     at cucumber.api.scala.ScalaDsl$StepBody$$anonfun$apply$1.applyOrElse(ScalaDsl.scala:61)
[error]     at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
[error]     at cucumber.runtime.scala.ScalaStepDefinition.execute(ScalaStepDefinition.scala:71)
[error]     at cucumber.runtime.StepDefinitionMatch.runStep(StepDefinitionMatch.java:37)
[error]     at cucumber.runtime.Runtime.runStep(Runtime.java:298)
[error]     at cucumber.runtime.model.StepContainer.runStep(StepContainer.java:44)
[error]     at cucumber.runtime.model.StepContainer.runSteps(StepContainer.java:39)
[error]     at cucumber.runtime.model.CucumberScenario.run(CucumberScenario.java:48)
[error]     at cucumber.runtime.junit.ExecutionUnitRunner.run(ExecutionUnitRunner.java:91)
[error]     at cucumber.runtime.junit.FeatureRunner.runChild(FeatureRunner.java:63)
[error]     at cucumber.runtime.junit.FeatureRunner.runChild(FeatureRunner.java:18)
[error]     ...

So based on the error message it looks like it's trying to find the exported excel file? This part completely throws me off. Any help would be greatly appreciated. I've added below how I'm calling the method, perhaps I'm doing something silly. Also I'm up for using a different way to extract my zip file if you can recommend one.

val tempDirectoryDir = "src/test/resources/oepTemp/"
ZipArchive.unZip(tempDirectoryDir + "Sub Region Input - Output.zip", tempDirectoryDir)

回答1:

Well since were are using some utilities from java, here is a version basen on this, translated to scala, maybe this should be more functional, but it is useful

package zip

import java.io.{ IOException, FileOutputStream, FileInputStream, File }
import java.util.zip.{ ZipEntry, ZipInputStream }

/**
 * Created by anquegi on 04/06/15.
 */
object Unzip extends App {

  val INPUT_ZIP_FILE: String = "src/main/resources/my-zip.zip";
  val OUTPUT_FOLDER: String = "src/main/resources/my-zip";

  def unZipIt(zipFile: String, outputFolder: String): Unit = {

    val buffer = new Array[Byte](1024)

    try {

      //output directory
      val folder = new File(OUTPUT_FOLDER);
      if (!folder.exists()) {
        folder.mkdir();
      }

      //zip file content
      val zis: ZipInputStream = new ZipInputStream(new FileInputStream(zipFile));
      //get the zipped file list entry
      var ze: ZipEntry = zis.getNextEntry();

      while (ze != null) {

        val fileName = ze.getName();
        val newFile = new File(outputFolder + File.separator + fileName);

        System.out.println("file unzip : " + newFile.getAbsoluteFile());

        //create folders
        new File(newFile.getParent()).mkdirs();

        val fos = new FileOutputStream(newFile);

        var len: Int = zis.read(buffer);

        while (len > 0) {

          fos.write(buffer, 0, len)
          len = zis.read(buffer)
        }

        fos.close()
        ze = zis.getNextEntry()
      }

      zis.closeEntry()
      zis.close()

    } catch {
      case e: IOException => println("exception caught: " + e.getMessage)
    }

  }

  Unzip.unZipIt(INPUT_ZIP_FILE, OUTPUT_FOLDER)

}


回答2:

Here's a more functional and precise way doing this

import java.io.{FileInputStream, FileOutputStream}
import java.util.zip.ZipInputStream
val fis = new FileInputStream("htl.zip")
val zis = new ZipInputStream(fis)
Stream.continually(zis.getNextEntry).takeWhile(_ != null).foreach{ file =>
    val fout = new FileOutputStream(file.getName)
    val buffer = new Array[Byte](1024)
    Stream.continually(zis.read(buffer)).takeWhile(_ != -1).foreach(fout.write(buffer, 0, _))
}


回答3:

Trying to work with Tian-Liang's solution, I realized that its not working for zips with a directory structure. So I adopted it this way:

  import java.io.{FileOutputStream, InputStream}
  import java.nio.file.Path
  import java.util.zip.ZipInputStream

  def unzip(zipFile: InputStream, destination: Path): Unit = {
    val zis = new ZipInputStream(zipFile)

    Stream.continually(zis.getNextEntry).takeWhile(_ != null).foreach { file =>
      if (!file.isDirectory) {
        val outPath = destination.resolve(file.getName)
        val outPathParent = outPath.getParent
        if (!outPathParent.toFile.exists()) {
          outPathParent.toFile.mkdirs()
        }

        val outFile = outPath.toFile
        val out = new FileOutputStream(outFile)
        val buffer = new Array[Byte](4096)
        Stream.continually(zis.read(buffer)).takeWhile(_ != -1).foreach(out.write(buffer, 0, _))
      }
    }
  }


回答4:

import java.io.FileInputStream
import java.io.InputStream
import java.util.zip.ZipEntry
import java.util.zip.ZipInputStream
import scala.language.reflectiveCalls
import scala.util.Try

import org.apache.commons.io.IOUtils

def using[T <: { def close() }, U](resource: T)(block: T => U): U = {
  try {
    block(resource)
  } finally {
    if (resource != null) {
        resource.close()
    }
  }
}

def processZipFile(zipFile: ZipFile)(doStuff: ZipEntry => Unit) {
  using(new ZipInputStream(new FileInputStream(zipFile))) { zipInputStream =>
    val entries = Stream.continually(Try(zipInputStream.getNextEntry()).getOrElse(null))
        .takeWhile(_ != null) // while not EOF and not corrupted
        .foreach(doStuff)
        .force
  }
}


回答5:

Late to the game here, but I would take advantage of scala.collection.JavaConverters to get a for-loop over the zip file entries, and java.nio.Files to get simple copying and directory creation:

import java.nio.file.{Files, Path}
import java.util.zip.ZipFile
import scala.collection.JavaConverters._

def unzip(zipPath: Path, outputPath: Path): Unit = {
  val zipFile = new ZipFile(zipPath.toFile)
  for (entry <- zipFile.entries.asScala) {
    val path = outputPath.resolve(entry.getName)
    if (entry.isDirectory) {
      Files.createDirectories(path)
    } else {
      Files.createDirectories(path.getParent)
      Files.copy(zipFile.getInputStream(entry), path)
    }
  }
}


回答6:

I would merge the automatic calling of ZipFile.close() into Steve's answer. It is made by the using method which is present in Noel Yap's answer.

import java.nio.file.{Files, Path}
import java.util.zip.ZipFile
import scala.collection.JavaConverters._

def using[T <: {def close()}, U](resource: T)(block: T => U): U = {
  try {
    block(resource)
  } finally {
    if (resource != null) {
      resource.close()
    }
  }
}

def unzip(zipPath: Path, outputPath: Path): Unit = {
  using(new ZipFile(zipPath.toFile)) { zipFile =>
    for (entry <- zipFile.entries.asScala) {
      val path = outputPath.resolve(entry.getName)
      if (entry.isDirectory) {
        Files.createDirectories(path)
      } else {
        Files.createDirectories(path.getParent)
        Files.copy(zipFile.getInputStream(entry), path)
      }
    }
  }
}