
How to handle unzipping ZipFile with paths that are too long or duplicate

Posted 2020-03-12 02:53

Question:

When unzipping files in Windows, I'll occasionally have problems with paths

  1. that are too long for Windows (but okay in the original OS that created the file).
  2. that are "duplicate" due to case-insensitivity

Using DotNetZip, the ZipFile.Read(path) call throws whenever it reads a zip file with one of these problems, which means I can't even try to filter the bad entries out.

using (ZipFile zip = ZipFile.Read(path))
{
    ...
}

What is the best way to handle reading those sorts of files?

Updated:

Example zip from here: https://github.com/MonoReports/MonoReports/zipball/master

Duplicates (the names differ only by case):
https://github.com/MonoReports/MonoReports/tree/master/src/MonoReports.Model/DataSourceType.cs
https://github.com/MonoReports/MonoReports/tree/master/src/MonoReports.Model/DatasourceType.cs

Here is more detail on the exception:

Ionic.Zip.ZipException: Cannot read that as a ZipFile
---> System.ArgumentException: An item with the same key has already been added.
at System.ThrowHelper.ThrowArgumentException(ExceptionResource resource)
at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)
at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
at Ionic.Zip.ZipFile.ReadCentralDirectory(ZipFile zf)
at Ionic.Zip.ZipFile.ReadIntoInstance(ZipFile zf)

Resolution:

Based on @Cheeso's suggestion, I can read everything from the stream, thus avoiding both the duplicate and the path issues:

// Replaces: using (ZipFile zip = ZipFile.Read(path))
// ZipInputStream reads the entries one after another and builds no filename index,
// so duplicate names do not throw.
using (ZipInputStream stream = new ZipInputStream(path))
{
    ZipEntry e;
    while ((e = stream.GetNextEntry()) != null)
    {
        if (e.FileName.EndsWith(".cs", StringComparison.OrdinalIgnoreCase) ||
            e.FileName.EndsWith(".xaml", StringComparison.OrdinalIgnoreCase))
        {
            // Read the current entry straight from the zip stream.
            // The reader is deliberately not disposed: disposing it would also close the zip stream.
            var sr = new StreamReader(stream);
            CodeFiles.Add(new CodeFile() { Content = sr.ReadToEnd(), FileName = e.FileName });
        }
    }
}
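
If the entries are written to disk rather than kept in memory, the same two problems can come back at extraction time. Below is a rough sketch of a filter that could be called with each e.FileName inside the loop above. The class EntryFilter, its member names, and the default limit of 260 characters are assumptions made for illustration; none of them are part of DotNetZip, and the limit should be adjusted to the target environment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of DotNetZip.
public sealed class EntryFilter
{
    private readonly string _targetDir;
    private readonly int _maxPathLength;

    // Case-insensitive set, mirroring how Windows compares file names.
    private readonly HashSet<string> _seen =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    // maxPathLength is an assumed limit; 260 is the classic Windows MAX_PATH.
    public EntryFilter(string targetDir, int maxPathLength = 260)
    {
        _targetDir = targetDir;
        _maxPathLength = maxPathLength;
    }

    // Returns true if the entry is neither too long nor a case-insensitive duplicate.
    public bool ShouldExtract(string entryName)
    {
        // Simple estimate: target directory + one separator + entry name.
        int length = _targetDir.Length + 1 + entryName.Length;
        if (length >= _maxPathLength)
        {
            return false; // the extracted path would be too long
        }

        // HashSet.Add returns false when the name (ignoring case) was already seen.
        return _seen.Add(entryName);
    }
}
```

With this sketch, the first of two names that differ only by case is accepted and the second is skipped; whether skipping or renaming is the right policy depends on the use case.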

Answer 1:

Read it with ZipInputStream.

The ZipFile class keeps a collection that uses the filename as the index. Duplicate filenames break that model.

But you can use ZipInputStream to read your zip file. There is no collection or index in that case.
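
To illustrate the failure mode, here is a minimal sketch that uses only the .NET base class library. It assumes the index behaves like a case-insensitive dictionary, which is what the stack trace in the question suggests; DotNetZip's actual internals may differ. The class name IndexCollisionDemo and the method Collides are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of DotNetZip.
public static class IndexCollisionDemo
{
    // Returns true if adding the names to a case-insensitive index throws.
    public static bool Collides(params string[] names)
    {
        // Assumption: the index ignores case, as the stack trace suggests.
        var index = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        try
        {
            for (int i = 0; i < names.Length; i++)
            {
                index.Add(names[i], i); // throws ArgumentException on a duplicate key
            }
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

Calling Collides("src/DataSourceType.cs", "src/DatasourceType.cs") returns true, because the two names differ only by case. A sequential reader never builds such an index, so it has nothing to collide with.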



Answer 2:

For the PathTooLongException problem, I found that DotNetZip can't be used. Instead, I invoked the command-line version of 7-Zip, which works wonders.

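// Requires: using System; using System.Diagnostics; using System.IO;
public static void Extract(string zipPath, string extractPath)
{
    try
    {
        ProcessStartInfo processStartInfo = new ProcessStartInfo
        {
            WindowStyle = ProcessWindowStyle.Hidden,
            // Resolves 7za.exe relative to the current working directory.
            FileName = Path.GetFullPath(@"7za.exe"),
            // "x" extracts with full paths; -o sets the output directory (no space after -o).
            Arguments = "x \"" + zipPath + "\" -o\"" + extractPath + "\""
        };
        using (Process process = Process.Start(processStartInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                // Report the archive that failed, together with 7-Zip's exit code.
                Console.WriteLine("Error extracting {0} (exit code {1}).", zipPath, process.ExitCode);
            }
        }
    }
    catch (Exception e)
    {
        Console.WriteLine("Error extracting {0}: {1}", zipPath, e.Message);
        throw;
    }
}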