Detecting password-protected PPT and XLS documents

Posted 2019-05-05 00:48

Question:

I found this answer https://stackoverflow.com/a/14336292/1537195 which gave a good way to detect password protection for DOC and XLS files.

//Flagged with password
if (bytes.Skip(0x20c).Take(1).ToArray()[0] == 0x2f) return true; //XLS 2003
if (bytes.Skip(0x214).Take(1).ToArray()[0] == 0x2f) return true; //XLS 2005
if (bytes.Skip(0x20B).Take(1).ToArray()[0] == 0x13) return true; //DOC 2005

However, it does not seem to cover all XLS files, and I am also looking for a way to detect password-protected PPT files in the same manner. Does anyone know which bytes to look at for these file types?
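One practical note on the checks quoted above: `bytes.Skip(n).Take(1).ToArray()[0]` throws on any file shorter than the offset being probed. A minimal guarded sketch of the same three checks follows; the class and method names (`LegacyFlagCheck`, `HasByteAt`, `IsFlaggedWithPassword`) are hypothetical, and the offsets and values are simply the ones from the linked answer, so it shares that answer's gaps.

```csharp
using System;

// Hypothetical helper: wraps the fixed-offset checks from the linked answer
// with a bounds check so short files return false instead of throwing.
public static class LegacyFlagCheck
{
    // True only if the array is long enough and holds `expected` at `offset`.
    public static bool HasByteAt(byte[] bytes, int offset, byte expected)
    {
        return bytes != null && offset < bytes.Length && bytes[offset] == expected;
    }

    // Same offsets and values as the quoted snippet; not an exhaustive test.
    public static bool IsFlaggedWithPassword(byte[] bytes)
    {
        return HasByteAt(bytes, 0x20C, 0x2F)   // XLS 2003
            || HasByteAt(bytes, 0x214, 0x2F)   // XLS 2005
            || HasByteAt(bytes, 0x20B, 0x13);  // DOC 2005
    }
}
```

Calling `LegacyFlagCheck.IsFlaggedWithPassword(File.ReadAllBytes(path))` would then be safe for files of any size.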

Answer 1:

I saved a PowerPoint presentation as .ppt and .pptx with and without a password required for opening them, opened them in 7-Zip and came to the tentative conclusion that

  • .pptx files without a password always use a standard .zip file format
  • .ppt files are CompoundDocuments
  • .pptx files with a password are also CompoundDocuments
  • All passworded CompoundDocuments contain an entry named *Encrypt*

To get this code running, you need to install the NuGet package OpenMcdf. This is the first C# library that I could find for reading CompoundDocuments.

using OpenMcdf;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace _22916194
{
    //http://stackoverflow.com/questions/22916194/detecing-password-protected-ppt-and-xls-documents
    class Program
    {
        static void Main(string[] args)
        {
            foreach (var file in args.Where(File.Exists))
            {
                // Normalize case so that e.g. ".PPT" is also recognized
                switch (Path.GetExtension(file).ToLowerInvariant())
                {
                    case ".ppt":
                    case ".pptx":
                        Console.WriteLine($"* {file} " +  (HasPassword(file) ? "is " : "isn't ") + "passworded");
                        Console.WriteLine();
                        break;

                    default:
                        Console.WriteLine($" * Unknown file type: {file}");
                        break;
                }
            }

            Console.ReadLine();

        }

        private static bool HasPassword(string file)
        {
            try
            {
                using (var compoundFile = new CompoundFile(file))
                {
                    var entryNames = new List<string>();
                    compoundFile.RootStorage.VisitEntries(e => entryNames.Add(e.Name), false);

                    // As far as I can see, only passworded files contain an entry with a name containing "Encrypt"
                    foreach (var entryName in entryNames)
                    {
                        if (entryName.Contains("Encrypt"))
                            return true;
                    }
                    // No explicit Close() needed: the using block disposes the compound file
                }
            }
            catch (CFFileFormatException) {
                //This is probably a .zip file (=unprotected .pptx)
                return false;
            }
            return false;
        }
    }
}

You should be able to extend this code to handle other Office formats. The conclusions at the top should hold true, except that you need to look for some other data in the CompoundDocument than a filename containing *Encrypt* (I had a quick look at .doc files and it didn't seem to work exactly the same).
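
The container-format observations at the top of this answer can also be checked without any library, by looking at the first bytes of the file. A minimal sketch, assuming the well-known file signatures (`D0 CF 11 E0 A1 B1 1A E1` for a CompoundDocument, `PK\x03\x04` for an ordinary .zip archive); the class and member names (`OfficeContainerSniffer`, `Container`, `Detect`) are hypothetical and not part of OpenMcdf:

```csharp
using System;
using System.IO;

// Hypothetical helper: classifies a file by its leading signature bytes.
public static class OfficeContainerSniffer
{
    public enum Container { Unknown, Zip, CompoundFile }

    // Signature at the start of every Compound File Binary document.
    private static readonly byte[] CfbMagic =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Local file header signature that starts an ordinary .zip archive.
    private static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };

    // Classify from bytes already in memory.
    public static Container Detect(byte[] header)
    {
        if (StartsWith(header, CfbMagic)) return Container.CompoundFile;
        if (StartsWith(header, ZipMagic)) return Container.Zip;
        return Container.Unknown;
    }

    // Classify a file on disk by reading at most its first 8 bytes.
    public static Container Detect(string path)
    {
        var header = new byte[8];
        int total = 0;
        using (var stream = File.OpenRead(path))
        {
            int read;
            while (total < header.Length &&
                   (read = stream.Read(header, total, header.Length - total)) > 0)
            {
                total += read;
            }
        }
        Array.Resize(ref header, total);
        return Detect(header);
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data == null || data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Under the tentative conclusions above, a .pptx that reports `Zip` is unprotected, while a file that reports `CompoundFile` still needs the entry-name check in `HasPassword` (or a format-specific equivalent) to decide whether it is passworded.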