Read PST files from win32 or pypff

Posted 2020-07-27 08:35

Question:

I want to read PST files using Python. I've found two libraries: win32 (pywin32) and pypff.

With win32 we can create an Outlook object like this:

import win32com.client

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)

GetDefaultFolder(6) returns the Inbox folder, and I can then work with that folder's methods and attributes.
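The 6 passed to GetDefaultFolder is a value of Outlook's OlDefaultFolders enumeration. As a minimal sketch, the magic number can be replaced by a named constant. The class name, the member names, and the get_default_folder helper below are illustrative assumptions; only the numeric values come from the Outlook object model.

```python
from enum import IntEnum


class OlDefaultFolders(IntEnum):
    """A few values of Outlook's OlDefaultFolders enumeration."""
    DELETED_ITEMS = 3
    OUTBOX = 4
    SENT_MAIL = 5
    INBOX = 6
    CALENDAR = 9
    CONTACTS = 10


def get_default_folder(namespace, kind):
    # Hypothetical helper: `namespace` is the MAPI namespace object
    # returned by GetNamespace("MAPI").
    return namespace.GetDefaultFolder(int(kind))
```

With this, get_default_folder(outlook, OlDefaultFolders.CALENDAR) would return the default calendar of the Outlook profile, not a calendar stored in an arbitrary PST file.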

But what I want is to supply my own PST files for pywin32 (or any other library) to read. The code above only connects to my Outlook application.

With pypff I can use the code below to work with PST files:

import pypff
pst_file = pypff.file()
pst_file.open('test.pst')

root = pst_file.get_root_folder()

for folder in root.sub_folders:
    for sub in folder.sub_folders:
        for message in sub.sub_messages:
            print(message.get_plain_text_body())

But I want attributes such as the size of the message, and I would also like to access the calendars in the PST files, which pypff does not support (as far as I know).

Question

  1. How can I read PST files to get data like the size of the email, the types of attachments it has and the calendars?
  2. Is it possible? Is there a workaround in win32, pypff or any other library?

Answer 1:

This is something that I want to do for my own application. I was able to piece together a solution from these sources:

  1. https://gist.github.com/attibalazs/d4c0f9a1d21a0b24ff375690fbb9f9a7
  2. https://github.com/matthewproctor/OutlookAttachmentExtractor
  3. https://docs.microsoft.com/en-us/office/vba/api/outlook.namespace

The third link above gives additional details about the available attributes and the various item types. My solution still needs to connect to your Outlook application, but it should be transparent to the user, since the PST store is removed again in the finally clause of the try/except/finally block. I hope this helps you get on the right track!

import win32com.client

def find_pst_folder(outlook, pst_filepath):
    # Return the root folder of the store backed by pst_filepath, or None.
    for store in outlook.Stores:
        if store.IsDataFileStore and store.FilePath == pst_filepath:
            return store.GetRootFolder()
    return None

def enumerate_folders(folder):
    # Walk the folder tree depth-first and print the items of every folder.
    for child_folder in folder.Folders:
        enumerate_folders(child_folder)
    iterate_messages(folder)

def iterate_messages(folder):
    # These properties exist on mail items; other item types (for example
    # appointments) may lack some of them and raise an exception.
    for item in folder.Items:
        print("***************************************")
        print(item.SenderName)
        print(item.SenderEmailAddress)
        print(item.SentOn)
        print(item.To)
        print(item.CC)
        print(item.BCC)
        print(item.Subject)

        # Outlook collections are 1-based.
        for index in range(1, item.Attachments.Count + 1):
            print(item.Attachments.Item(index).FileName)

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")

pst = r"C:\Users\Joe\Your\PST\Path\example.pst"
outlook.AddStore(pst)  # mounts the PST file in the Outlook profile
pst_root = find_pst_folder(outlook, pst)
try:
    if pst_root is None:
        print("PST store not found: " + pst)
    else:
        enumerate_folders(pst_root)
except Exception as exc:
    print(exc)
finally:
    # Unmount the store again, but only if it was actually found.
    if pst_root is not None:
        outlook.RemoveStore(pst_root)
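
The code above prints sender and attachment names but not the message size, attachment types, or calendar entries that the question asks about. Here is a minimal sketch of how those could be read from the same Outlook items. It assumes the items expose the Outlook object model properties Class, Size, Subject, Start, End, Location and Attachments; the helper names attachment_info and describe_item are made up for illustration, and the attachment "type" is simply taken from the file extension. The functions only touch those properties, so they can be tried without Outlook by passing stand-in objects.

```python
import os

# Values from Outlook's OlObjectClass enumeration.
OL_APPOINTMENT = 26
OL_MAIL = 43


def attachment_info(item):
    """Return (file name, lower-cased extension, size in bytes) per attachment."""
    result = []
    attachments = item.Attachments
    # Outlook collections are 1-based.
    for index in range(1, attachments.Count + 1):
        attachment = attachments.Item(index)
        extension = os.path.splitext(attachment.FileName)[1].lower()
        result.append((attachment.FileName, extension, attachment.Size))
    return result


def describe_item(item):
    """Summarize a mail or calendar item as a dict; return None for other types."""
    if item.Class == OL_MAIL:
        return {
            "kind": "mail",
            "subject": item.Subject,
            "size": item.Size,  # size of the whole item in bytes
            "attachments": attachment_info(item),
        }
    if item.Class == OL_APPOINTMENT:
        return {
            "kind": "appointment",
            "subject": item.Subject,
            "start": item.Start,
            "end": item.End,
            "location": item.Location,
            "size": item.Size,
        }
    return None
```

Calling describe_item(item) inside the loop of iterate_messages, instead of the print calls, would give one summary per mail or appointment found while walking the PST folders, and None for any other item type. Some attachment kinds (embedded OLE objects, for instance) may not expose a usable FileName, so a real script would probably wrap that access in a try/except.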