Python gmail imap - get text of email body not in

Posted 2019-08-07 09:57

Question:

I've been trying to figure this out and have looked for a solution here on Stack Overflow and in other places, but I can't get it to work (not enough experience in Python, I guess), so please help:

I'm using the imaplib and email libraries in Python to get emails from my Gmail account. I can log in and find the mail I want, and I have implemented the script to handle multipart emails, but the body text returned by the get_payload method is a single string. I would like to get the body of the email as it was sent, with each line stored as a separate string in a list. Here is the relevant part of my code:

    mail = imaplib.IMAP4_SSL('imap.gmail.com', 993)
    mail.login('mymail@gmail.com', 'password')
    mail.select("inbox")
    date = (datetime.datetime.now() - datetime.timedelta(days=1)).strftime("%d-%b-%Y")
    result, data = mail.uid('search', 'UNSEEN', '(SENTSINCE {date} FROM "someone@gmail.com")'.format(date=date))
    latest_email_uid = data[0].split()[-1]
    result, data = mail.uid('fetch', latest_email_uid, '(RFC822)')
    raw_email = data[0][1]
    email_message = email.message_from_string(raw_email)
    text = ''
    if email_message.is_multipart():
        html = None
        for part in email_message.get_payload():
            if part.get_content_charset() is None:
                text = part.get_payload(decode=True)
                continue
            charset = part.get_content_charset()
            if part.get_content_type() == 'text/plain':
                text = unicode(part.get_payload(decode=True), str(charset), "ignore").encode('windows-1250', 'replace')
            if part.get_content_type() == 'text/html':
                html = unicode(part.get_payload(decode=True), str(charset), "ignore").encode('windows-1250', 'replace')
        if text is not None:
            text = text.strip()  # strip() returns a new string, so keep the result
        else:
            html = html.strip()
    else:
        text = unicode(email_message.get_payload(decode=True), email_message.get_content_charset(), 'ignore').encode('windows-1250', 'replace')
        text = text.strip()
    print text

Before this I have some more code, and the required libraries are imported at the top, so there is no need to check that. I've tried declaring text = [], and I've tried not calling strip() on text or html, but I just can't get it. Is there a simple way to get the text of the body as it was sent, with each line as its own string? I feel that it's simple, but I don't get it. Thanks in advance!
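
For illustration, here is a minimal Python 3 sketch of one way the goal described above might be approached. It is not taken from the original post. It assumes the raw message is available as bytes (as `data[0][1]` would be under Python 3), takes the first `text/plain` part, and splits the decoded text with `str.splitlines()`. The helper name `body_lines`, the sample message, and the UTF-8 fallback charset are all assumptions made for the example.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def body_lines(raw_bytes):
    """Return the first text/plain body of a raw message as a list of lines."""
    message = email.message_from_bytes(raw_bytes)
    # walk() yields the message itself and every sub-part, so the same loop
    # handles both multipart and single-part mail.
    for part in message.walk():
        if part.get_content_type() != 'text/plain':
            continue
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        charset = part.get_content_charset() or 'utf-8'  # assumed fallback
        text = payload.decode(charset, errors='replace')
        # splitlines() splits on \n and \r\n and drops the line terminators.
        return text.strip().splitlines()
    return []


# Hypothetical sample message standing in for the bytes fetched over IMAP.
sample = MIMEMultipart('alternative')
sample.attach(MIMEText('first line\nsecond line\n\nfourth line\n', 'plain', 'utf-8'))
sample.attach(MIMEText('<p>first line</p>', 'html', 'utf-8'))
raw_email = sample.as_bytes()

lines = body_lines(raw_email)
print(lines)  # ['first line', 'second line', '', 'fourth line']
```

Blank lines in the body come back as empty strings. If they are not wanted, something like `[line for line in lines if line]` would drop them.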