How to read the Content Type header and convert in

Posted 2020-08-15 12:05

Question:

I fetch emails over IMAP; the messages come from Gmail and Outlook.

Gmail encodes like this =?UTF-8?B?UmU6IM69zq3OvyDOtc68zrHOuc67IG5ldyBlbWFpbA==?= and Outlook encodes like this =?iso-8859-7?B?UmU6IOXr6+ft6er8IHN1YmplY3Q=?=

Unfortunately, I have not yet found a solution that turns this into readable text. Instead I am experimenting with:

mb_convert_encoding($body, "UTF-8", "UTF-8"); 

and

mb_convert_encoding($body, "UTF-8", "iso-8859-7");

but I am struggling to get this to work.

This is how I open the IMAP mailbox of my account (which contains many Gmail and Outlook messages):

$hostname = '{imappro.zoho.com:993/imap/ssl}INBOX';
$username = 'email@email.com';
$password = 'password';


/* try to connect */
$inbox = imap_open($hostname,$username ,$password) or die('Cannot connect to Zoho: ' . imap_last_error());

/* grab emails */
$emails = imap_search($inbox,'UNSEEN');

Any help?

Answer 1:

"Unfortunately, I have not yet found a solution that turns this into readable text."

Solution: your strings are Base64-encoded.

=?UTF-8?B?UmU6IM69zq3OvyDOtc68zrHOuc67IG5ldyBlbWFpbA==?=

echo base64_decode('UmU6IM69zq3OvyDOtc68zrHOuc67IG5ldyBlbWFpbA==');

prints "Re: νέο εμαιλ new email"

=?iso-8859-7?B?UmU6IOXr6+ft6er8IHN1YmplY3Q=?=

echo base64_decode('UmU6IOXr6+ft6er8IHN1YmplY3Q=');

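prints out "Re: ελληνικό subject", but as raw ISO-8859-7 bytes, so on a UTF-8 page the Greek word shows up garbled or missing and only "Re: subject" is readable.

The answer is to use base64_decode() together with the mb_convert_encoding() calls you already have.

You can recognize Base64-encoded text because it uses only the letters a-z and A-Z, the digits 0-9, and two other characters (usually + and /), and it is usually right-padded with =.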
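As a rough illustration of that rule, here is a minimal sketch of a check in PHP. The function name looks_like_base64() is made up for this example, and the check is only a heuristic: an ordinary word such as "test" also passes. In mail headers the reliable signal is the ?B? marker between the charset and the payload.

```php
<?php
// Heuristic only: true when $s uses just the Base64 alphabet, ends with at
// most two "=" pad characters, and has a length that is a multiple of 4.
// looks_like_base64() is a hypothetical helper, not a built-in function.
function looks_like_base64(string $s): bool
{
    return $s !== ''
        && strlen($s) % 4 === 0
        && preg_match('/^[A-Za-z0-9+\/]+={0,2}\z/', $s) === 1;
}

var_dump(looks_like_base64('UmU6IOXr6+ft6er8IHN1YmplY3Q=')); // bool(true)
var_dump(looks_like_base64('Re: subject'));                   // bool(false)
```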

EDIT:

Sorry, I forgot that the question was about converting from iso-8859-7 to UTF-8 so that the text displays correctly.

<?php
$str = base64_decode('UmU6IPP03evt+SDs3u317OE=');
$str = mb_convert_encoding($str,'UTF-8','iso-8859-7');
echo $str;
?>

The result is "Re: στέλνω μήνυμα"
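The two steps can be combined so that the charset is read from the header itself instead of being hard-coded. This is only a sketch: decode_b_encoded_word() is a name invented here, it handles a single Base64 ("B") encoded-word of the form =?charset?B?payload?= and nothing else, and it assumes the mbstring extension knows the charset named in the header.

```php
<?php
// Hypothetical helper: decodes one "=?charset?B?payload?=" encoded-word
// to UTF-8. Anything that does not match that shape is returned unchanged.
function decode_b_encoded_word(string $word): string
{
    if (!preg_match('/^=\?([^?]+)\?B\?([^?]*)\?=\z/i', $word, $m)) {
        return $word; // not a Base64 encoded-word
    }
    $bytes = base64_decode($m[2], true); // strict mode: false on invalid input
    if ($bytes === false) {
        return $word;
    }
    // $m[1] is the charset named in the header, e.g. "UTF-8" or "iso-8859-7".
    return mb_convert_encoding($bytes, 'UTF-8', $m[1]);
}

echo decode_b_encoded_word('=?UTF-8?B?UmU6IM69zq3OvyDOtc68zrHOuc67IG5ldyBlbWFpbA==?='), "\n";
// prints "Re: νέο εμαιλ new email"
echo decode_b_encoded_word('=?iso-8859-7?B?UmU6IOXr6+ft6er8IHN1YmplY3Q=?='), "\n";
// prints "Re: ελληνικό subject"
```

Real headers can mix several encoded-words with plain text and can use the "Q" encoding as well, so a library function is the safer choice for production code.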



Answer 2:

See this example (connecting to Gmail and listing messages):

   /* connect to gmail */
    $hostname = '{imap.gmail.com:993/imap/ssl}INBOX';
    $username = 'davidwalshblog@gmail.com';
    $password = 'davidwalsh';

    /* try to connect */
    $inbox = imap_open($hostname,$username,$password) or die('Cannot connect to Gmail: ' . imap_last_error());

    /* grab emails */
    $emails = imap_search($inbox,'ALL');

    /* if emails are returned, cycle through each... */
    if($emails) {

        /* begin output var */
        $output = '';

        /* put the newest emails on top */
        rsort($emails);

        /* for every email... */
        foreach($emails as $email_number) {

            /* get information specific to this email */
            $overview = imap_fetch_overview($inbox,$email_number,0);
            $message = imap_fetchbody($inbox,$email_number,2);

            /* output the email header information */
            $output.= '<div class="toggler '.($overview[0]->seen ? 'read' : 'unread').'">';
            $output.= '<span class="subject">'.$overview[0]->subject.'</span> ';
            $output.= '<span class="from">'.$overview[0]->from.'</span>';
            $output.= '<span class="date">on '.$overview[0]->date.'</span>';
            $output.= '</div>';

            /* output the email body */
            $output.= '<div class="body">'.$message.'</div>';
        }

        echo $output;
    } 

    /* close the connection */
    imap_close($inbox);

For reading and decoding, see this example:

<?php
$hostname = '{********:993/imap/ssl}INBOX';
$username = '*********';
$password = '******';

$inbox = imap_open($hostname,$username,$password) or die('Cannot connect to server: ' . imap_last_error());

$emails = imap_search($inbox,'ALL');

if($emails) {
    $output = '';
    rsort($emails);

    foreach($emails as $email_number) {
        $overview = imap_fetch_overview($inbox,$email_number,0);
        $structure = imap_fetchstructure($inbox, $email_number);

        /* defaults, so $message and $encoding are always defined for single-part messages */
        $encoding = isset($structure->encoding) ? $structure->encoding : 0;
        $message = '';

        if(isset($structure->parts) && is_array($structure->parts) && isset($structure->parts[1])) {
            $part = $structure->parts[1];
            $encoding = $part->encoding;
            $message = imap_fetchbody($inbox,$email_number,2);

            /* transfer encodings: 3 = base64, 4 = quoted-printable; 7bit, 8bit and binary need no decoding */
            if($encoding == 3) {
                $message = imap_base64($message);
            } else if($encoding == 4) {
                $message = imap_qprint($message);
            }
        }

        /* imap_utf8() decodes MIME-encoded headers to UTF-8; utf8_decode() is not used because it would convert to ISO-8859-1 and break Greek text */
        $output.= '<div class="toggler '.($overview[0]->seen ? 'read' : 'unread').'">';
        $output.= '<span class="from">From: '.imap_utf8($overview[0]->from).'</span>';
        $output.= '<span class="date">on '.imap_utf8($overview[0]->date).'</span>';
        $output.= '<br /><span class="subject">Subject('.$encoding.'): '.imap_utf8($overview[0]->subject).'</span> ';
        $output.= '</div>';

        $output.= '<div class="body">'.$message.'</div><hr />';
    }

    echo $output;
}

imap_close($inbox);
?>

Look here for a great tutorial on email structure and a function to extract it.



Answer 3:

If you want to decode header elements, there is a PHP function for that: imap_mime_header_decode().

Also, you will need some MIME parser class to decode multipart messages.
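For the header part, a minimal sketch of how imap_mime_header_decode() could be used is shown below. The wrapper name mime_header_to_utf8() is made up for this example; it assumes the imap and mbstring extensions are loaded and that unencoded pieces (reported with the charset "default") are plain ASCII.

```php
<?php
// Hypothetical helper: joins the pieces returned by imap_mime_header_decode()
// into a single UTF-8 string.
function mime_header_to_utf8(string $header): string
{
    $parts = imap_mime_header_decode($header);
    if ($parts === false) {
        return $header; // decoding failed; fall back to the raw header
    }
    $out = '';
    foreach ($parts as $part) {
        if (strtolower($part->charset) === 'default') {
            // Unencoded piece: assumed to be plain ASCII, copied as-is.
            $out .= $part->text;
        } else {
            $out .= mb_convert_encoding($part->text, 'UTF-8', $part->charset);
        }
    }
    return $out;
}

// The guard only keeps this demo from failing where ext-imap is missing.
if (function_exists('imap_mime_header_decode')) {
    echo mime_header_to_utf8('=?iso-8859-7?B?UmU6IOXr6+ft6er8IHN1YmplY3Q=?='), "\n";
}
```

PHP also ships imap_utf8(), which decodes MIME-encoded text to UTF-8 in one call; the loop above is mainly useful when you want to see or handle each charset yourself.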



Answer 4:

To get the headers, pass your stream ($inbox) to imap_headers(), which returns a summary line for every message in the mailbox. For a single message, imap_headerinfo() returns an object with many more values; its documentation has the full list.

For the actual messages, plain text can be read using imap_body(), passing the stream and the number of the message you want (in $emails after your search). Getting an html/multipart email is a bit trickier. First you need imap_fetchstructure(), which identifies the parts of the message, then imap_fetchbody() to get the piece you are interested in.

Once you have a result from imap_fetchbody(), if you still need to adjust the encoding, it could be done at this point.
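As a sketch of that last step, the helpers below undo the transfer encoding and then convert to UTF-8. Both function names are invented for this example, the UTF-8 fallback charset is an assumption, and the $part object is built by hand here so the snippet runs without a mailbox; in real code it would be one entry of $structure->parts and $raw would come from imap_fetchbody().

```php
<?php
// Hypothetical helpers, not part of the imap extension.

// Returns the charset parameter of a part object shaped like the ones
// from imap_fetchstructure(), or $default (an assumption) when none is set.
function part_charset(object $part, string $default = 'UTF-8'): string
{
    if (!empty($part->ifparameters)) {
        foreach ($part->parameters as $param) {
            if (strtolower($param->attribute) === 'charset') {
                return $param->value;
            }
        }
    }
    return $default;
}

// Undoes the transfer encoding reported in $part->encoding
// (3 = Base64, 4 = quoted-printable), then converts the text to UTF-8.
function body_to_utf8(string $raw, int $encoding, string $charset): string
{
    if ($encoding === 3) {
        $raw = base64_decode($raw);
    } elseif ($encoding === 4) {
        $raw = quoted_printable_decode($raw);
    }
    return mb_convert_encoding($raw, 'UTF-8', $charset);
}

// Hand-built stand-in for one entry of $structure->parts.
$part = (object) [
    'encoding'     => 3,
    'ifparameters' => 1,
    'parameters'   => [(object) ['attribute' => 'charset', 'value' => 'iso-8859-7']],
];
$raw = 'UmU6IPP03evt+SDs3u317OE='; // would come from imap_fetchbody()

echo body_to_utf8($raw, $part->encoding, part_charset($part)), "\n";
// prints "Re: στέλνω μήνυμα"
```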



Answer 5:

My task was to receive emails from a particular mailbox, parse them, and index certain content.

I wanted a small microservice that would provide me with the data. It had to:

  1. Download the required content
  2. Convert the received data into a readable format
  3. Process the content

So I decided to use ready-made tools:

  1. imap2maildir, a script for fetching emails
  2. mu, a Unix client for processing messages
  3. dos2unix, a line-ending converter

Next, I wrote a small bash script and scheduled it with cron:

#!/bin/bash
python /var/mail_dump/imap2maildir/imap2maildir -c /var/mail_dump/imap2maildir/deploy.conf
mu index --maildir=/var/mail_dump/dumps/new
#clean old data
rm -rf /var/mail_dump/extract/*

#search match messages
mu find jivo --fields="l" --nocolor | xargs $1 cp -t /var/mail_dump/extract
#converting
dos2unix -f /var/mail_dump/extract/*

#reassembly of messages in html
cd /var/mail_dump/extract/
for i in /var/mail_dump/extract/*
do
  mu extract --parts=0 --overwrite "$i"
  rm "$i"
done

Done! I now have a service that continuously receives emails and prepares them for processing. PHP works with the prepared data without having to deal with the low-level details.