Reading a Gmail Message with ruby-gmail

Posted 2019-02-27 08:24

Question:

I am looking for an instance method from the ruby-gmail gem that would allow me to read either:

  • the body or

  • subject

of a Gmail message.

After reviewing the gem's documentation, I couldn't find anything.

There is a .message instance method in the Gmail::Message class section, but for the body it only returns, for lack of a better term, email "mumbo-jumbo."

My attempt:

#!/usr/local/bin/ruby
require 'gmail'

gmail = Gmail.connect('username', 'password')

emails = gmail.inbox.emails(:from => 'someone@mail.com')

emails.each do |email|
  email.read
  email.message
end

Now:

  1. email.read does not work
  2. email.message returns the "mumbo-jumbo" mentioned above

Somebody else asked this question on SO but didn't get an answer.

Answer 1:

This probably isn't exactly the answer to your question, but I will tell you what I have done in the past. I tried using the ruby-gmail gem but it didn't do what I wanted it to do in terms of reading a message. Or, at least, I couldn't get it to work. Instead I use the built-in Net::IMAP class to log in and get a message.

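require 'net/imap'
require 'mail' # the mail gem provides Mail.read_from_string

imap = Net::IMAP.new('imap.gmail.com', port: 993, ssl: true)
imap.login('<username>', '<password>')
imap.select('INBOX')
# imap.search returns an array of message sequence numbers; take the first match
# (this stands in for my own search_mail helper, which is not shown here)
subject_id = imap.search(['SUBJECT', '<mail_subject>']).first
subject_message = imap.fetch(subject_id, 'RFC822')[0].attr['RFC822']
mail = Mail.read_from_string(subject_message)
body_message = mail.html_part.body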

From here your message is stored in body_message and is HTML. If you want the entire email body you will probably need to learn how to use Nokogiri to parse it. If you just want a small bit of the message where you know some of the surrounding characters you can use a regex to find the part you are interested in.
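As a rough illustration of the regex approach, assuming the HTML contains a known marker around the value you want (the sample HTML and the "Your code is" marker below are made up for this sketch; in practice the input would be body_message.to_s):

```ruby
# Hypothetical HTML body; in practice this would come from body_message.to_s
html = '<html><body><p>Hello,</p><p>Your code is <b>483920</b>.</p></body></html>'

# Capture whatever sits inside the <b> tag that follows the known marker text
match = html.match(/Your code is <b>([^<]+)<\/b>/)

# match is nil when the marker is not found, so guard before indexing
code = match && match[1]

puts code
```

This only works when the surrounding characters are stable; for anything less predictable, a real HTML parser is the safer route.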

I did find one page associated with the ruby-gmail gem that talks about using ruby-gmail to read a Gmail message. I made a cursory attempt at testing it tonight but apparently Google upped the security on my account and I couldn't get in using irb without tinkering with my Gmail configuration (according to the warning email I received). So I was unable to verify what is stated on that page, but as I mentioned my past attempts were unfruitful whereas Net::IMAP works for me.

EDIT: I found an html2text method on another page, which is pretty cool. You will need to add in

require 'cgi'

to your class.

I was able to implement it in this way. After I have my body_message, call the html2text method from that linked page (which I modified slightly and included below since you have to convert body_message to a string):

plain_text = html2text(body_message)
puts plain_text #Prints nicely formatted plain text to the terminal

Here is the slightly modified method:

def html2text(html)
  text = html.to_s.
    gsub(/(&nbsp;|\n|\s)+/im, ' ').squeeze(' ').strip.
    gsub(/<([^\s]+)[^>]*(src|href)=\s*(.?)([^>\s]*)\3[^>]*>\4<\/\1>/i, '\4')

  links = []
  linkregex = /<[^>]*(src|href)=\s*(.?)([^>\s]*)\2[^>]*>\s*/i
  while linkregex.match(text)
    links << $~[3]
    text.sub!(linkregex, "[#{links.size}]")
  end

  text = CGI.unescapeHTML(
    text.
      gsub(/<(script|style)[^>]*>.*<\/\1>/im, '').
      gsub(/<!--.*-->/m, '').
      gsub(/<hr(| [^>]*)>/i, "___\n").
      gsub(/<li(| [^>]*)>/i, "\n* ").
      gsub(/<blockquote(| [^>]*)>/i, '> ').
      gsub(/<(br)(| [^>]*)>/i, "\n").
      gsub(/<(\/h[\d]+|p)(| [^>]*)>/i, "\n\n").
      gsub(/<[^>]*>/, '')
  ).lstrip.gsub(/\n[ ]+/, "\n") + "\n"

  # Append the collected links as numbered footnotes
  links.each_with_index do |link, i|
    text += "\n  [#{i + 1}] <#{CGI.unescapeHTML(link)}>" unless link.nil?
  end
  links = nil
  text
end

You also mentioned in your original question that you got mumbo-jumbo with this step:

email.message *returns mumbo-jumbo*

If the mumbo-jumbo is HTML, you can probably just use your existing code with this html2text method instead of switching over to Net::IMAP as I had discussed when I posted my original answer.



Answer 2:

Never mind, it's:

email.subject
email.body

Silly me.

OK, so how do I get the body as "readable" text, without all the encoding stuff and HTML?



Answer 3:

Subject, text body and HTML body:

email.subject

if email.message.multipart?
  text_body = email.message.text_part.body.decoded
  html_body = email.message.html_part.body.decoded
else
  # Only multipart messages contain a separate HTML body,
  # so fall back to the single body for both
  text_body = email.message.body.decoded
  html_body = text_body
end
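
For context on what .decoded is undoing: a mail body is usually sent with a Content-Transfer-Encoding such as quoted-printable or base64, which is probably the "encoding stuff" visible in the raw message. The sketch below reproduces that decoding with plain Ruby string methods on made-up sample strings. It is only an illustration of the idea, not how the mail gem is implemented.

```ruby
# Made-up sample bodies, as they might appear inside a raw message
quoted_printable = 'Caf=C3=A9 menu: 2 =3D 1 + 1'
base64 = 'SGVsbG8gZnJvbSBHbWFpbA=='

# 'M' decodes quoted-printable and 'm' decodes base64; both return
# binary strings, so tag them as UTF-8 afterwards (assumed charset)
qp_text = quoted_printable.unpack1('M').force_encoding('UTF-8')
b64_text = base64.unpack1('m').force_encoding('UTF-8')

puts qp_text
puts b64_text
```

In real messages the charset comes from the Content-Type header, so assuming UTF-8 as above will not always be right.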

Attachments:

email.message.attachments.each do |attachment|
  path = "/tmp/#{attachment.filename}"
  # Attachments are binary data, so write them in binary mode
  File.binwrite(path, attachment.decoded)

  # The MIME type might be useful
  content_type = attachment.mime_type
end
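
One caveat with building the path straight from attachment.filename: the name comes from the sender, so it could contain directory components. A minimal sketch of one way to guard against that, where safe_attachment_path is a hypothetical helper name and not part of ruby-gmail or the mail gem:

```ruby
require 'tmpdir'

# Hypothetical helper: build a save path from an untrusted attachment filename
def safe_attachment_path(filename, dir = Dir.tmpdir)
  # File.basename drops any directory components such as "../../"
  base = File.basename(filename.to_s)
  # Fall back to a fixed name if nothing usable is left
  base = 'attachment' if base.empty? || base == '.' || base == '..'
  File.join(dir, base)
end

puts safe_attachment_path('../../etc/passwd', '/tmp')
puts safe_attachment_path('report.pdf', '/tmp')
```

The returned path could then be used in place of the interpolated "/tmp/..." string when writing the file.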


Answer 4:

require 'gmail'

gmail = Gmail.connect('username', 'password')
emails = gmail.inbox.emails(:from => 'someone@mail.com')
emails.each do |email|
  puts email.subject
  puts email.text_part.body.decoded
end