
I am trying to parse the body of a forwarded email with the following Python code:

import imapclient
import os
import pprint
import pyzmail
import email

#my email info
EMAIL_ADRESS = os.environ.get('DB_USER')
EMAIL_PASSWORD = os.environ.get('PYTHON_PASS')

#login to my email
imap0bj =  imapclient.IMAPClient('imap.gmail.com', ssl = True)
imap0bj.login(EMAIL_ADRESS, EMAIL_PASSWORD )
print("ok")


pprint.pprint(imap0bj.list_folders())
#Selecting my Inbox
imap0bj.select_folder('INBOX', readonly = True)

#Getting UIDs from Inbox
UIDs = imap0bj.search(['SUBJECT', 'Contact FB Applicant', 'ON', '16-Oct-2020'])
print(UIDs)


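rawMessages = imap0bj.fetch(UIDs, ['BODY[]'])
# 9999 stands in for one of the UIDs returned by the search above
message = pyzmail.PyzMessage.factory(rawMessages[9999][b'BODY[]'])

# Make sure the message has a plain-text part before decoding it
assert message.text_part is not None
# Body of the email returned as a string
msg = message.text_part.get_payload().decode(message.text_part.charset)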

print(msg)

imap0bj.logout()

This code outputs a string similar to this:

   ---------- Forwarded message ---------
    From: Someone <Mail@mail.biz>
    Date: Wed, Oct 14, 2020 at 1:23 PM
    Subject: Fwd:  Contact FB Applicant
    To: <mail@mail.com>
    
    
    
    
   ---------- Forwarded message ---------
    From: Someone <Mail@mail.biz>
    Date: Wed, Oct 14, 2020 at 1:23 PM
    Subject: Fwd:  Contact FB Applicant
    To: <mail@mail.com>
    
    
    The following applicant filled out the form via Facebook.  Contact
    immediately.
    
    Some Guy
    999999999999
    mail@mail.com

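But I don't want the "Forwarded message" parts. I only want the text from "The following applicant ..." onward, followed by the information I care about. How do I get rid of the rest? I would really appreciate the help. Thanks!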


2 Answers


You can use io.StringIO.

Here is how you would use it.

from io import StringIO

# your code goes here
...
...

msg = message.text_part.get_payload().decode(message.text_part.charset)

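sio = StringIO(msg)

# Move the stream position to the first occurrence of the marker text
sio.seek(msg.index('The following applicant'))

# Each line keeps its trailing newline, so suppress the one print() adds
for line in sio:
  print(line, end='')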

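How it works:

StringIO lets you treat a string as a stream (a file). StringIO.seek moves the stream position to a given offset (0 is the start of the stream). str.index returns the position of the first occurrence of a substring within a string. Putting it together: you move the stream position to the first occurrence of the text you want, then read from the stream.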
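
The steps above can be tried on a string shaped like the output in the question. This is only a minimal sketch: `SAMPLE` and the helper name `text_from_marker` are made up for illustration, and `str.find` is swapped in for `str.index` so that a missing marker returns the whole text instead of raising `ValueError`.

```python
from io import StringIO

# Hypothetical sample, shaped like the output shown in the question.
SAMPLE = (
    "---------- Forwarded message ---------\n"
    "From: Someone <Mail@mail.biz>\n"
    "Subject: Fwd:  Contact FB Applicant\n"
    "\n"
    "The following applicant filled out the form via Facebook.  Contact\n"
    "immediately.\n"
    "\n"
    "Some Guy\n"
)


def text_from_marker(msg, marker="The following applicant"):
    """Return msg starting at the first occurrence of marker.

    Illustrative helper (not part of any library). If the marker is
    absent, the text is returned unchanged.
    """
    start = msg.find(marker)  # -1 when the marker is not present
    if start == -1:
        return msg
    sio = StringIO(msg)
    sio.seek(start)           # jump to the marker
    return sio.read()         # read from there to the end


print(text_from_marker(SAMPLE))
```

Plain slicing, `msg[start:]`, gives the same result; the stream form is mainly useful if you then want to iterate line by line.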

answered 2020-10-17T04:41:43.940

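Judging from this format, you need to read the text line by line. When you hit a line that starts with '---' (that is, line[:3] == '---'), ignore it and the lines after it until you read an empty line. If the next non-empty line starts with '---' again, repeat the process. The first non-empty line after that should be "The following applicant ...".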

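You can put this logic in a loop and break out of it once the real content starts. In Python, with msg being the decoded body from the question:

lines = iter(msg.splitlines())

for line in lines:
  stripped = line.strip()
  if stripped.startswith('---'):
    # skip the forwarded-header block up to the next empty line
    for line in lines:
      if not line.strip():
        break
  elif stripped:
    # first non-empty line that is not part of a header block
    break

# read lines and print everything from here
print(line)
for line in lines:
  print(line)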

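This assumes that whatever reads the lines keeps track of how many lines it has already read and which line comes next, which is what an iterator over the lines does.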
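
The line-by-line approach described above could be packaged as a function that returns the cleaned text instead of printing it. This is a sketch under assumptions: the name `strip_forwarded_headers` and the sample string are hypothetical, and it treats every leading block that starts with '---' and ends at a blank line as a forwarded header.

```python
def strip_forwarded_headers(text):
    """Drop leading '--- Forwarded message ---' blocks and blank lines.

    Illustrative helper: a header block is assumed to start with a line
    beginning with '---' and to end at the next blank line.
    """
    lines = iter(text.splitlines())
    body = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("---"):
            # Consume the header block up to and including the blank line.
            for header_line in lines:
                if not header_line.strip():
                    break
        elif stripped:
            # First real content line: keep it and everything after it.
            body.append(line)
            body.extend(lines)  # exhausts the iterator, ending the loop
    return "\n".join(body)


# Hypothetical input with two nested forwarded headers.
sample = (
    "---------- Forwarded message ---------\n"
    "From: A\n"
    "\n"
    "\n"
    "---------- Forwarded message ---------\n"
    "From: B\n"
    "\n"
    "The following applicant filled out the form.\n"
    "\n"
    "Some Guy\n"
    "999999999999\n"
)
print(strip_forwarded_headers(sample))
```

Because the inner and outer loops share one iterator, the position in the text is tracked automatically and no line is read twice.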

answered 2020-10-17T04:44:50.427