
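I'm reading the Python 3 docs here and I must be blind or something... Where does it say how to get the body of a message?

What I want to do is open a message and loop over its text-based bodies, skipping binary attachments. Pseudocode: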

def read_all_bodies(local_email_file):
    email = Parser().parse(open(local_email_file, 'r'))
    for pseudo_body in email.pseudo_bodies:
        if pseudo_body.pseudo_is_binary():
            continue
        # Pseudo-parse the body here

How do I do that? Is the Message class even the right class for this? Isn't it only for headers?


1 Answer


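This is best done with two functions:

  1. The first opens the file. If the message is single-part, get_payload on the message returns a string. If the message is multipart, it returns a list of sub-messages.
  2. The second handles the text/payload.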
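
To make the two return shapes of get_payload concrete, here is a minimal sketch. The sample messages, the attachment name blob.bin, and the variable names are made up for illustration; they are built in memory with EmailMessage so that nothing depends on a file on disk.

```python
from email.message import EmailMessage
from email.parser import Parser

# Hypothetical single-part message: just a text body
single = EmailMessage()
single["Subject"] = "plain"
single.set_content("hello")

# Hypothetical multipart message: a text body plus a binary attachment
multi = EmailMessage()
multi["Subject"] = "with attachment"
multi.set_content("see attached")
multi.add_attachment(b"\x00\x01\x02", maintype="application",
                     subtype="octet-stream", filename="blob.bin")

# Serialize and parse again, as if the text had been read from a file
single_parsed = Parser().parsestr(single.as_string())
multi_parsed = Parser().parsestr(multi.as_string())

# Single-part: the payload is one string
single_payload = single_parsed.get_payload()
# Multipart: the payload is a list of sub-messages
multi_payload = multi_parsed.get_payload()

print(type(single_payload).__name__, single_parsed.is_multipart())
print(type(multi_payload).__name__, multi_parsed.is_multipart())
for part in multi_payload:
    print(part.get_content_type())
```

Each sub-message is itself a Message object, so the same calls (is_multipart, get_payload, get_content_type) work on it.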

Here is how it can be done:

from email.parser import Parser

def parse_file_bodies(filename):
    # Open the file and parse the email
    with open(filename, 'r') as f:
        email = Parser().parse(f)
    # For multipart emails, handle each body in a loop
    if email.is_multipart():
        for msg in email.get_payload():
            parse_single_body(msg)
    else:
        # A single-part message is passed directly
        parse_single_body(email)

def parse_single_body(email):
    payload = email.get_payload(decode=True)
    # decode=True returns bytes, or None if this part is itself multipart
    if payload is None:
        return
    # The payload is bytes. It must be converted to a Python string
    # according to the input charset, which may vary by message
    try:
        text = payload.decode("utf-8")
        # Now you can work with text as with any other string:
        ...
    except UnicodeDecodeError:
        print("Error: cannot parse message as UTF-8")
        return
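
The question also asks to skip binary attachments, and the code above assumes UTF-8. One possible variation, sketched here under assumptions rather than as a definitive implementation, walks every part with walk(), keeps only text/* parts, and decodes with the charset each part declares. The helper name iter_text_bodies, the sample message, and the UTF-8 fallback are all hypothetical choices made for this example.

```python
from email import message_from_string
from email.message import EmailMessage

def iter_text_bodies(msg):
    """Yield the decoded text of every text/* part, skipping everything else."""
    for part in msg.walk():
        # Multipart containers have no body of their own
        if part.is_multipart():
            continue
        # Skip non-text parts such as images or archives
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        if payload is None:
            continue
        # Use the declared charset; falling back to UTF-8 is an assumption
        charset = part.get_content_charset() or "utf-8"
        try:
            yield payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name: fall back to UTF-8 as well
            yield payload.decode("utf-8", errors="replace")

# Hypothetical sample: a Latin-1 text body plus a binary attachment
sample = EmailMessage()
sample.set_content("caf\u00e9", charset="iso-8859-1", cte="quoted-printable")
sample.add_attachment(b"\x00\x01\x02", maintype="application",
                      subtype="octet-stream", filename="blob.bin")

parsed = message_from_string(sample.as_string())
bodies = list(iter_text_bodies(parsed))
for body in bodies:
    print(body.strip())
```

Because walk() descends into nested multipart containers, this also covers messages whose text sits inside a multipart/alternative part within a multipart/mixed message. Note that text/html parts count as text here; filter on get_content_type() if only text/plain is wanted.
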
Answered 2016-11-29T23:37:23.223