python - Script to automatically create dovecot.sieve rules for emails in an inbox subdirectory

Question

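After browsing this great site and using its solutions for a while, it is finally time for me to participate.

I have a very clear idea of what I want, but I am looking for the best way to get there.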

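What do I want?:

For a while now I have been running an email server setup on a Raspberry Pi, and so far it has worked great. It consists of a dovecot server and some sieve filters that sort the mail for my many email addresses into separate inbox subdirectories. There is also a spam filter, which learns the difference between ham and spam every night via a script. (Basically it is told that spam is in the Junk folder and that every other folder contains ham.)

I want to replicate this behavior for a dedicated "Newsletter" folder. This folder does not contain urgent messages that need to be looked at or reported right away.

The plan is to move emails into the "Newsletter" folder manually and have a script scan that folder once a day. If it finds an email from an address that has no sieve rule yet, it should create a rule that automatically files mail from that address into the "Newsletter" folder on arrival.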

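Implementation steps?:

For this, the script needs to scan the existing .dovecot.sieve file and extract the addresses in the "Newsletter" folder rule into a separate file or object for comparison.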

/*Example of a sieve filter:*/

require "fileinto";

 /* Global Spam Filter */
if anyof (header :contains "subject" "*SPAM*",
          header :contains "X-Spam-Flag" "YES" ) {
  fileinto "Junk";
  stop;
}

/* LAN Emails Filter */
  elsif address :is "to" "lan@docbrown.pi" {
  fileinto "INBOX.Lokal";
  stop;
}

/* Newsletter Filter */
  elsif anyof (address :is "from" "newsletter@example.com",
               address :is "from" "news@yahoo.de",
               address :is "from" "info@mailbox.de",
               address :is "from" "something@somewhere.de") {
  fileinto "INBOX.Newsletter";
  stop;
}

 /* gmail Account Filter */
  elsif address :is "to" "docbrown@gmail.com" {
  fileinto "INBOX.gmail";
  stop;
}

 /* Yahoo Account Filter */
  elsif address :is "to" "docbrown@yahoo.de" {
  fileinto "INBOX.yahoo";
  stop;
}

  else {
  # The rest goes into INBOX
  # default is "implicit keep", we do it explicitly here
  keep;
}
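
As a rough illustration of this extraction step, here is a minimal Python 3 sketch. It assumes the rule layout shown above: the rule starts at a comment containing "Newsletter Filter" and ends at the line containing "INBOX.Newsletter". The function name `newsletter_addresses` and the sample text are made up for the example.

```python
import re

# Matches one sieve test of the form: address :is "from" "someone@example.com"
FROM_TEST = re.compile(r'address\s+:is\s+"from"\s+"([^"]+)"')


def newsletter_addresses(sieve_text):
    """Return the "from" addresses listed in the Newsletter rule (hypothetical helper)."""
    addresses = []
    in_rule = False
    for line in sieve_text.splitlines():
        if "Newsletter Filter" in line:
            # the comment line marks the start of the rule
            in_rule = True
            continue
        if in_rule:
            if "INBOX.Newsletter" in line:
                # the fileinto line marks the end of the address list
                break
            addresses.extend(FROM_TEST.findall(line))
    return addresses


sample = '''\
/* Newsletter Filter */
  elsif anyof (address :is "from" "newsletter@example.com",
               address :is "from" "news@yahoo.de") {
  fileinto "INBOX.Newsletter";
  stop;
}
/* gmail Account Filter */
  elsif address :is "to" "docbrown@gmail.com" {
'''
print(newsletter_addresses(sample))
```

Matching on the quoted `"from"` test rather than on word positions keeps the sketch independent of how far each line is indented.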

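Then it needs to process all emails in the maildir directory of the "Newsletter" folder and search each one for the "From:" field and the email address in angle brackets: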
```
Date: Mon, 4 Nov 2013 16:38:30 +0100 (CET)
From: Johannes Ebert - Redaktion c't <infoservice@heise.de> 
To: docbrown@example.de
```
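Each address found is then compared with the addresses extracted from the sieve file. If there is no filter rule for it (i.e. it is not found in the list), a rule is created for it (or it is simply added to the extracted addresses).
Once all emails have been processed, a new rule set for the "Newsletter" folder is created from the extracted_email_addresses file.
The existing dovecot.sieve is replaced with the new one (the old one is copied beforehand, just in case).
Maybe dovecot also needs to be restarted afterwards so that it reads the new rules?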
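
The header scan and the comparison could be sketched roughly as follows in Python 3, using the standard library's `email` package instead of splitting on angle brackets by hand. The helper names `sender_address` and `unknown_senders` are hypothetical, and comparing addresses case-insensitively is an assumption of this sketch, not something the setup above requires.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_address(raw_message):
    """Return the bare From: address in lower case, or None if there is none."""
    msg = message_from_string(raw_message)
    # parseaddr splits 'Display Name <user@host>' into (name, address)
    _, addr = parseaddr(msg.get("From", ""))
    return addr.lower() or None


def unknown_senders(raw_messages, known):
    """Return sender addresses not contained in `known`, in first-seen order."""
    seen = {a.lower() for a in known}
    new = []
    for raw in raw_messages:
        addr = sender_address(raw)
        if addr and addr not in seen:
            seen.add(addr)
            new.append(addr)
    return new


mail = (
    "Date: Mon, 4 Nov 2013 16:38:30 +0100 (CET)\n"
    "From: Johannes Ebert - Redaktion c't <infoservice@heise.de>\n"
    "To: docbrown@example.de\n"
    "\n"
    "body text\n"
)
print(unknown_senders([mail, mail], ["news@yahoo.de"]))
```

Parsing the message properly means only the real header is inspected, so a line in the body that happens to contain "From: " is ignored.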

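Progress so far:

I tried to get this working using just bash commands and utilities. That got me close to the point where I can almost extract the email addresses from the dovecot.sieve file, but it is very complicated for me and takes a lot of time.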

#!/bin/sh

cp /home/mailman/.dovecot.sieve "/home/mailman/autosieve/dovecot.sieve_$(date +backup_%d%m%Y)"
#echo "" > search.txt

# get rule start line number; cut keeps only the number, not the full grep output
X=$(grep -n "Newsletter Filter" /home/mailman/.dovecot.sieve | cut -d: -f1)
# get rule end line number
Y=$(grep -n "INBOX.Newsletter" /home/mailman/.dovecot.sieve | cut -d: -f1)
X=$((X + 1))  # increment to go into the next line
Y=$((Y - 1))  # decrement to go into the previous line
# copy lines into separate search file
sed -n "${X},${Y}p" /home/mailman/.dovecot.sieve > /home/mailman/search.txt
# filter addresses and export to separate file
awk -F '"' '{ if ($2 != "") print $4 }' /home/mailman/search.txt > /home/mailman/adressen.txt

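So I am wondering whether I could get there more easily with Python. I tinkered with it in another Raspberry Pi project, but I have not had the time to immerse myself fully in the Python world.

So I would be glad to get some help, advice, or a pointer in the right direction here.

So far I have found some solutions to similar problems (for the first part, where the addresses need to be extracted), but I could not quite adapt them, or I made some mistakes, because I cannot get the script to run.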

#!/usr/bin/python

file = open("dovecot.sieve", "r")

rule = {}
current_rule = None

for line in file:
    line = line.split()

    if (line[2] == "INBOX.Newsletter"):
        break
    if (line[1] == "/* Newsletter Filter */"):
        current_rule = rule.setdefault('Newsletter', [])
        continue
    if (line[5] == "from"):
        current_rule.append(line[6])
        continue
    if (line[3] == "from"):
        current_rule.append(line[4])
        continue


file.close()

# Now print out all the data
import pprint
print "whole array"
print "=============================="
pprint.pprint(rule)
print 
print "addresses found"
print "=========================="
pprint.pprint(rule['Newsletter'])

Can anyone also recommend a Python IDE with a debugger and so on? Eclipse comes to mind, or is there something else (perhaps less resource-hungry)?

score 0 · Accepted Answer

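OK, so I had some spare time to solve my own problem. I did some digging, read a few code snippets, and tested it in Eclipse with PyDev.

I now run this script as a nightly cron job.

What does it do?

It collects the email addresses in the dovecot.sieve file (specifically, those in the "Newsletter" rule set). It then looks for unregistered email addresses in the INBOX.Newsletter folder by comparing the senders found there with the collected addresses. If it finds new addresses, it saves a copy of the old sieve file and then rewrites the existing file. The new email addresses are inserted into the "Newsletter" rule set, so mail from them is filed into the designated Newsletter folder.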

#!/usr/bin/env python3

import datetime
import fileinput
import os
import shutil
import sys

SIEVE_FILE = "/home/postman/.dovecot.sieve"
NEWSLETTER_DIR = "/home/postman/Mails/.INBOX.Newsletter"

# Get the already configured email senders...
addresses = []
in_newsletter_rule = False

with open(SIEVE_FILE, "r") as sieveconf:
    for line in sieveconf:
        # the fileinto line marks the end of the address list
        if "INBOX.Newsletter" in line:
            break

        if "Newsletter Filter" in line:
            in_newsletter_rule = True
            continue

        if in_newsletter_rule and "from" in line:
            parts = line.split('"')
            # expected layout: ... address :is "from" "someone@example.com" ...
            if len(parts) > 4 and parts[1] == "from":
                addresses.append(parts[3])

# save the count for later
addr_num = len(addresses)
if addr_num < 2:
    # the insertion below anchors on the second-to-last existing address
    sys.exit("Newsletter rule needs at least two addresses")

# iterate all files in all sub-directories of INBOX.Newsletter
for root, _, files in os.walk(NEWSLETTER_DIR):
    # for each file in the current directory
    for mailfile in files:
        with open(os.path.join(root, mailfile), "r", errors="replace") as mail:
            # scan line by line until the first From: header
            for line in mail:
                if line.startswith("From: "):
                    # extract the address between the angle brackets, if any
                    if "<" in line and ">" in line:
                        found = line.split("<")[1].split(">")[0]
                        # add the address unless it is already known
                        if found not in addresses:
                            addresses.append(found)
                    break

# Now print out all the data
#print("addresses found:", addresses)
#print("orig_nmbr_of_addresses:", addr_num)
#print("not_recorded_addresses:", len(addresses) - addr_num)

# Compare if the address count has changed
if addr_num == len(addresses):
    # exit the script since no new addresses have been found
    sys.exit(0)

# copy original sieve file for backup
backupfilename = os.path.join(os.path.dirname(SIEVE_FILE),
                              ".backup_%s.sieve" % datetime.date.today())
shutil.copyfile(SIEVE_FILE, backupfilename)

# edit the existing sieve file in place and add the new entries
anchor = '"from" "%s"' % addresses[addr_num - 2]
for line in fileinput.input(SIEVE_FILE, inplace=True):
    # print every line unchanged
    print(line, end="")
    # after the second-to-last existing entry, add the new rules
    # (this avoids extra handling for the last line, which ends differently)
    if anchor in line:
        for addr in addresses[addr_num:]:
            print('               address :is "from" "%s",' % addr)
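
The in-place edit anchors on the second-to-last existing address, so it only works when the rule already lists at least two senders. As a possible variant (a sketch only, not what the script above does), the new tests could be placed in front of the first existing one, directly after `anyof (`. The function name `add_newsletter_senders` is hypothetical, and the sketch assumes the rule uses `anyof (` and that addresses contain no double quotes.

```python
def add_newsletter_senders(sieve_text, new_addresses):
    """Return sieve_text with new "from" tests added to the Newsletter rule."""
    if not new_addresses:
        return sieve_text
    out = []
    in_rule = False
    done = False
    for line in sieve_text.splitlines(keepends=True):
        if "Newsletter Filter" in line:
            in_rule = True
        elif in_rule and not done and "anyof (" in line:
            # split the opening line into 'elsif anyof (' and the first old test
            head, rest = line.split("anyof (", 1)
            opener = head + "anyof ("
            # continuation lines are aligned under the first test
            indent = " " * len(opener)
            tests = ['address :is "from" "%s",\n' % a for a in new_addresses]
            line = opener + indent.join(tests) + indent + rest
            done = True
        out.append(line)
    return "".join(out)


old = (
    "/* Newsletter Filter */\n"
    '  elsif anyof (address :is "from" "old@example.com") {\n'
    '  fileinto "INBOX.Newsletter";\n'
)
print(add_newsletter_senders(old, ["new@example.org"]))
```

Because the function works on a string and returns a new one, it can be tested without touching the live sieve file; writing the result back and keeping the backup stay separate steps.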
