I needed to accomplish a similar task today, so I wrote a gist that does it, based on some research I did. I hope someone finds it useful!
import os

# Clear the terminal ('cls' on Windows, 'clear' everywhere else).
os.system('cls' if os.name == 'nt' else 'clear')

oldfile = input('{*} Enter the file (with extension) you would like to strip domains from: ')
newfile = input('{*} Enter the name of the file (with extension) you would like me to save: ')

# Any line containing one of these substrings is dropped from the output.
emailDomains = ['windstream.net', 'mail.com', 'google.com', 'web.de', 'email', 'yandex.ru', 'ymail', 'mail.eu', 'mail.bg', 'comcast.net', 'yahoo', 'Yahoo', 'gmail', 'Gmail', 'GMAIL', 'hotmail', 'comcast', 'bellsouth.net', 'verizon.net', 'att.net', 'roadrunner.com', 'charter.net', 'mail.ru', '@live', 'icloud', '@aol', 'facebook', 'outlook', 'myspace', 'rocketmail']

print("\n[*] This script will remove records that contain the following strings: \n\n", emailDomains)
input("\n[!] Press Enter to start...\n")

linecounter = 0
with open(oldfile) as oFile, open(newfile, 'w') as nFile:
    for line in oFile:
        # Keep the line only if none of the listed strings appears in it.
        if not any(domain in line for domain in emailDomains):
            nFile.write(line)
            linecounter = linecounter + 1
            print('[*] - {%s} Writing verified record to %s ---{ %s' % (linecounter, newfile, line.rstrip('\n')))

print('[*] === COMPLETE === [*]')
print('[*] %s was saved' % newfile)
print('[*] There are %s records in your saved file.' % linecounter)
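In case it helps anyone adapting this, here is a rough sketch of the filtering step on its own, pulled into a function so it can be tried without reading or writing files. The name `strip_domains` and the sample records are made up for illustration and are not part of the gist. Note one deliberate difference: this variant lowercases both sides before comparing, so a single `'gmail'` entry also covers `'Gmail'` and `'GMAIL'`. The script above compares case-sensitively.

```python
def strip_domains(lines, blocked):
    """Return the lines that contain none of the blocked substrings.

    Hypothetical helper for illustration; matching is a plain substring
    test, done case-insensitively (unlike the script above).
    """
    lowered = [b.lower() for b in blocked]
    return [
        line for line in lines
        if not any(b in line.lower() for b in lowered)
    ]


# Made-up sample records, not real data.
records = [
    "alice@gmail.com,Alice\n",
    "bob@corp.example,Bob\n",
    "carol@YAHOO.com,Carol\n",
]

kept = strip_domains(records, ["gmail", "yahoo"])
print(kept)
```

Because this is still a substring test, a short entry such as `'email'` will also drop any record that merely contains that word somewhere on the line, so it is worth keeping the list entries as specific as you can.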
Link to the Gist: emailStripper.py
Best, Az