
I have an email template whose content is in HTML format.

Now I want to find the zip code in the email's HTML content.

For that I have used a regex to search for the zip code. The content comes in two formats. Format 1:

helllo this is the mail  which will converted in the lead 
and here is some addresss  which will not be used..

and the 
zip: 364001
city: New york

Format 2:

<p><b>Name</b></p><br/>
fname
<p><b>Last Name</b></p><br/>
lname
<p><b>PLZ</b></p><br/>
71392
<p><b>mail</b></p><br/>
heliconia72@mail.com

The code looks like this:

regex = r'(?P<zip>Zip:\s*\d\d\d\d\d\d)'
zip_match = re.search(regex, mail_content) # find zip
zip_match.groups()[0]

This only searches for the `Zip:` label from Format 1. How can I write a regex that works for both formats?


1 Answer


If you really need to use a regex for this (I would probably use BeautifulSoup for the second format), you can use something like this:

import re

# Format 1 has a 6-digit code and Format 2 a 5-digit one, so accept 5 or 6 digits
regex = r'(?:zip:\s*|PLZ</b></p><br/>\n)(\d{5,6})'
zip_match = re.search(regex, mail_content)
zip_match.group(1)
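
To make the idea above easier to try out, here is a minimal, self-contained sketch run against both sample formats from the question. The helper name `find_zip`, the `ZIP_PATTERN` constant, the case-insensitive flag, and the 5-or-6-digit assumption are illustrative choices, not part of the original answer; real emails may need a looser pattern.

```python
import re

# Assumption: the code follows either a "zip:" label or the "PLZ" HTML
# heading from the question's samples, and is 5 or 6 digits long.
ZIP_PATTERN = re.compile(
    r'(?:zip:\s*|PLZ</b></p><br/>\s*)(\d{5,6})(?!\d)',
    re.IGNORECASE,  # also accepts "Zip:" or "ZIP:"
)

def find_zip(mail_content):
    """Return the first zip/PLZ code as a string, or None if nothing matches."""
    match = ZIP_PATTERN.search(mail_content)
    return match.group(1) if match else None

# Sample snippets adapted from the two formats in the question
format_1 = "and the\nzip: 364001\ncity: New york\n"
format_2 = "<p><b>PLZ</b></p><br/>\n71392\n<p><b>mail</b></p><br/>\n"

print(find_zip(format_1))
print(find_zip(format_2))
print(find_zip("no code here"))
```

Checking for `None` before reading the group avoids the `AttributeError` that `re.search(...).groups()` raises when the mail contains neither format.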
answered 2013-08-23T08:08:16.697