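I have the following code. It scrapes all the data I need, but my concern is getting the data from each iteration written on a single line.
Here is my code: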
import bs4 as bs
import urllib2
import re

# Fetch the first page of the members directory (Python 2: urllib2)
page = urllib2.urlopen("http://www.codissia.com/member/members-directory/?mode=paging&Keyword=&Type=&pg=1")
content = page.read()
soup = bs.BeautifulSoup(content)
# Match every div whose class is members_box1 or members_box2
eachbox = soup.find_all('div', {'class': re.compile(r'members_box[12]')})
for eachuniversity in eachbox:
    # Collect the non-empty text fragments of this div and print them comma-separated
    data = [re.sub(r'\s+', '', text).strip().encode('utf8') for text in eachuniversity.find_all(text=True) if text.strip()]
    print(','.join(data))
Update
I want each iteration to produce output like this (on a single line):
Name:,Mr.Srinivasan.N,Designation:,Proprietor,CODISSIA - Designation:,(Past President, CODISSIA),Name of the Industry:,Arian Soap Manufacturing Co,Specification:,LIFE,Date of Admission:,19.12.1969, "Parijaat" 26/1Shanker Mutt Road, Basavana Gudi,Phone:,2313861
But I get the following instead:
Name:,Mr.Srinivasan.N,Designation:,Proprietor,CODISSIA - Designation:,(Past President, CODISSIA),Name of the Industry:,Arian Soap Manufacturing Co,Specification:,LIFE,Date of Admission:,19.12.1969
"Parijaat" 26/1Shanker Mutt Road, Basavana Gudi,Phone:,2313861