python - Python csv/excel: convert a single column into multiple rows

Question

I have a list like the one shown below, and I need to convert it into multiple rows in excel or csv format.

<tr>
<th>Name</th>
<th>Address1</th>
<th>City</th>
<th>State</th>
<th>Zip</th>
</tr>

<tr>
<th>John</th>
<th>111 Michigan</th>
<th>Chicago </th>
<th>IL</th>
<th>60661</th>
</tr>

Expected result:

Name   Address1       City   State  Zip
John  111 Michigan   Chicago  IL    60661

score 0 · Accepted Answer

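I would probably use the pandas library for this. You can turn your table into a DataFrame (sort of like an Excel sheet), although we have to add the <table> tags because they are missing from your text: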

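import pandas as pd

# Read the raw rows; the file holds <tr> elements but no enclosing <table>.
with open("name.html") as fp:
    text = fp.read()

# read_html returns a list of DataFrames, one per table found; take the first.
# infer_types=False was accepted by the pandas release this was written
# against; later pandas releases dropped the argument.
df = pd.read_html("<table>" + text + "</table>", infer_types=False)[0]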

This gives us

>>> df
      0             1        2      3      4
0  Name      Address1     City  State    Zip
1  John  111 Michigan  Chicago     IL  60661

We can save it as a csv file:

>>> df.to_csv("out.csv", sep="|", index=False, header=False)

giving

Name|Address1|City|State|Zip
John|111 Michigan|Chicago|IL|60661

or save it directly as an Excel file:

>>> df.to_excel("out.xlsx")

pandas is my go-to tool for data processing.
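In the frame above the column names sit in row 0, which is why the csv call passes header=False. A minimal sketch of promoting that row to a real header instead. The frame is rebuilt by hand from the output shown above so the example does not depend on read_html, and the names raw, buffer and csv_text are my own, not part of the answer:

```python
import io

import pandas as pd

# Rows as they appear in the DataFrame above: row 0 holds the column names.
raw = pd.DataFrame([
    ["Name", "Address1", "City", "State", "Zip"],
    ["John", "111 Michigan", "Chicago", "IL", "60661"],
])

# Keep the data rows, then use the first row's values as the column labels.
df = raw.iloc[1:].reset_index(drop=True)
df.columns = raw.iloc[0].tolist()

# Write to an in-memory buffer here; pass a file name to write to disk.
buffer = io.StringIO()
df.to_csv(buffer, sep="|", index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

With a real header in place, to_csv and to_excel write the column names themselves, so header=False is no longer needed.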

score 0 · Accepted Answer


Use Beautiful Soup to parse the HTML and print the column values for each row.

answered 2013-10-22T21:27:43.270

score 0 · Accepted Answer

I tried using BeautifulSoup4, but I only get the first row as my result. The remaining rows come out blank.

from bs4 import BeautifulSoup

soup = BeautifulSoup(open("CofATX.txt"))
table = soup.find('table')

rows = table.findAll('tr')

for tr in rows:
    cols = tr.findAll('th')
for th in cols:
    text = ''.join(th.text.strip())
    print text + "|",
print

The result I get is Name | Address1 | City | State | Zip, and the remaining rows are blank.
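For comparison, a minimal Python 3 sketch of the same loop with the cell loop nested inside the row loop, so that every row's cells are collected. It does not try to diagnose CofATX.txt itself, which is not shown here: the rows are copied from the question into an inline string, and rows_from_html is a hypothetical helper name. It looks up <tr> elements directly rather than going through soup.find('table'), so it does not rely on an enclosing <table> tag being present:

```python
from bs4 import BeautifulSoup

# Sample rows copied from the question; the real input file is not shown.
SAMPLE_HTML = """
<tr><th>Name</th><th>Address1</th><th>City</th><th>State</th><th>Zip</th></tr>
<tr><th>John</th><th>111 Michigan</th><th>Chicago </th><th>IL</th><th>60661</th></tr>
"""


def rows_from_html(html):
    """Return one list of stripped cell texts per <tr> (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        # The cell loop runs inside the row loop, so each row is processed.
        cells = tr.find_all(["th", "td"])
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


for row in rows_from_html(SAMPLE_HTML):
    print("|".join(row))
```

Joining the cells with "|".join avoids the trailing separator that printing text + "|" for each cell leaves at the end of every line.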
