
First of all, I should say that I am fairly new to coding; my knowledge of Python and JavaScript is superficial.

I have a huge txt file containing names and team names, with the data structured as follows:

Name1, Surname1  Team1
                  Team2
                  Team3
Name2, Surname2  Team2
                  Team4
Name3, Surname3  Team1
                  Team5

Ideally, I would like to search my data by Team# and get back the names of the people who belong to it.

For example, I need the members of Team1 and Team2. My new txt output should look like this:

Team1, Name1, Surname1, Name3, Surname3
Team2, Name1, Surname1, Name2, Surname2

Thank you very much for your help.


1 Answer


A Python version could look like this:

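import io
from collections import defaultdict

# Sample input held in memory for demonstration.
fobj_in = io.StringIO("""Name1, Surname1  Team1
                  Team2
                  Team3
Name2, Surname2  Team2
                  Team4
Name3, Surname3  Team1
                  Team5""")

fobj_out = io.StringIO()

# Maps each team name to the list of people on that team.
teams = defaultdict(list)

for line in fobj_in:
    items = line.split()
    if not items:
        continue  # skip blank lines
    if len(items) == 3:
        # Full line: name, surname and the person's first team.
        name = items[:2]
        team = items[2]
    else:
        # Continuation line: another team for the most recent person.
        team = items[0]
    teams[team].append(name)

# Write one line per team, names separated by commas.
for team_name in sorted(teams.keys()):
    fobj_out.write(team_name + ', ')
    for name in teams[team_name][:-1]:
        fobj_out.write('{} {}, '.format(name[0], name[1]))
    name = teams[team_name][-1]
    fobj_out.write('{} {}\n'.format(name[0], name[1]))

fobj_out.seek(0)
print(fobj_out.read())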

Output:

Team1, Name1, Surname1, Name3, Surname3
Team2, Name1, Surname1, Name2, Surname2
Team3, Name1, Surname1
Team4, Name2, Surname2
Team5, Name3, Surname3
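
The question asks only for Team1 and Team2, whereas the code above writes every team. Restricting the result could look roughly like the sketch below, which assumes the same whitespace-separated layout as the sample. The function name `members_by_team` and its `wanted` parameter are made up for illustration.

```python
import io
from collections import defaultdict


def members_by_team(lines, wanted):
    """Map each team in `wanted` to the 'Name, Surname' strings on it."""
    teams = defaultdict(list)
    name = None
    for line in lines:
        items = line.split()
        if not items:
            continue  # skip blank lines
        if len(items) == 3:
            # Full line: "Name, Surname  Team" starts a new person.
            name = ' '.join(items[:2])
            team = items[2]
        else:
            # Continuation line: a further team for the last person seen.
            team = items[0]
        if team in wanted and name is not None:
            teams[team].append(name)
    return dict(teams)


sample = io.StringIO("""Name1, Surname1  Team1
                  Team2
                  Team3
Name2, Surname2  Team2
                  Team4
Name3, Surname3  Team1
                  Team5""")

result = members_by_team(sample, wanted={'Team1', 'Team2'})
for team_name in sorted(result):
    print(', '.join([team_name] + result[team_name]))
```

With the sample input this prints only the Team1 and Team2 lines, in the format requested in the question.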

To read from and write to actual files, just open them in place of the `io.StringIO` objects:

fobj_in = open('in_file.txt')
fobj_out = open('out_file.txt', 'w')
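
The two `open` calls above never close the files. One possible variant wraps the whole conversion in a function and uses `with` blocks so both files are closed automatically. This is only a sketch: the function name `write_team_lists` is hypothetical, and it assumes the whitespace-separated layout from the question.

```python
from collections import defaultdict


def write_team_lists(in_path, out_path):
    """Read 'Name, Surname  Team' lines and write one line per team."""
    teams = defaultdict(list)
    name = None
    # The with-statement closes the file even if an error occurs.
    with open(in_path) as fobj_in:
        for line in fobj_in:
            items = line.split()
            if not items:
                continue  # skip blank lines
            if len(items) == 3:
                name = ' '.join(items[:2])
                team = items[2]
            else:
                team = items[0]
            if name is None:
                continue  # continuation line before any person was seen
            teams[team].append(name)
    with open(out_path, 'w') as fobj_out:
        for team_name in sorted(teams):
            fobj_out.write(', '.join([team_name] + teams[team_name]) + '\n')
```

It would be called as `write_team_lists('in_file.txt', 'out_file.txt')`.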

Edit

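Note: the example data does not seem to contain a case that would put more than one name on a single output line.

With this input data, we need to change the code: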

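from collections import defaultdict

# fobj_in and fobj_out are the file objects opened as shown earlier.
teams = defaultdict(list)
for line in fobj_in:
    if not line.strip():
        continue  # skip blank lines
    # Fields are tab-separated; drop fields that are empty or whitespace only.
    items = [entry.strip() for entry in line.split('\t') if entry.strip()]
    if len(items) == 2:
        # Full line: the name and the first team.
        name = items[0]
        team = items[1]
    else:
        # Continuation line: another team for the most recent name.
        team = items[0]
    teams[team].append(name)
for team_name in sorted(teams.keys()):
    fobj_out.write(team_name + ', ')
    for name in teams[team_name][:-1]:
        fobj_out.write('{}, '.format(name))
    name = teams[team_name][-1]
    fobj_out.write('{}\n'.format(name))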

The resulting file content looks like this:

"Décore ta vie" (2003), Boilard, Naggy
"Mouki" (2010), Boileau, Sonia
A chacun sa place (2011), Boinem, Victor Emmanuel
Absence (2009) (V), Boillat, Patricia
C.A.L.L.E. (2005), Boillat, Patricia
Comment devenir un trou de cul et enfin plaire aux femmes (2004), Boire, Roger
Couleur de peau: Miel (2012), Boileau, Laurent
Hergé:Les aventures de Tintin (2004), Boillot, Olivier
Isola, là dove si parla la lingua di Bacco (2011)  (co-director), Boillat, Patricia
L'île (2011), Boillot, Olivier
La beauté fatale et féroce... (1996), Boire, Roger
Last Call Indian (2010), Boileau, Sonia
Le Temple Oublié (2005), Boillot, Olivier
Le pied tendre (1988), Boire, Roger
Legit (2006), Boinski, James W.
Nubes (2010), Boira, Francisco
Questions nationales (2009), Boire, Roger
Reconciling Rwanda (2007), Boiko, Patricia
Soviet Gymnasts (1955), Boikov, Vladimir
The Corporal's Diary (2008) (V)  (head director), Boiko, Patricia
Un gars ben chanceux (1977), Boire, Roger
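
The edited code splits on tabs instead of on arbitrary whitespace because both fields contain spaces. The small sketch below shows the difference. The two input lines are an assumption, reconstructed from the output above rather than taken from the actual input file, and `split_fields` is a hypothetical helper name.

```python
# Hypothetical input lines, reconstructed from the output above (an assumption).
full_line = 'Boire, Roger\tLe pied tendre (1988)\n'
continuation = '\t\t\tQuestions nationales (2009)\n'


def split_fields(line):
    """Split on tabs and drop fields that are empty or whitespace only."""
    return [entry.strip() for entry in line.split('\t') if entry.strip()]


print(split_fields(full_line))     # two fields: the name and the title
print(split_fields(continuation))  # one field: the title only
print(full_line.split())           # whitespace split breaks both fields apart
```

Under that assumption, a full line yields two fields and a continuation line yields one, which is exactly what the `len(items) == 2` test relies on.
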
answered 2013-06-03T12:06:56.667