
I have a data file full of strings like this:

1682|Scream of Stone (Schrei aus Stein) (1991)|08-Mar-1996

I have already parsed the string, split it on "|", and dumped the pieces into a list, so I have:

['1682', 'Scream of Stone (Schrei aus Stein) (1991)', '08-Mar-1996']

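What I need to do is split the element at index 1 of the list further, at the parentheses surrounding the year. I could do this easily if the movie title itself contained no parentheses, but that is not the case here.

How can I write something that skips a parenthesis when the next character is not a digit? I want to end up with: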

['1682', 'Scream of Stone (Schrei aus Stein)', '1991', '08-Mar-1996']

Any help would be great! Thanks


3 Answers


This looks like a job for regular expressions!

import re

data = ['1682', 'Scream of Stone (Schrei aus Stein) (1991)', '08-Mar-1996']

def handleYear(matchobj):
    data.insert(2, matchobj.group(1))
    return ''

data[1] = re.sub(r'\s*\((\d+)\)$', handleYear, data[1])

This removes a trailing parenthesized run of digits, such as (1991), from the end of data[1] and inserts the digits into the next position in data.
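A variant that avoids mutating a global list from inside the callback might look like the sketch below. The helper name split_year and the constant YEAR_AT_END are hypothetical, and the sketch assumes the year is always the last thing in the title field.

```python
import re

# Optional whitespace, then "(digits)" at the very end of the string.
YEAR_AT_END = re.compile(r'\s*\((\d+)\)$')

def split_year(fields):
    """Return a new list with the trailing "(digits)" of fields[1] moved to index 2."""
    match = YEAR_AT_END.search(fields[1])
    if match is None:
        # No trailing year found: return an unchanged copy.
        return list(fields)
    title = fields[1][:match.start()]  # everything before the matched suffix
    return [fields[0], title, match.group(1)] + list(fields[2:])

data = ['1682', 'Scream of Stone (Schrei aus Stein) (1991)', '08-Mar-1996']
print(split_year(data))
```

Because the pattern is anchored with $, the earlier "(Schrei aus Stein)" group is never touched, and a title without a year is passed through as-is.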

answered 2013-04-22T01:39:22.560

You can use regex split:

import re
title="1682|Scream of Stone (Schrei aus Stein) (1991)|08-Mar-1996"
print(re.split(r'\((\d+)\)', title.split("|")[1]))

re.split splits on a regular expression, i.e., it uses matches of the regex as delimiters. If the pattern contains a capturing group, the text matched by that group is also kept in the result rather than discarded.

The split expression \((\d+)\) first matches literal parentheses, \( ... \), and within them matches only digits, \d+. The digits are captured so that they are kept, hence \((\d+)\).
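As a rough sketch of how the re.split result could be folded back into the full list: splitting on a delimiter at the end of the string leaves a trailing empty string, so the example filters empty pieces out. The function name parse_line is hypothetical, and the added \s* and $ in the pattern are assumptions that the year sits at the end of the title, preceded by optional whitespace.

```python
import re

def parse_line(line):
    """Split a "id|title (year)|date" line into [id, title, year, date]."""
    fields = line.split('|')
    # The capture group keeps the digits in the result; \s* swallows the
    # space before the parenthesis, and $ restricts the split to a trailing year.
    parts = re.split(r'\s*\((\d+)\)$', fields[1])
    # Splitting at the end of the string leaves an empty final piece; drop it.
    parts = [p for p in parts if p]
    return fields[:1] + parts + fields[2:]

line = "1682|Scream of Stone (Schrei aus Stein) (1991)|08-Mar-1996"
print(parse_line(line))
```

If a title has no trailing year, re.split returns the title unchanged as a single piece, so the line comes back with three fields instead of four.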

answered 2013-04-22T01:39:44.647

You can use Python's re module.

>>> import re
>>> s = 'Scream of Stone (Schrei aus Stein) (1991)'
>>> re.findall(r'\([0-9]+\)', s)
['(1991)']
>>> re.findall(r'\((\d+)\)', s)
['1991']
>>> 

Once you have the year parsed out, you can insert it at whichever index you want in the list.
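The insertion step could be sketched roughly as follows. The function name insert_year is hypothetical, and the code assumes the year is the last parenthesized number in the title and that nothing meaningful follows it.

```python
import re

def insert_year(data):
    """Move the last "(digits)" in data[1] into a new element at index 2, in place."""
    years = re.findall(r'\((\d+)\)', data[1])
    if years:
        year = years[-1]  # assumption: the last parenthesized number is the year
        # Cut the title at the last "(year)" and trim the space left behind.
        data[1] = data[1].rsplit('(' + year + ')', 1)[0].rstrip()
        data.insert(2, year)
    return data

data = ['1682', 'Scream of Stone (Schrei aus Stein) (1991)', '08-Mar-1996']
print(insert_year(data))
```

Unlike the anchored patterns, findall will also pick up a number in parentheses in the middle of a title, which is why only the last match is used here.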

answered 2013-04-22T01:39:04.020