
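I have a bunch of citation strings that I want to split into individual citations. Below is an example I found on the OWL citation site. My data mixes citation styles such as MLA and APA. Is there a Python library or other application that can split these strings into the elements of a list? Because the citation styles vary so much, I have tried to avoid regular expressions. I also tried splitting on "/n", but some of my strings have no "/n" separator, so you can see the problem. I am wondering whether there is a better way to capture them. I am not looking to extract names, dates, or titles (I have already found a library that can do that); I only need to separate the strings. Any help would be much appreciated. Thanks!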

Input string (example)

Dean, Cornelia. "Executive on a Mission: Saving the Planet." The New York Times, 22 May 2007, www.nytimes.com/2007/05/22/science/earth/22ander.html?_r=0. Accessed 12 May 2016.

Ebert, Roger. Review of An Inconvenient Truth, directed by Davis Guggenheim. rogerebert.com, 1 June 2006, www.rogerebert.com/reviews/an-inconvenient-truth-2006. Accessed 15 June 2016.

Output (sample)

['Dean, Cornelia. "Executive on a Mission: Saving the Planet." The New York Times, 22 May 2007, www.nytimes.com/2007/05/22/science/earth/22ander.html?_r=0. Accessed 12 May 2016.',
'Ebert, Roger. Review of An Inconvenient Truth, directed by Davis Guggenheim. rogerebert.com, 1 June 2006, www.rogerebert.com/reviews/an-inconvenient-truth-2006. Accessed 15 June 2016.']

2 Answers


Try split(), then remove the empty elements with filter():

string = '''Dean, Cornelia. "Executive on a Mission: Saving the Planet." The New York Times, 22 May 2007, www.nytimes.com/2007/05/22/science/earth/22ander.html?_r=0. Accessed 12 May 2016.

Ebert, Roger. Review of An Inconvenient Truth, directed by Davis Guggenheim. rogerebert.com, 1 June 2006, www.rogerebert.com/reviews/an-inconvenient-truth-2006. Accessed 15 June 2016.'''

result = list(filter(None, string.split('\n')))

Output:

['Dean, Cornelia. "Executive on a Mission: Saving the Planet." The New York Times, 22 May 2007, www.nytimes.com/2007/05/22/science/earth/22ander.html?_r=0. Accessed 12 May 2016.', 'Ebert, Roger. Review of An Inconvenient Truth, directed by Davis Guggenheim. rogerebert.com, 1 June 2006, www.rogerebert.com/reviews/an-inconvenient-truth-2006. Accessed 15 June 2016.']
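
One caveat: filter(None, ...) drops only lines that are exactly empty, so a "blank" line containing spaces or tabs would survive as its own element. A minimal sketch of one way to handle that, assuming it is acceptable to trim surrounding whitespace from each citation (the names split_nonblank and sample are made up for illustration):

```python
def split_nonblank(text):
    """Split text on newlines and drop empty or whitespace-only lines."""
    # Strip each piece first so that lines holding only spaces/tabs become ''
    stripped = (line.strip() for line in text.split('\n'))
    # filter(None, ...) then removes every falsy (empty) string
    return list(filter(None, stripped))


sample = "First citation.\n   \n\nSecond citation.  \n"
print(split_nonblank(sample))
```

This still relies on a newline between citations, so it does not help with strings that have no separator at all.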
answered 2019-02-20T12:02:40.457

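If you want to split the string s on the newline character \n, you can use the string method splitlines() with a list comprehension to filter out the empty elements: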

[i for i in s.splitlines() if i]
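
A small illustration of why splitlines() can be the safer choice: it also recognises Windows-style \r\n line endings, whereas split('\n') leaves a trailing \r on each piece. This is only a sketch; the function name split_lines and the sample text are invented for the example.

```python
def split_lines(s):
    """Return the non-empty lines of s, using str.splitlines()."""
    return [line for line in s.splitlines() if line]


windows_text = "First citation.\r\n\r\nSecond citation.\r\n"

# split('\n') keeps the carriage returns and a trailing empty string
print(windows_text.split('\n'))

# splitlines() treats '\r\n' as a single line boundary
print(split_lines(windows_text))
```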
answered 2019-02-20T12:39:11.720