3

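I am trying to take the list of tracks (songs) on an album and, for a given track name, get every track that matches it. I have given an example below. Any ideas on how to do this in Python? It seems difflib.get_close_matches only works for single words, not for a sentence.

Example: (find anything that contains the string "Around the world")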

tracks = ['Around The World (La La La La La) (Radio Version)', 'Around The World (La La La La La) (Alternative Radio Version)', 'Around The World (La La La La La) (Acoustic Mix)', 'Around The World (La La La La La) (Rucegsegger#Wittwer Club Mix)', 'World In Motion','My Heart Beats Like A Drum (Dam Dam Dam)','Thinking Of You','Why Oh Why','Mistake No. 2','With You','Love Is Blind','Lonesome Suite','Let Me Come & Let Me Go']

Output:

 Around The World (La La La La La) (Radio Version)
 Around The World (La La La La La) (Alternative Radio Version)
 Around The World (La La La La La) (Acoustic Mix)
 Around The World (La La La La La) (Rüegsegger#Wittwer Club Mix)
4

4 Answers

8

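difflib.get_close_matches works on whole strings, not just single words. In this case you need to lower the cutoff (default 0.6) and raise n, the maximum number of matches returned (default 3):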

In [19]: import difflib

In [20]: tracks = ['Around The World (La La La La La) (Radio Version)', 'Around The World (La La La La La) (Alternative Radio Version)', 'Around The World (La La La La La) (Acoustic Mix)', 'Around The World (La La La La La) (Rucegsegger#Wittwer Club Mix)', 'World In Motion','My Heart Beats Like A Drum (Dam Dam Dam)','Thinking Of You','Why Oh Why','Mistake No. 2','With You','Love Is Blind','Lonesome Suite','Let Me Come & Let Me Go']

In [21]: difflib.get_close_matches('Around the world', tracks, n = 4,cutoff = 0.3)
Out[21]: 
['Around The World (La La La La La) (Acoustic Mix)',
 'Around The World (La La La La La) (Radio Version)',
 'Around The World (La La La La La) (Alternative Radio Version)',
 'Around The World (La La La La La) (Rucegsegger#Wittwer Club Mix)']
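
To see why the cutoff has to come down, it can help to look at the similarity scores directly. The following is a minimal sketch, assuming the same tracks list as above; the names query, scores, default_matches and loose_matches are only illustrative. Because the titles are much longer than the query, even the best candidates score well below the default cutoff of 0.6.

```python
import difflib

tracks = [
    'Around The World (La La La La La) (Radio Version)',
    'Around The World (La La La La La) (Alternative Radio Version)',
    'Around The World (La La La La La) (Acoustic Mix)',
    'Around The World (La La La La La) (Rucegsegger#Wittwer Club Mix)',
    'World In Motion', 'My Heart Beats Like A Drum (Dam Dam Dam)',
    'Thinking Of You', 'Why Oh Why', 'Mistake No. 2', 'With You',
    'Love Is Blind', 'Lonesome Suite', 'Let Me Come & Let Me Go',
]
query = 'Around the world'

# ratio() is the similarity score (0.0 to 1.0) that gets compared
# against the cutoff; compute it for every candidate title.
scores = {t: difflib.SequenceMatcher(None, t, query).ratio() for t in tracks}

# Print the candidates from most to least similar.
for title, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f'{score:.2f}  {title}')

# With the default cutoff nothing is similar enough to be returned.
default_matches = difflib.get_close_matches(query, tracks)

# Lowering the cutoff and asking for up to 4 results finds the tracks.
loose_matches = difflib.get_close_matches(query, tracks, n=4, cutoff=0.3)
print(loose_matches)
```

A cutoff this low also lets unrelated short titles come close to qualifying, so n and cutoff probably need tuning per data set.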
answered 2013-01-27T11:37:44.647
2
filter(lambda x: 'Around The World' in x, tracks)

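This gives you the list of songs whose names contain 'Around The World'. If you are on Python 3, wrap it in a list (list(filter(...))), because filter returns a filter object there.

If the query has typos, this approach will not help you.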
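
Note that the in test is also case-sensitive, so the query 'Around the world' from the question would match nothing. A minimal sketch of a case-insensitive variant follows; the helper name find_tracks is hypothetical and the track list is shortened from the question.

```python
# Shortened version of the track list from the question.
tracks = [
    'Around The World (La La La La La) (Radio Version)',
    'Around The World (La La La La La) (Acoustic Mix)',
    'World In Motion',
    'Thinking Of You',
    'With You',
]


def find_tracks(query, titles):
    """Return the titles that contain query, ignoring case."""
    needle = query.casefold()
    return [title for title in titles if needle in title.casefold()]


matches = find_tracks('Around the world', tracks)
print(matches)
```

The list comprehension returns a list on both Python 2 and Python 3, so no extra list(...) call is needed.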

answered 2013-01-27T11:33:39.720
1

For this you can make use of SequenceMatcher.get_matching_blocks:

>>> from pprint import PrettyPrinter
>>> from difflib import SequenceMatcher
>>> pp = PrettyPrinter(indent = 4)
>>> pp.pprint(tracks)
[   'World In Motion',
    'With You',
    'Why Oh Why',
    'Thinking Of You',
    'My Heart Beats Like A Drum (Dam Dam Dam)',
    'Mistake No. 2',
    'Love Is Blind',
    'Lonesome Suite',
    'Let Me Come & Let Me Go',
    'Around The World (La La La La La) (Rucegsegger#Wittwer Club Mix)',
    'Around The World (La La La La La) (Radio Version)',
    'Around The World (La La La La La) (Alternative Radio Version)',
    'Around The World (La La La La La) (Acoustic Mix)']
>>> seq = ((e, SequenceMatcher(None, 'Around the world', e).get_matching_blocks()[0]) for e in tracks)
>>> seq = [k for k, _ in sorted(seq, key = lambda e:e[-1].size, reverse = True)]
>>> pp.pprint(seq)
[   'Around The World (La La La La La) (Rucegsegger#Wittwer Club Mix)',
    'Around The World (La La La La La) (Radio Version)',
    'Around The World (La La La La La) (Alternative Radio Version)',
    'Around The World (La La La La La) (Acoustic Mix)',
    'World In Motion',
    'With You',
    'Thinking Of You',
    'Why Oh Why',
    'My Heart Beats Like A Drum (Dam Dam Dam)',
    'Mistake No. 2',
    'Love Is Blind',
    'Lonesome Suite',
    'Let Me Come & Let Me Go']
>>> 
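
One caveat: get_matching_blocks() returns its blocks in sequence order, so element [0] is the first matching block rather than necessarily the longest one. A possible variation, sketched here under that assumption, ranks by SequenceMatcher.find_longest_match instead and folds case first. The helper name longest_common_run is hypothetical and the track list is shortened.

```python
from difflib import SequenceMatcher

# Shortened version of the track list from the question.
tracks = [
    'World In Motion',
    'With You',
    'Around The World (La La La La La) (Radio Version)',
    'Around The World (La La La La La) (Acoustic Mix)',
    'Lonesome Suite',
]


def longest_common_run(query, title):
    """Length of the longest contiguous run shared by both strings, ignoring case."""
    a, b = query.casefold(), title.casefold()
    # Explicit bounds keep this working on Python versions where
    # find_longest_match has no default arguments.
    match = SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))
    return match.size


query = 'Around the world'

# sorted() is stable, so titles with equal scores keep their original order.
ranked = sorted(tracks, key=lambda t: longest_common_run(query, t), reverse=True)
for title in ranked:
    print(longest_common_run(query, title), title)
```

Like the original, this only ranks the tracks; deciding where the matches end still needs a threshold on the score.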
answered 2013-01-27T11:40:46.440
-2

You can do it like this.

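temp = "Around The World (La La La La La)"

# fh is assumed to be an already opened file with one track title per line
for line in fh:
    if temp in line:
        print(line.rstrip())  # print the matching line, not the search string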

If any line of the file you are reading contains temp, it gets printed.

Or you can use a regular expression to do the matching.
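
The regular-expression route could look roughly like the sketch below. It assumes the titles are already in a list rather than a file, and it treats the query as literal text via re.escape; the track list is shortened from the question.

```python
import re

# Shortened version of the track list from the question.
tracks = [
    'Around The World (La La La La La) (Radio Version)',
    'Around The World (La La La La La) (Acoustic Mix)',
    'World In Motion',
    'Mistake No. 2',
]

query = 'Around the world'

# re.escape treats the query as literal text, so characters such as
# '(' or '.' in a title fragment are not read as regex syntax.
pattern = re.compile(re.escape(query), re.IGNORECASE)

# search() finds the query anywhere in the title, not only at the start.
matches = [title for title in tracks if pattern.search(title)]
print(matches)
```

As with the plain substring test, this tolerates differences in case but not typos.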

answered 2013-01-27T11:36:52.607