regex - Python 2.7 regex does not match the desired pattern

Question

I am parsing every line of an .m3u file that contains my IPTV playlist data. I want to isolate and print the part of each line that has the following format:

tvg-logo="http://somelinkwithapicture.png"

..inside a string that looks like this:

#EXTINF:-1 catchup="default" catchup-source="http://someprovider.tv/play/dvr/${start}/2480.m3u8?token=%^%=&duration=3600" catchup-days=5 tvg-name="Sky Sports Action HD" tvg-id="SkySportsAction.uk" tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png" group-title="Sports",Sky Sports Action HD
http://someprovider.tv/play/2480.m3u8?token=465454=

My class looks like this:

import re

class iptv_cleanup():

    filepath = 'C:\\Users\\cg371\\Downloads\\vget.m3u'

    with open(filepath, "r") as text_file:
        a = text_file.read()
        b = re.search(r'tvg-logo="(.*?)"', a)
        c = b.group()
        print c

    text_file.close

iptv_cleanup()

All I get back is a string like this:

tvg-logo=""

I am a bit rusty with regular expressions, but I can't see anything obviously wrong.

Can anyone help?

Thanks

score 0 · Accepted Answer

Try (?:tvg-logo=")[\w\W]*?(?<=\.png)

import re
# Lazy quantifier and escaped dot, so the match stops at the first literal ".png"
reg = r'(?:tvg-logo=")[\w\W]*?(?<=\.png)'

string = '#EXTINF:-1 catchup="default" catchup-source="http://someprovider.tv/play/dvr/${start}/2480.m3u8?token=%^%=&duration=3600" catchup-days=5 tvg-name="Sky Sports Action HD" tvg-id="SkySportsAction.uk" tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png" group-title="Sports",Sky Sports Action HD http://someprovider.tv/play/2480.m3u8?token=465454='

print re.findall(reg,string, re.DOTALL)[0]

$python main.py
tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png
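
A minimal sketch of why the lazy quantifier matters once the text holds more than one entry. The two-entry playlist below is made up for illustration (shortened attributes, hypothetical URLs); it is not taken from the asker's file. A greedy [\w\W]* runs on to the last ".png" in the whole text, while the lazy form stops at the first one after each tvg-logo.

```python
from __future__ import print_function  # keeps the sketch runnable on Python 2.7 and 3
import re

# Hypothetical two-entry playlist, shortened for illustration.
sample = (
    '#EXTINF:-1 tvg-name="One" tvg-logo="http://someprovider.tv/logos/one.png" group-title="Sports",One\n'
    'http://someprovider.tv/play/1.m3u8\n'
    '#EXTINF:-1 tvg-name="Two" tvg-logo="http://someprovider.tv/logos/two.png" group-title="Sports",Two\n'
    'http://someprovider.tv/play/2.m3u8\n'
)

# Greedy: one match that starts at the first tvg-logo and ends at the last ".png".
greedy = re.findall(r'tvg-logo="[\w\W]*(?<=\.png)', sample)

# Lazy: one match per entry, each ending at the nearest ".png".
lazy = re.findall(r'tvg-logo="[\w\W]*?(?<=\.png)', sample)

print(len(greedy))
print(lazy)
```

Note that both patterns only find logos whose URL ends in .png; an entry with a .jpg logo would be skipped or swallowed into the next match.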

score 0 · Accepted Answer

This is what worked in the end:

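import re

class iptv_cleanup():

    filepath = 'C:\\Users\\cg371\\Downloads\\vget.m3u'

    # The with block closes the file on exit, so no explicit close() is needed.
    with open(filepath, "r") as text_file:
        a = text_file.read()
        # findall returns the captured URL of every tvg-logo attribute,
        # whereas search stops at the first one.
        b = re.findall(r'tvg-logo="(.*?)"', a)

        for i in b:
            print i

iptv_cleanup()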

Thanks everyone for the input...
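
A possible explanation for the original tvg-logo="" result, offered as an assumption since the full file is not shown: if the first entry in the playlist has an empty tvg-logo attribute, re.search stops there and group() returns the whole match, quotes included. The sketch below uses a made-up two-entry sample (the "No Logo" entry and its URLs are hypothetical) to show the difference between search and findall, and how empty logos can be filtered out.

```python
from __future__ import print_function  # keeps the sketch runnable on Python 2.7 and 3
import re

# Hypothetical sample: the first entry has an empty logo attribute.
sample = (
    '#EXTINF:-1 tvg-name="No Logo" tvg-logo="" group-title="News",No Logo\n'
    'http://someprovider.tv/play/1.m3u8\n'
    '#EXTINF:-1 tvg-name="Sky Sports Action HD" '
    'tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png" '
    'group-title="Sports",Sky Sports Action HD\n'
    'http://someprovider.tv/play/2480.m3u8\n'
)

pattern = re.compile(r'tvg-logo="(.*?)"')

# search returns only the first match in the text.
first = pattern.search(sample)
whole_match = first.group()    # the full matched text, attribute name included
captured_url = first.group(1)  # only what the parentheses captured

# findall returns the captured group of every match; drop the empty ones.
logos = [url for url in pattern.findall(sample) if url]

print(repr(whole_match))
print(repr(captured_url))
print(logos)
```

Using group(1) instead of group() is what yields the bare URL without the tvg-logo= prefix when working with search or finditer.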
