>>> text = '<a data-lecture-id="47"\n data-modal-iframe="https://class.coursera.org/neuralnets-2012-001/lecture/view?lecture_id=47"\n href="https://class.coursera.org/neuralnets-2012-001/lecture/47"\n data-modal=".course-modal-frame"\n rel="lecture-link"\n class="lecture-link">\nAnother diversion: The softmax output function [7 min]</a>'
>>> import re
>>> re.findall(r'data-lecture-id="(\d+)"|(.*)</a>',a)
>>> [('47', ''), ('', 'Another diversion: The softmax output function [7 min]')]
我如何像这样提取数据:
>>> ['47', 'Another diversion: The softmax output function [7 min]']
我认为应该有一些更聪明的正则表达式。