I have the following input string:
input= """href="http://www.sciencedirect.com/science/article/pii/S0167923609002097" onmousedown="return scife_clk(this.href,'','res','2')">Using <b>text mining </b>and sentiment analysis for online forums hotspot detection and forecast</a></h3><div class="gs_a">N Li, <a href="/citations?
href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3309177/" onmousedown="return scife_clk(this.href,'ggp','res','1')">How to link ontologies and protein–protein interactions to literature: <b>text</b>-<b>mining </b>approaches and the BioCreative experience</a></h3><div class="gs_a"><a href="/citations?
href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3309177/" onmousedown="return scife_clk(this.href,'gga','gga','1')"><span class="gs_ggsL"><span class=gs_ctg2>[HTML]</span> from nih.gov</span><span class="gs_ggsS">nih.gov <span """
I want to extract the following output from it:
>> Using <b>text mining </b>and sentiment analysis for online forums hotspot detection and forecast
>> How to link ontologies and protein–protein interactions to literature: <b>text</b>-<b>mining </b>approaches and the BioCreative experience
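I am trying to do this with the re package in Python, but I am not sure what regular expression to use, because the text before each title comes in several variants, for example: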
(this.href,'','res','2')"> or (this.href,'ggp','res','2')"> or (this.href,'gga','gga','2')">
I am using this regular expression:
=re.search(r"(this.href,'ggp.?','res','.?/D')"
But this does not work for me. Can anyone tell me what to use?
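
For reference, here is a minimal sketch of one pattern that seems to fit the sample above. It assumes the wanted titles are exactly the links whose third `scife_clk` argument is `'res'`, and that each title ends at the first `</a>`. The names `html` and `TITLE_RE` are made up for illustration, and the sample string is a shortened copy of the input. In the attempt above, the unescaped parentheses act as a regex group rather than literal characters, `/D` is not an escape sequence (and `\D` would match a non-digit anyway), and `'ggp.?'` cannot match the empty `''` variant.

```python
import re

# Shortened copy of the input from the question, stored under a different
# name so the built-in `input` is not shadowed.
html = """href="http://www.sciencedirect.com/science/article/pii/S0167923609002097" onmousedown="return scife_clk(this.href,'','res','2')">Using <b>text mining </b>and sentiment analysis for online forums hotspot detection and forecast</a></h3><div class="gs_a">N Li,
href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3309177/" onmousedown="return scife_clk(this.href,'ggp','res','1')">How to link ontologies and protein–protein interactions to literature: <b>text</b>-<b>mining </b>approaches and the BioCreative experience</a></h3><div class="gs_a">
href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3309177/" onmousedown="return scife_clk(this.href,'gga','gga','1')"><span class="gs_ggsL"><span class=gs_ctg2>[HTML]</span> from nih.gov</span>
"""

# Literal parentheses and the dot are escaped; the second argument may be
# any (possibly empty) quoted value; the third must be 'res'; the fourth is
# one or more digits. The lazy group captures everything up to the first </a>.
TITLE_RE = re.compile(
    r"""scife_clk\(this\.href,'[^']*','res','\d+'\)">(.*?)</a>"""
)

titles = TITLE_RE.findall(html)
for title in titles:
    print(">>", title)
```

The third line of the sample is skipped because its third argument is `'gga'`, not `'res'`. If the real pages vary more than this sample does, an HTML parser would probably be more robust than a regular expression.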