I have this regular expression that finds the text between HTML anchor tags in my string of text. The text is tweet-like, so a sample string would be:
http://google.com is great, but http://www.stackoverflow.com may be my only hope. www.yahoo.com is out of the question.
It goes through my working regex function:
function processTweetLinks(text) {
console.log(text);
var replacePattern1, replacePattern2;
//URLs starting with http://, https://, or ftp://
replacePattern1 = /(\b(https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gim;
text = text.replace(replacePattern1, '<a class="individualMessageBioWhateverLink" href="$1" target="_blank">$1</a>');
//URLs starting with "www." (without // before it, or it'd re-link the ones done above).
replacePattern2 = /(^|[^\/])(www\.[\S]+(\b|$))/gim;
text = text.replace(replacePattern2, '$1<a class="individualMessageBioWhateverLink" href="http://$2" target="_blank">$2</a>');
console.log(text);
return text;
}
The result is:
<a class="individualMessageBioWhateverLink" href="http://google.com" target="_blank">http://google.com</a> is great, but <a class="individualMessageBioWhateverLink" href="http://www.stackoverflow.com" target="_blank">http://www.stackoverflow.com</a> may be my only hope. <a class="individualMessageBioWhateverLink" href="http://www.yahoo.com" target="_blank">www.yahoo.com</a> is out of the question.
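I want another line or two that grabs what is between the `<a></a>` tags, finds the beginnings of the links (e.g., `http://`, `https://`, `http://www.`, `https://www.`), and removes them, leaving the shortest possible text that the user can still recognize as a link (`google.com`). I have a regular expression that seems to sift through the text properly and finds these things between the `<a></a>` tags rather than in the href. Here is the expression: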
(?:<a [^>]+>.*?)(([a-zA-Z]{3,5}\:\/\/){0,1}(www\.))(?:.*?<\/a>)
`http://` and `https://` are in `$2`, and `www.` is in `$3`. How would I use this in the context of my function?
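For illustration, here is one way this might be approached. It is a rough sketch, not a confirmed solution. Instead of reusing the expression above, it assumes a second pass over the already-linked text with a replacement callback, so that only the visible link text is touched and the href stays as it is. The helper name `shortenLinkText` and both patterns inside it are hypothetical, and the sketch assumes the link text contains no nested tags.

```javascript
// Hypothetical helper: strips a leading scheme (http://, https://, ftp://)
// and/or "www." from the visible text of each anchor, leaving href untouched.
function shortenLinkText(html) {
  // Group 1: opening <a ...> tag
  // Group 2: link text (assumed to contain no nested tags)
  // Group 3: closing </a> tag
  var anchorPattern = /(<a\s[^>]*>)([^<]*)(<\/a>)/gi;

  // Optional scheme followed by optional "www." at the start of the link text.
  var prefixPattern = /^(?:[a-z]{3,5}:\/\/)?(?:www\.)?/i;

  return html.replace(anchorPattern, function (match, openTag, linkText, closeTag) {
    // Only the text between the tags is shortened.
    return openTag + linkText.replace(prefixPattern, '') + closeTag;
  });
}

var linked = '<a href="http://www.yahoo.com" target="_blank">www.yahoo.com</a>';
console.log(shortenLinkText(linked));
// <a href="http://www.yahoo.com" target="_blank">yahoo.com</a>
```

Under these assumptions, the last line of `processTweetLinks` could become `return shortenLinkText(text);`. Because the prefix pattern makes both the scheme and `www.` optional, it also covers links such as `http://google.com` that have no `www.` part.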