javascript - How to get a RegEx to grab the entire URL, starting at http and taking everything up to the first space, and how to exclude certain characters

Question

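OK, I have an ASP file that pulls an RSS feed from Twitter onto my server, and I use AJAX to break up each entry and write the HTML. I want to be able to extract the link from the description part of each entry, but I can't get the RegEx right.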

$(entry).find('item').each(function() {
    // gets the "id", "title", and "url" of current child element
    $elm = $(this);
    $title = $elm.find('title').text();
    $desc = $elm.find('description').text();
    $pubDate = $elm.find('pubDate').text();
    $guid = $elm.find('guid').text();
    $link = $elm.find('link').text();
    $div.append('<div class="section" id="entry'+$count+'"><h3 class="pubDate">'+$pubDate.slice(0, -6)+'</h3><h3 class="desc">'+$desc+'</h3><div class="linkBox"><a href="'+$link+'" title="'+$title+'" class="link">'+$link+'</a></div></div>');

    $href = $desc.match(/\b(http|https)?(:\/\/)?(\S*)\.(\w{2,4})\b/ig);

    alert($href);
    $count++
});

The code above is what I have so far.

Here is a sample tweet (raw string):

I'm at Harrah's Hotel and Casino: Luxury Suite (New Orleans, LA) w/ 2 others http://t.co/UjxTIdiJ

I want to extract the link using:

$desc.match(/\b(http|https)?(:\/\/)?(\S*)\.(\w{2,4})\b/ig);

But it only returns:

http://t.co

I'm trying to capture every character from http up to the first whitespace character, while excluding commas and the like...

score 1 · Accepted Answer

This regular expression should do the trick: \s*(?i)href\s*=\s*(\"([^"]*\")|'[^']*'|([^'">\s]+))

Example: http://regex101.com/r/eL3wV4
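One caveat when using the href pattern from JavaScript: its RegExp engine does not accept a standalone (?i) inline flag, so case-insensitivity has to come from the /i flag instead. The following is a minimal sketch of that adaptation, not the answerer's exact pattern; the function name extractHrefs and the sample markup are assumptions made up for illustration.

```javascript
// Adapted from the href pattern above. Assumption: the (?i) prefix is
// replaced by the /i flag, and each quoting style gets its own capture group.
var hrefPattern = /href\s*=\s*(?:"([^"]*)"|'([^']*)'|([^'">\s]+))/gi;

// Hypothetical helper: returns every href value found in an HTML string.
function extractHrefs(html) {
    var results = [];
    var match;
    hrefPattern.lastIndex = 0; // reset, because /g regexes keep state between calls
    while ((match = hrefPattern.exec(html)) !== null) {
        // Exactly one of the three groups is set, depending on the quoting used.
        if (match[1] !== undefined) {
            results.push(match[1]);
        } else if (match[2] !== undefined) {
            results.push(match[2]);
        } else {
            results.push(match[3]);
        }
    }
    return results;
}

// Made-up sample markup, not taken from the question.
var sampleHtml = '<a href="http://t.co/UjxTIdiJ">x</a> <a HREF=\'http://example.com\'>y</a>';
console.log(extractHrefs(sampleHtml));
```

This only helps when the description actually contains anchor tags; for a bare URL in plain text, a scheme-based pattern is the better fit.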

Alternatively, if you don't have an inline a href, (http:[^\s]*)|(https[^\s]*) should get you just http://* or https://*.

Example: http://regex101.com/r/uE5bZ5
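Applied to the sample tweet from the question, the scheme-based approach could look roughly like the sketch below. It folds the two alternatives into https? and uses \S+ (equivalent to [^\s]+), then trims trailing punctuation to cover the "exclude commas etc." part of the question. The helper name extractUrls and the exact set of trimmed characters are assumptions, not part of the answer above.

```javascript
// Sample tweet from the question.
var tweet = "I'm at Harrah's Hotel and Casino: Luxury Suite (New Orleans, LA) w/ 2 others http://t.co/UjxTIdiJ";

// Hypothetical helper: grabs everything from http:// or https:// up to the
// first whitespace, then strips punctuation that merely follows the URL.
function extractUrls(text) {
    var matches = text.match(/https?:\/\/\S+/gi) || []; // match() returns null when nothing is found
    return matches.map(function (url) {
        // Assumption: these trailing characters are never meant to be part of the link.
        return url.replace(/[.,;:!?)\]'"]+$/, "");
    });
}

console.log(extractUrls(tweet)); // [ 'http://t.co/UjxTIdiJ' ]
```

Trimming after the match, rather than excluding commas inside the character class, keeps URLs that legitimately contain commas or question marks in the middle intact.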

score 0 · Accepted Answer

OK, so this is the resolved answer to this question, but @Damian Overeem (https://stackoverflow.com/users/1472389/damien-overeem) deserves all the credit for showing me regex101. Here is how I selected what I wanted:

$href = $desc.match(/\b(http|https)?(:\/\/)?(\S*)\.(\w{2,4}(\S*))\b/ig);

See it here: http://regex101.com/r/gT6hC2
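To see why the extra (\S*) makes the difference, here is a small side-by-side sketch that runs the original and the revised pattern against the sample tweet. In the original, the match ends right after the dot and the 2 to 4 word characters ("co"), because \b is satisfied just before the "/". The revised pattern lets the match continue over the remaining non-space characters. The variable names are assumptions chosen for illustration.

```javascript
// Sample tweet from the question.
var sampleTweet = "I'm at Harrah's Hotel and Casino: Luxury Suite (New Orleans, LA) w/ 2 others http://t.co/UjxTIdiJ";

// Pattern from the question: stops after ".co".
var originalPattern = /\b(http|https)?(:\/\/)?(\S*)\.(\w{2,4})\b/ig;

// Pattern from this answer: the added (\S*) consumes the path after ".co".
var revisedPattern = /\b(http|https)?(:\/\/)?(\S*)\.(\w{2,4}(\S*))\b/ig;

var originalMatches = sampleTweet.match(originalPattern);
var revisedMatches = sampleTweet.match(revisedPattern);

console.log(originalMatches); // [ 'http://t.co' ]
console.log(revisedMatches);  // [ 'http://t.co/UjxTIdiJ' ]
```

Note that the scheme is optional in both patterns, so any run of non-space characters containing a dot followed by 2 to 4 word characters can match, not only links that start with http.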
