regex - In Sublime Text, how do I find all unencoded web addresses and replace them with formatted, linked URLs?

Question

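How can I find all unencoded web addresses and replace them with formatted, linked URLs?

The dummy text in the example below can stand for paragraphs of varying length.

Example: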

`Before:

   Dummy text. website.dk/info
   Dummy text (website.com) Dummy text.
   Dummy text. website.dk
   Dummy text. www.website.com

After:

   Dummy text. <em><a href="http://website.dk/info" target="blank">website.dk/info</a></em>
   Dummy text (<em><a href="http://website.com" target="blank">website.com</a></em>) dummy text.
   Dummy text. <em><a href="http://website.dk" target="blank">website.dk</a></em>
   Dummy text. <em><a href="http://website.com" target="blank">www.website.com</a></em>`

score 1 · Accepted Answer

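Assuming "Before" is just a list of URLs, one per line:

Go to Find > Replace...
Click .* to enable regular expressions
Enter (.+) in "Find What"
Enter <em><a href="http://\1" target="blank">\1</a></em> in "Replace With"
Click "Replace All"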
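
For readers who want to try the same substitution outside the editor, here is a minimal Python sketch of those steps. It assumes the input is nothing but URLs, one per line; the `urls` sample string and the `linked` variable name are illustrative and not part of the original answer.

```python
import re

# Assumed sample input: one bare URL per line, nothing else.
urls = "website.dk/info\nwebsite.com\nwww.website.com"

# "." does not match a newline by default, so (.+) captures one whole line
# at a time, just like the "Find What" pattern above.
linked = re.sub(
    r'(.+)',
    r'<em><a href="http://\1" target="blank">\1</a></em>',
    urls,
)

print(linked)
```

Because the pattern captures the entire line, any surrounding prose would be pulled into the link as well, which is why this only suits a pure URL list.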

If "Before" contains more than just URLs, the "Find What" pattern gets trickier.

Following up on the comments, here is a (hacky) Python approach.

file.html

<html>
<body>
  <p>
    Dummy text. website.dk/info
    Dummy text (website.com) Dummy text.
    Dummy text. website.dk
    Dummy text. www.website.com
  </p>
  <p>
    Dummy text. <em><a href="http://website.dk/info" target="blank">website.dk/info</a></em>
    Dummy text (<em><a href="http://website.com" target="blank">website.com</a></em>) dummy text.
    Dummy text. <em><a href="http://website.dk" target="blank">website.dk</a></em>
    Dummy text. <em><a href="http://website.com" target="blank">www.website.com</a></em>
  </p>
</body>
</html>

link_links.py

import re

def link_links(m):
  # Link all links.
  return re.sub(
    # Experiment with this pattern; e.g., search for "URL regex".
    r'(?<=\W)((?:www\.)?\w+\.\w+(?:\/\S+)*)',
    # Raw string so that both \1 backreferences reach re.sub intact.
    r'<em><a href="http://\1" target="blank">\1</a></em>',
    m.group(0)
  )

with open("file.html", "r") as html:
  match_non_html_re = re.compile(r'''
    (?<=>) # After a closing HTML tag
    [^<]+ # Match all non-HTML
    (?=<) # Ensure it is followed by an opening HTML tag (since we cannot use atomic grouping)
    (?!<\/a>) # Ensure we were not within a link tag already
  ''', re.VERBOSE)
  print(re.sub(match_non_html_re, link_links, html.read()))
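
To experiment with the two-stage idea without a file on disk, the sketch below applies the same two patterns to an in-memory string. It is an illustration under assumptions rather than the answer's exact script: the names `TEXT_RE`, `URL_RE`, `LINK`, and `link_text`, and the sample `html` string, are made up for this example.

```python
import re

# Stage 1: runs of text between tags, skipping text that sits
# directly before a closing </a> (that is, text already inside a link).
TEXT_RE = re.compile(r'(?<=>)[^<]+(?=<)(?!</a>)')

# Stage 2: the URL-ish pattern from the script above.
URL_RE = re.compile(r'(?<=\W)((?:www\.)?\w+\.\w+(?:/\S+)*)')

LINK = r'<em><a href="http://\1" target="blank">\1</a></em>'

def link_text(html):
    """Wrap bare URLs found in text nodes; leave existing links alone."""
    return TEXT_RE.sub(lambda m: URL_RE.sub(LINK, m.group(0)), html)

# Assumed sample: one bare URL and one URL that is already linked.
html = '<p> See website.dk and <a href="http://website.com">website.com</a> </p>'
print(link_text(html))
```

Only the bare `website.dk` gets wrapped here; the address inside the existing anchor is skipped because its text run is followed by `</a>`. Regular expressions are a fragile way to process HTML, so treat this as a quick one-off rather than a robust tool.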

score 1 · Accepted Answer

If the text contains more than just links, you can use a regular expression like this:

((?:www\.)?\w+\.?\w+\/?\w+)

With this replacement string:

<em><a href="http://$1" target="blank">$1</a></em>

Working demo
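
One thing worth checking before relying on this pattern: because the dot is optional, it also seems to match ordinary words of three or more characters. The Python sketch below shows that behaviour on one of the sample lines and compares it with a stricter variant. `STRICT` is a hypothetical alternative, not part of the answer, and it still has limits (it would also match things like version numbers).

```python
import re

# The pattern from the answer above, unchanged.
ORIGINAL = re.compile(r'((?:www\.)?\w+\.?\w+\/?\w+)')

# Hypothetical stricter variant: the dot is mandatory, the path is optional.
STRICT = re.compile(r'((?:www\.)?\w+\.\w+(?:/\w+)*)')

# Python spells the backreference \1 where the answer's replacement uses $1.
REPLACEMENT = r'<em><a href="http://\1" target="blank">\1</a></em>'

sample = "Dummy text (website.com) Dummy text."

# The original pattern picks up plain words as well as the URL.
print(ORIGINAL.findall(sample))
# ['Dummy', 'text', 'website.com', 'Dummy', 'text']

# The stricter variant links only the dotted name.
print(STRICT.sub(REPLACEMENT, sample))
```

If you keep the original pattern in Sublime Text, review the matches before pressing "Replace All".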
