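After running some CLI utilities, I end up with a bunch of informational output messages, and there is a web URL at the end of the file. I need to use a Python regular expression to find that link and print it. Here are the 3 lines of code I wrote for this: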

file = str('/root/PycharmProjects/rest_project/sponge_link')

with open(file, 'r') as fo:
    fo.read().__str__()
    urls = re.findall('https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+', fo)
    print(urls)

Here are the contents of the file:

INFO: Streaming results to http://abc/56659bf3-a66d-482b-80e8-6484cafc650d
INFO: Analyzed target <path/path/path> (73 packages loaded, 10521 targets configured).
INFO: Found 1 target...
Target <path>/dence up-to-date:
 utility-<path>/dence_0.0-5_amd64.deb
 utility-<path>/dence_0.4-5_amd64.changes
INFO: Elapsed time: 23.669s, Critical Path: 0.47s, Remote (0.00% of the time): [queue: 0.00%, setup: 0.00%, process: 0.00%]
INFO: Build Event Protocol files produced successfully.
INFO: Build completed successfully, 1 total action
INFO: Still uploading to http://abc/56659bf3-a66d-482b-80e8-6484cafc650d

However, when I run the program, I get the following error:

Traceback (most recent call last):
  File "/root/PycharmProjects/rest_project/sel.py", line 24, in <module>
    urls = re.findall('https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+', fo)
  File "/usr/lib/python3.6/re.py", line 222, in findall
    return _compile(pattern, flags).findall(string)
TypeError: expected string or bytes-like object

It complains that the data type should be a string. So I wrapped the file path in str(), but even that did not work.

1 Answer

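You are passing a file object to re.findall, not a string. Calling str() on the file path does not change that, and the value returned by fo.read().__str__() is discarded because it is never assigned. You need to assign the result of reading the file to a variable and pass that variable to re.findall.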

  1. fo.read().__str__() should be lines = fo.read()
  2. urls = re.findall('https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+', fo) should be urls = re.findall('https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+', lines)
answered 2019-04-15T18:20:17.153
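Putting the two changes together, a minimal sketch might look like the following. The helper name find_urls, the temporary demo file, and the FULL_URL_PATTERN alternative are assumptions added for illustration and are not part of the original code. One thing to be aware of: the pattern from the question has no "/" in its character class, so each match appears to stop right after the host; the second pattern is one possible way to capture the rest of the link.

```python
import os
import re
import tempfile

# Pattern copied from the question. "/" is not in the character class,
# so each match ends at the first slash after the host name.
QUESTION_PATTERN = re.compile(r"https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+")

# Hypothetical alternative (not from the question): take everything up to
# the next whitespace character, which keeps the path part of the link.
FULL_URL_PATTERN = re.compile(r"https?://\S+")


def find_urls(path, pattern=QUESTION_PATTERN):
    """Read the whole file as text and return every match of pattern."""
    with open(path, "r") as fo:
        text = fo.read()  # keep the string that read() returns
    return pattern.findall(text)  # pass the string, not the file object


# Demo on a temporary file holding two lines from the question's log.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write(
        "INFO: Streaming results to http://abc/56659bf3-a66d-482b-80e8-6484cafc650d\n"
        "INFO: Still uploading to http://abc/56659bf3-a66d-482b-80e8-6484cafc650d\n"
    )
    demo_path = tmp.name

host_only = find_urls(demo_path)
full_links = find_urls(demo_path, FULL_URL_PATTERN)
os.remove(demo_path)  # clean up the demo file

print(host_only)       # ['http://abc', 'http://abc']
print(full_links[-1])  # http://abc/56659bf3-a66d-482b-80e8-6484cafc650d
```

Since the link of interest is the last one in the file, indexing the result with [-1] is enough to pick it out once the list is non-empty.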