I have the following sample text:
Good Morning,
The link to your exam is https://uni.edu?hash=89234rw89yfw8fw89ef .Please complete it within the stipulated time.
If you have any issue, please contact us
https://www.uni.edu
https://facebook.com/uniedu
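What I want is to extract the URL of the exam link: https://uni.edu?hash=89234rw89yfw8fw89ef . I plan to use re.findall(), but I am having trouble writing a regular expression that extracts only that specific URL.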
import re

def find_exam_url(text_file):
    filename = open(text_file, "r")
    new_file = filename.readlines()
    word_lst = []
    for line in new_file:
        exam_url = re.findall('https?://', line)  # use regex to extract exam url
    return exam_url

if __name__ == "__main__":
    print(find_exam_url("mytextfile.txt"))
The output I get is:
['http://']
instead of:
https://uni.edu?hash=89234rw89yfw8fw89ef
Any help with this would be much appreciated.
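For reference, here is a minimal sketch of one pattern that might do this. The pattern 'https?://' matches only the scheme, so that is all findall() can return; the rest of the URL has to be part of the pattern too. The sketch assumes the exam link is the only URL carrying a "?hash=" query parameter and that the hash consists of word characters only. The names EXAM_URL_PATTERN, find_exam_urls and sample are made up for illustration, and the text is passed in as a string rather than read from a file.

```python
import re

# Assumption: only the exam link contains "?hash=", and the hash value
# is made of letters, digits or underscores.
# [^\s?]+ matches the host/path up to the first "?" without crossing whitespace.
EXAM_URL_PATTERN = re.compile(r"https?://[^\s?]+\?hash=\w+")

def find_exam_urls(text):
    """Return every URL in text that carries a ?hash= query parameter."""
    return EXAM_URL_PATTERN.findall(text)

sample = (
    "Good Morning,\n"
    "The link to your exam is https://uni.edu?hash=89234rw89yfw8fw89ef "
    ".Please complete it within the stipulated time.\n"
    "If you have any issue, please contact us\n"
    "https://www.uni.edu\n"
    "https://facebook.com/uniedu\n"
)

print(find_exam_urls(sample))
# → ['https://uni.edu?hash=89234rw89yfw8fw89ef']
```

Searching the whole text at once, rather than line by line, also avoids overwriting the result on every loop iteration, so a match on an early line is not lost when a later line has none.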