arrays - bash curl array from a text file

Question

I have a text file named 2.txt that contains links like these:

www.link.php/user=1pass=3
www.link.php/user=1pass=3
www.link.php/user=1pass=3
www.link.php/user=1pass=3
www.link.php/user=1pass=3

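I want to build a curl command that visits each link line by line and prints the part of the source I need. This is the source returned when I visit one of the links: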

 online - Checked user : test cpu cooling rate: 0.50<html>
<head>
</head>
<body>
    <form action="tasks.php" method="get">
        <input type="text" name="account" placeholder="username:password" style="text-align: center" /> <br />
        <input class="btn btn-success" type="submit" value="Check Account" />
      </form>
</body>

I want it to fetch the source and strip all of the HTML, keeping only the text that comes before the <html> tag.

So I would end up with a text file like this:

online - Checked user : test cpu cooling rate: 0.50
online - Checked user : test cpu cooling rate: 0.520
online - Checked user : test cpu cooling rate: 0.1150
online - Checked user : test cpu cooling rate: 6.50

Can anyone help me?

score 2 · Accepted Answer

This script will do what you want:

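#!/bin/sh

output_file='3.txt'

# Read 2.txt one URL per line; -r keeps backslashes literal.
while IFS= read -r line ; do
  # Join the page into one line, strip every tag, and write the result
  # followed by a newline so each URL gets its own line in the output file.
  printf '%s\n' "$(curl -s "$line" | tr -d '\n' | sed -e :a -e 's/<[^>]*>//g;/</N;//ba')" >> "$output_file"
done < '2.txt'

exit 0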

Thanks to Blackbit for the regular expression.
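
To see what the tr and sed stages do without touching the network, the same idea can be tried on a saved copy of the page. This is only a minimal sketch: `strip_tags` is a hypothetical helper name, the sample text is modelled on the source shown in the question, and the sed expression is a simplified single-pass version of the one in the script.

```shell
#!/bin/sh
# strip_tags is a hypothetical helper: it reads HTML on stdin, joins all
# lines into one, and deletes everything that looks like a tag.
strip_tags() {
  tr -d '\n' | sed -e 's/<[^>]*>//g'
}

# Sample input modelled on the page source shown in the question
# (an assumption; the real page may differ).
sample=' online - Checked user : test cpu cooling rate: 0.50<html>
<head>
</head>
<body>
</body>'

# Command substitution drops any trailing newline, so printf adds exactly one.
result=$(printf '%s\n' "$sample" | strip_tags)
printf '%s\n' "$result"
```

Note that this approach keeps any text that sits between tags, including indentation, so a page with visible text inside the form would leave that text in the output as well.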

score 0 · Accepted Answer

Is the text before the <html> tag always on the same line as the tag? If so, you can do the following:

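#!/bin/bash

# For each URL, keep only the line containing <html> and cut the tag and
# everything after it. -s hides curl's progress meter.
cat url_list | while IFS= read -r url; do
  curl -s "$url" | grep "<html>" | sed 's/<html>.*//'
done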

Replace cat url_list with your preferred solution from your other question.
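
The grep and sed steps can also be folded into a single sed call, which prints a line only when the substitution succeeds. A minimal sketch, assuming the status text and the <html> tag share one line as in the question; `text_before_html` is a hypothetical helper name and the sample is made up from the question's example output.

```shell
#!/bin/sh
# text_before_html is a hypothetical helper: it prints only the part of a
# line that precedes the first <html> tag and skips lines without the tag.
text_before_html() {
  sed -n 's/<html>.*//p'
}

# Sample input modelled on the page source in the question (an assumption).
sample=' online - Checked user : test cpu cooling rate: 6.50<html>
<head>
</head>'

status=$(printf '%s\n' "$sample" | text_before_html)
printf '%s\n' "$status"
```

In the loop above this would be used as curl -s "$url" | text_before_html.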
