bash - Pass external variable to xidel in bash loop script

Question

I am trying to parse an HTML page using XPath with xidel. The page has a table with multiple rows and columns. I need to get the values from columns 2 and 5 (IP and port) of each row and store them in a CSV-like file. Here is my script:

#!/bin/bash
for (( i = 2; i <= 100; i++ ))
do
xidel http://www.vpngate.net/en/ -e '//*[@id="vg_hosts_table_id"]/tbody/tr["'$i'"]/td[2]/span[1]' >> "$i".txt #get value from first column
xidel http://www.vpngate.net/en/ -e '//*[@id="vg_hosts_table_id"]/tbody/tr["'$i'"]/td[5]' >> "$i".txt #get value from second column
sed -i ':a;N;$!ba;s/\n/^/g' "$i".txt #replace newline with custom delimiter
sed -i '/\s/d' "$i".txt #remove blanks
cat "$i".txt >> ip_port_list #create list
zip -m ips.zip "$i".txt #archive unneeded texts
done

Performance is not an issue. When I manually increment each tr index, the output looks perfect, but not with the variable from the loop. I want to get a pair of values from each row. Right now I get only partial data or even an empty file.

score 2 · Accepted Answer

I need to get values from each row from columns 2 and 5 (IP and port) and store them in a csv-like file.

xidel -s "https://www.vpngate.net/en/" -e '
  (//table[@id="vg_hosts_table_id"])[3]//tr[not(td[@class="vg_table_header"])]/concat(
    td[2]/span[@style="font-size: 10pt;"],
    ",",
    extract(
      td[5],
      "TCP: (\d+)",
      1
    )
  )
'
220.218.70.177,443
211.58.36.54,995
1.239.223.190,1351
[...]
153.207.18.229,1542

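(//table[@id="vg_hosts_table_id"])[3]: selects the third table with that id, which is the one you want.
//tr[not(td[@class="vg_table_header"])]: selects all rows except the header rows.
td[2]/span[@style="font-size: 10pt;"]: selects column 2 and, within it, the <span> that contains only the IP address.
extract(td[5],"TCP: (\d+)",1): selects column 5 and extracts (by regular expression) the number that follows "TCP: ".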
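As for why the loop in the question misbehaves, a likely cause is the quoting around $i rather than xidel itself. The sketch below only prints the XPath strings that bash builds (it does not call xidel): with the question's quoting the predicate becomes the string literal "3", and in XPath a non-empty string used as a predicate is true for every node, so tr["3"] matches all rows instead of the third one. The variable names bad and good are made up for this illustration.

```shell
#!/bin/bash
i=3

# Quoting as in the question: the double quotes sit inside the single-quoted
# pieces, so they are passed through literally as part of the XPath.
bad='//*[@id="vg_hosts_table_id"]/tbody/tr["'$i'"]/td[5]'

# Same expression without the inner double quotes in the predicate, so the
# predicate is a number and is treated as a row position.
good='//*[@id="vg_hosts_table_id"]/tbody/tr['"$i"']/td[5]'

printf '%s\n' "$bad" "$good"
# prints:
# //*[@id="vg_hosts_table_id"]/tbody/tr["3"]/td[5]
# //*[@id="vg_hosts_table_id"]/tbody/tr[3]/td[5]
```

Even with the quoting fixed, the loop would still download the page twice per row, which is why a single xidel call over all rows is the better fit here.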

score 0 · Accepted Answer

Maybe this xidel line will come in handy:

xidel -q http://www.vpngate.net/en/ -e '//*[@id="vg_hosts_table_id"]/tbody/tr[*]/concat(td[2]/span[1],",",substring-after(substring-before(td[5],"UDP:"),"TCP: "))'

This performs only a single fetch (so the vpngate admins won't block you), and it also produces CSV output (ip,port). Hopefully that is what you want?
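If the IP and port are needed as separate values later on, the ip,port lines can be split again in the shell. This is a minimal sketch that assumes the output format shown in this thread; the function name split_pairs is hypothetical, and skipping lines with an empty port is an assumption about rows that list no TCP port.

```shell
#!/bin/bash
# Read "ip,port" lines on stdin and print "ip port" pairs.
split_pairs() {
  while IFS=, read -r ip port; do
    # Skip lines where either field is missing (assumed: rows without a TCP port).
    [ -n "$ip" ] && [ -n "$port" ] || continue
    printf '%s %s\n' "$ip" "$port"
  done
}

# Sample lines taken from the output shown in the first answer.
printf '%s\n' '220.218.70.177,443' '211.58.36.54,995' | split_pairs
```

In practice the input would be the xidel command's output, for example by piping it straight into split_pairs.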
