wget --output-document=- http://www.tip.it/runescape/grand-exchange-centre 2>/dev/null \
| grep "The Grand Exchange updated"

outputs the following:

<h4 id="gec_update_time">The Grand Exchange updated <span><b>1</b> days, <b>12</b> hours, <b>45</b> minutes and <b>1</b> seconds ago</span></h4>

My goal is to trim it so that it outputs only:

1 days, 12 hours, 45 minutes, 1 seconds

I haven't had much luck with this. Any suggestions?

2 Answers

You can write a short Ruby script. First install the sanitize gem:

gem install sanitize

Create a file named "cleaner.rb":

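#!/usr/bin/env ruby
# Strips HTML tags from the first line read on standard input.
require 'rubygems'
require 'sanitize'

# String#strip removes surrounding whitespace (Ruby strings have no #trim).
puts Sanitize.clean(gets).strip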

Then make it executable (chmod +x cleaner.rb) and pipe the line through it:

wget --output-document=- http://www.tip.it/runescape/grand-exchange-centre 2>/dev/null | grep "The Grand Exchange updated" | ./cleaner.rb

This gives you: "The Grand Exchange updated 1 days, 13 hours, 0 minutes and 56 seconds ago"

answered 2013-04-09T07:40:13.003

If using lynx is an option, you get the tag stripping for free:

$ lynx -dump http://www.tip.it/runescape/grand-exchange-centre | grep "The Grand Exchange updated"
The Grand Exchange updated 1 days, 19 hours, 8 minutes and 48 seconds ago

If needed, you can strip the leading text from there:

$ foo="$(lynx -dump http://www.tip.it/runescape/grand-exchange-centre | grep "The Grand Exchange updated")"
$ echo "${foo#*updated }"
1 days, 19 hours, 9 minutes and 8 seconds ago
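
To get all the way to the format asked for in the question, the same parameter-expansion approach can also drop the trailing " ago" and turn " and " into ", ". This is only a minimal sketch: it uses a hard-coded sample line in place of the live lynx output, it assumes the line always contains exactly one " and ", and the variable names (line, elapsed, before, after) are purely illustrative.

```shell
#!/bin/sh
# Sample line standing in for the live lynx output (assumption: same wording).
line="The Grand Exchange updated 1 days, 19 hours, 9 minutes and 8 seconds ago"

elapsed="${line#*updated }"    # drop everything up to and including "updated "
elapsed="${elapsed% ago}"      # drop the trailing " ago"

# Split around " and " and rejoin with ", " (assumes " and " is present).
before="${elapsed%% and *}"    # text before " and "
after="${elapsed##* and }"     # text after " and "
elapsed="$before, $after"

echo "$elapsed"
```

Only POSIX parameter expansions are used here, so this should behave the same in sh and bash.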

If you absolutely want to use wget and strip the tags yourself, you can use something like this:

$ wget --output-document=- http://www.tip.it/runescape/grand-exchange-centre 2>/dev/null | grep "The Grand Exchange updated" | sed -e 's/<[^>]\+>//g' -e 's/The Grand Exchange updated //'
1 days, 19 hours, 17 minutes and 2 seconds ago
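
The sed command above still leaves "and" and "ago" in place. As a rough sketch of how the sed route could reach the exact format from the question, the filter below adds two more substitutions and avoids the GNU-only `\+`. The function name elapsed_only is hypothetical, and the input is the sample line copied from the question rather than live wget output.

```shell
#!/bin/sh
# Hypothetical helper: reads the matching HTML line on stdin and prints
# only the elapsed time, with " and " replaced by ", " and " ago" removed.
elapsed_only() {
    sed -e 's/<[^>]*>//g' \
        -e 's/^.*The Grand Exchange updated //' \
        -e 's/ ago[[:space:]]*$//' \
        -e 's/ and /, /'
}

# Sample input copied from the question, standing in for the wget | grep output.
sample='<h4 id="gec_update_time">The Grand Exchange updated <span><b>1</b> days, <b>12</b> hours, <b>45</b> minutes and <b>1</b> seconds ago</span></h4>'

result=$(printf '%s\n' "$sample" | elapsed_only)
echo "$result"
```

In the real pipeline, elapsed_only would take the place of the sed call after grep. Like any regex-based tag stripping, it relies on the page keeping this exact wording and markup.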

The first option is probably the better choice.

answered 2013-04-09T13:49:24.693