I have this code, and I need to add a regular expression for an integer to the "href=" part of the selector:
f = File.open("us.html")
doc = Nokogiri::HTML(f)
ans = doc.css('a[href=]')
puts doc
I tried:
ans = doc.css('a[href=\d]
or:
ans = doc.css('a[href="\d"])
But it doesn't work. Can anyone suggest a solution?
If you want to use a regular expression, I believe you will have to do that manually. Neither CSS selectors nor XPath 1.0 has built-in regular-expression matching.
You can do it by selecting the candidate elements first and then comparing each element's href
attribute to your regular expression in Ruby. For example:
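require 'nokogiri'
html = %q{
<html>
<a href='1'></a>
<a href='adf'></a>
</html>
}
doc = Nokogiri::HTML(html)
# Select every <a> that has an href, then keep those whose href contains a digit.
# Use /\A\d+\z/ instead if the whole href must be an integer.
ans = doc.css('a[href]').select { |e| e['href'] =~ /\d/ }
puts ans
#=> <a href="1"></a>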
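The pattern /\d/ used above matches any href that contains a digit somewhere, which may be broader than "an integer". The following is a minimal sketch of the difference between an unanchored and an anchored pattern. It works on plain strings, so it needs no parsing library; the sample href values are hypothetical and not taken from the original page.

```ruby
# Hypothetical sample href values, for illustration only.
hrefs = ['1', '42', 'adf', 'page2', '3.5']

# /\d/ matches any string that contains at least one digit.
contains_digit = hrefs.select { |h| h =~ /\d/ }

# /\A\d+\z/ matches only strings made up entirely of digits.
all_digits = hrefs.select { |h| h.match?(/\A\d+\z/) }

p contains_digit #=> ["1", "42", "page2", "3.5"]
p all_digits     #=> ["1", "42"]
```

If only all-digit hrefs are wanted, the anchored pattern can be dropped into the select block in place of /\d/.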
You can do this in XPath:
require 'nokogiri'
html = %q{
<html>
<a href='1'></a>
<a href='adf'></a>
</html>
}
doc = Nokogiri::HTML(html)
puts doc.xpath('//a[@href[number(.) = .]]')
#=> <a href="1"></a>
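The XPath function number() converts its argument to a number. Non-numeric text becomes NaN, which is never equal to anything, so the comparison succeeds only when the attribute value is numeric. Note that this accepts any numeric value (for example 1.5), not just integers. You can even check ranges using the inequality operators.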