您将不得不使用以下查询之一单独处理不间断 空间:
-e "//p[@class='price']/span/bdi/substring-before(text(),' ')"
-e "//p[@class='price']/span/bdi/translate(text(),x:cps(160),'')"
-e "//p[@class='price']/span/bdi/replace(text(),' ','')"
不能用normalize-space()
,因为...
https://www.w3.org/TR/xpath-functions-31/#func-normalize-space:
[Extensible Markup Language (XML) 1.1 Recommendation]中空白的定义没有改变。为方便起见,在此重复:
S ::= (#x20 | #x9 | #xD | #xA)+
...它处理空格、制表符、回车和换行,但不处理不间断空格:
xidel -s "<x> test </x>" -e "x'[{x}]'"
[ test ]
xidel -s "<x> test </x>" -e "x'[{normalize-space(x)}]'"
[test]
xidel -s "<x> test </x>" -e "x'[{x}]'"
[ test ]
xidel -s "<x> test </x>" -e "x'[{normalize-space(x)}]'"
[ test ]
xidel -s "<x> test </x>" -e "x'[{translate(x,' ','')}]'"
xidel -s "<x> test </x>" -e "x'[{replace(x,x:cps(160),'')}]'"
xidel -s "<x> test </x>" -e "x'[{replace(x,' ','')}]'"
[test]
顺便说一句,在该网站上获取价格的替代方法:
xidel -s "https://kenzel.sk/produkt/bicykle/zivotny-styl/signora/" -e ^"^
parse-json(^
//body/script[@type='application/ld+json']^
)//priceSpecification/price^
"
304.00