I am trying to extract the contents of a string that is formatted like this:
<script type="text/javascript">
document.viewData = THE INFORMATION I WANT
</script> some other stuff
Any ideas on how to do this?
Thanks in advance!
Your text data:
text = <<-_TEXT_
<script type="text/javascript">
document.viewData = THE INFORMATION I WANT
</script> some other stuff
_TEXT_
Set up a regular expression:
re = /document\.viewData = (.*)/
Apply it to the text and get the result:
result = text.match(re)[1]
puts result
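Note that `text.match(re)` returns `nil` when the pattern is not found, so indexing it with `[1]` would raise `NoMethodError`. A minimal sketch of a nil-safe variant follows; the helper name `extract_view_data` is hypothetical, and the pattern is loosened slightly to allow optional whitespace around the `=`.

```ruby
# Hypothetical helper: returns the captured value, or nil when the pattern is absent.
def extract_view_data(text)
  match = text.match(/document\.viewData\s*=\s*(.*)/)
  # "." does not match a newline by default, so the capture stops at the end of the line.
  match && match[1].strip
end

sample = <<~TEXT
  <script type="text/javascript">
  document.viewData = THE INFORMATION I WANT
  </script> some other stuff
TEXT

found = extract_view_data(sample)
missing = extract_view_data("<p>no script here</p>")

puts found.inspect
puts missing.inspect
```

The caller can then check for `nil` instead of rescuing an exception.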
require 'nokogiri'
doc = Nokogiri::XML::Document.parse <<-_XML_
<script type="text/javascript">
document.viewData = THE INFORMATION I WANT
</script> some other stuff
_XML_
doc.at('//script').text.strip.split("=").last
# => " THE INFORMATION I WANT"
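One caveat with `split("=").last`: if the value itself contains an equals sign, only the part after the last `=` comes back. A small sketch using `String#partition`, which splits on the first occurrence only; the sample string is made up for illustration and is not from the question.

```ruby
# Made-up script body whose value contains "=" (an assumption for illustration).
script_text = 'document.viewData = {"query": "a=b"}'

# partition splits at the first "=" only and returns [before, separator, after].
_name, _sep, value = script_text.partition("=")
value = value.strip

# For comparison: splitting on every "=" keeps only the tail after the last one.
tail_only = script_text.split("=").last

puts value
puts tail_only
```

The same `partition` call can be applied to the stripped text of the `script` node.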
Depending on how strict you need to be, this could do the job (the result is in the match group):
<script type="text\/javascript">\s*document\.viewData =\s*([^<]+?)\s*<\/script>
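A minimal sketch of applying a pattern of this shape in Ruby. The pattern here uses `\s*` around the value, so a single newline before `</script>` is enough, and it is written with `%r{...}` so the slashes need no escaping. The variable names are assumptions for illustration.

```ruby
# Tag-anchored pattern; the lazy group stops before the whitespace preceding </script>.
pattern = %r{<script type="text/javascript">\s*document\.viewData =\s*([^<]+?)\s*</script>}

html = "<script type=\"text/javascript\">\n" \
       "document.viewData = THE INFORMATION I WANT\n" \
       "</script> some other stuff\n"

# String#[] with a regexp and a group index returns that capture, or nil if nothing matches.
view_data = html[pattern, 1]

puts view_data.inspect
```

Because `[^<]+?` cannot cross a `<`, this only works when the value itself contains no `<` character.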