I'm trying to figure out why this small script won't run in Groovy.

def url = "http://danvega.org/blog/rss.cfm"
def feed = new XmlSlurper().parse(url)

When I try to run it, I get the following error.

[Fatal Error] index.cfm:39:23: The reference to entity "postID" must end with the ';' delimiter.
Exception thrown

org.xml.sax.SAXParseException: The reference to entity "postID" must end with the ';' delimiter.
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
    at groovy.util.XmlSlurper.parse(XmlSlurper.java:147)
    at groovy.util.XmlSlurper.parse(XmlSlurper.java:213)
    at groovy.util.XmlSlurper$parse$0.call(Unknown Source)
    at ConsoleScript20.run(ConsoleScript20:3)
    at org.springsource.loaded.ri.ReflectiveInterceptor.jlrMethodInvoke(ReflectiveInterceptor.java:1249)

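I don't see any reference to postID in the XML, and my RSS reader consumes this feed without trouble, so the feed itself seems to be fine even though parsing it here fails. Does anyone know what could cause this?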

1 Answer

For some reason (I guess due to a browser sniff?), http://danvega.org/blog/rss.cfm is redirecting (302 Moved Temporarily) to http://danvega.org/blog/mobile/index.cfm

So the XML reader is being handed that mobile HTML page instead of the feed, and it is choking on the JavaScript in it. I guess it's this line:

            var postID = '';
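One way to check the redirect described above is to request the feed without following redirects and look at the raw response. This is only a diagnostic sketch: it uses the standard `java.net.HttpURLConnection` API, sends Java's default User-Agent, and assumes the server still behaves as it did when this was written.

```groovy
// Diagnostic sketch: inspect the first response instead of following the redirect.
def conn = (HttpURLConnection) 'http://danvega.org/blog/rss.cfm'.toURL().openConnection()
conn.instanceFollowRedirects = false   // stop at the first response

println "Status:   ${conn.responseCode}"                // 3xx here would indicate a redirect
println "Location: ${conn.getHeaderField('Location')}"  // redirect target, or null if none
println "Type:     ${conn.contentType}"                 // an HTML type would mean this is not the feed

conn.disconnect()
```

If the status is a 3xx and `Location` points at the mobile page, the parser is receiving HTML rather than RSS.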

If you can't fix this server-side, you can always spoof a browser User-Agent in the client:

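// Fetch and parse the XML at the given URL while presenting a browser User-Agent
def spoofedXmlGrab( URL url ) {
    url.openConnection().with { conn ->
        // Pretend to be desktop Firefox so the server does not redirect to the mobile page
        conn.setRequestProperty( 'User-Agent', 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2' )
        // withReader closes the stream once parsing has finished
        conn.inputStream.withReader { ins ->
            new XmlSlurper().parse( ins )
        }
    }
}

def xml = spoofedXmlGrab( 'http://danvega.org/blog/rss.cfm'.toURL() )
// Print the title of every item in the feed
xml.channel.item.title.each {
    println it.text()
}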
answered 2013-09-02T15:38:21.190