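I am having trouble using the RSS feed splitter in Mule to split the RSS feeds from certain websites into individual SyndEntry objects. I have included the XML code from my Mule application, along with the logs from 3 separate runs of the application against 3 different feeds (the first works, the next two give errors). Don't worry about the output at the bottom of the first feed's log; I included it only to show that some feeds can indeed be split into SyndEntry objects, from which elements can then be extracted (by the EntryReceiver class defined in my Mule application). For most other feeds I try to split, however, I end up with the same errors. Please help if you can.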

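Note: I have also included a snippet of the EntryReceiver class at the end. Its method takes a SyndEntry object, but the feeds that give errors never reach that method in the first place.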

Mule application code...

 <http:polling-connector name="PollingHttpConnector" pollingFrequency="60000" doc:name="HTTP Polling" clientSoTimeout="10000" cookieSpec="netscape"  receiveBacklog="0" receiveBufferSize="0" sendBufferSize="0" serverSoTimeout="10000" socketSoLinger="0" validateConnections="true"/>
    <flow name="myFlow" doc:name="myFlow">
        <http:inbound-endpoint exchange-pattern="one-way"   doc:name="HTTP" address="http://www.nasa.gov/rss/dyn/breaking_news.rss" connector-ref="PollingHttpConnector"/>
        <rss:feed-splitter/>
        <rss:entry-last-updated-filter/>
        <component class="path.to.EntryReceiver" doc:name="Java"/>
        <logger message="#[payload]" level="INFO" doc:name="Logger"/>
        <http:outbound-endpoint exchange-pattern="request-response" host="${my.host}" port="${my.port}" path="api/v1/activity" method="POST" mimeType="application/json" doc:name="HTTP"/>
        <logger message="#[payload]" level="INFO" doc:name="Logger"/>
    </flow>

RossMason feed (working)... from http://rossmason.blogspot.com/feeds/posts/default

org.mule.DefaultMuleMessage
{
  id=279b9b1a-e979-11e2-ba08-0d15b2b474eb
  payload=org.mule.transport.http.ReleasingInputStream
  correlationId=<not set>
  correlationGroup=-1
  correlationSeq=-1
  encoding=UTF-8
  exceptionPayload=<not set>

Message properties:
  INVOCATION scoped properties:
  INBOUND scoped properties:
    Cache-Control=private, max-age=0
    Connection=false
    Content-Type=application/atom+xml; charset=UTF-8
    Date=Wed, 10 Jul 2013 15:55:33 GMT
    ETag=W/"CUUBQXs6cCl7ImA9WhJbFkQ."
    Expires=Wed, 10 Jul 2013 15:55:33 GMT
    Keep-Alive=false
    Last-Modified=Wed, 26 Sep 2012 21:00:50 GMT
    MULE_ORIGINATING_ENDPOINT=endpoint.http.rossmason.blogspot.com.feeds.posts.default
    Server=GSE
    Transfer-Encoding=chunked
    X-Content-Type-Options=nosniff
    X-XSS-Protection=1; mode=block
    http.headers={ETag=W/"CUUBQXs6cCl7ImA9WhJbFkQ.", X-XSS-Protection=1; mode=block, Expires=Wed, 10 Jul 2013 15:55:33 GMT, Last-Modified=Wed, 26 Sep 2012 21:00:50 GMT, Connection=false, Server=GSE, X-Content-Type-Options=nosniff, Cache-Control=private, max-age=0, Transfer-Encoding=chunked, Date=Wed, 10 Jul 2013 15:55:33 GMT, Keep-Alive=false, Content-Type=application/atom+xml; charset=UTF-8}
    http.method=GET
    http.query.params={}
    http.query.string=
    http.request=http://rossmason.blogspot.com/feeds/posts/default
    http.status=200
    http.version=HTTP/1.1
  OUTBOUND scoped properties:
    MULE_ENCODING=UTF-8
}

org.mule.api.processor.LoggerMessageProcessor: { "priority" : 0.1 , "actor" : { "objectType" : "person" , "id" : "Ross Mason" , "displayName" : "Strategy is Something You Can Only Learn"} , "verb" : "post" , "object" : { "url" : "tag:blogger.com,1999:blog-1425601518852438157.post-6101490854635104494" , "objectType" : "notification" , "author" : "Ross Mason" , "content" : { "text" : "There is a great post by Mike Cannon-Brookes of Altassian, which talks about how they came up with the stellar business strategy that has that has dri..." , "url" : "tag:blogger.com,1999:blog-1425601518852438157.post-6101490854635104494" , "urlLinkName" : "Complete News Article"}} , "published" : { "$date" : "2008-12-12T13:22:00.000Z"} , "target" : "public"}

NASA feed (not working)... from http://www.nasa.gov/rss/dyn/breaking_news.rss

org.mule.DefaultMuleMessage
{
  id=f33625a5-e97a-11e2-aeec-ff1013508cd8
  payload=org.mule.transport.http.ReleasingInputStream
  correlationId=<not set>
  correlationGroup=-1
  correlationSeq=-1
  encoding=utf-8
  exceptionPayload=<not set>

Message properties:
  INVOCATION scoped properties:
  INBOUND scoped properties:
    Cache-Control=no-cache, must-revalidate, post-check=0, pre-check=0
    Connection=true
    Content-Language=en
    Content-Length=6831
    Content-Type=application/rss+xml; charset=utf-8
    Date=Wed, 10 Jul 2013 16:08:07 GMT
    ETag="1373472427"
    Expires=Sun, 19 Nov 1978 05:00:00 GMT
    Keep-Alive=true
    Last-Modified=Wed, 10 Jul 2013 16:07:07 +0000
    MULE_ORIGINATING_ENDPOINT=endpoint.http.www.nasa.gov.rss.dyn.breaking.news.rss
    Server=nginx/1.4.1
    Vary=Accept-Encoding
    Via=1.0 690ebb4ae180f02f630cd90d73b6bc50.cloudfront.net (CloudFront)
    X-Amz-Cf-Id=nyC1SZiI_0AKB7EUZ1S6w53_TGV3BrsDlY0vpxPpQ0xLXv4KKGRn-g==
    X-Cache=RefreshHit from cloudfront
    X-Powered-By=PHP/5.3.10-1ubuntu3.6
    http.headers={Content-Language=en, ETag="1373472427", Content-Length=6831, Expires=Sun, 19 Nov 1978 05:00:00 GMT, Last-Modified=Wed, 10 Jul 2013 16:07:07 +0000, X-Amz-Cf-Id=nyC1SZiI_0AKB7EUZ1S6w53_TGV3BrsDlY0vpxPpQ0xLXv4KKGRn-g==, Connection=true, Server=nginx/1.4.1, X-Powered-By=PHP/5.3.10-1ubuntu3.6, X-Cache=RefreshHit from cloudfront, Cache-Control=no-cache, must-revalidate, post-check=0, pre-check=0, Date=Wed, 10 Jul 2013 16:08:07 GMT, Vary=Accept-Encoding, Keep-Alive=true, Via=1.0 690ebb4ae180f02f630cd90d73b6bc50.cloudfront.net (CloudFront), Content-Type=application/rss+xml; charset=utf-8}
    http.method=GET
    http.query.params={}
    http.query.string=
    http.request=http://www.nasa.gov/rss/dyn/breaking_news.rss
    http.status=200
    http.version=HTTP/1.0
  OUTBOUND scoped properties:
    MULE_ENCODING=utf-8

********************************************************************************
Message               : null (java.lang.NullPointerException). Message payload is of type: ReleasingInputStream
Code                  : MULE_ERROR--2
--------------------------------------------------------------------------------
Exception stack is:
1. null (java.lang.NullPointerException)
  org.mule.module.rss.routing.FeedSplitter$EntryComparator:121 (null)
2. null (java.lang.NullPointerException). Message payload is of type: ReleasingInputStream (org.mule.api.MessagingException)
  org.mule.execution.ExceptionToMessagingExceptionExecutionInterceptor:35 (http://www.mulesoft.org/docs/site/current3/apidocs/org/mule/api/MessagingException.html)
--------------------------------------------------------------------------------
Root Exception stack trace:
java.lang.NullPointerException
    at org.mule.module.rss.routing.FeedSplitter$EntryComparator.compare(FeedSplitter.java:121)
    at org.mule.module.rss.routing.FeedSplitter$EntryComparator.compare(FeedSplitter.java:117)
    at java.util.TreeMap.put(TreeMap.java:530)
    + 3 more (set debug level logging or '-Dmule.verbose.exceptions=true' for everything)
********************************************************************************

org.mule.api.processor.LoggerMessageProcessor: 
org.mule.DefaultMuleMessage
{
  id=16f3ed76-e97b-11e2-aeec-ff1013508cd8
  payload=org.mule.transport.http.ReleasingInputStream
  correlationId=<not set>
  correlationGroup=-1
  correlationSeq=-1
  encoding=utf-8
  exceptionPayload=<not set>

Message properties:
  INVOCATION scoped properties:
  INBOUND scoped properties:
    Cache-Control=no-cache, must-revalidate, post-check=0, pre-check=0
    Connection=true
    Content-Encoding=gzip
    Content-Language=en
    Content-Length=2118
    Content-Type=application/rss+xml; charset=utf-8
    Date=Wed, 10 Jul 2013 16:09:13 GMT
    ETag="1373472468"
    Expires=Sun, 19 Nov 1978 05:00:00 GMT
    Keep-Alive=true
    Last-Modified=Wed, 10 Jul 2013 16:07:48 +0000
    MULE_ORIGINATING_ENDPOINT=endpoint.http.www.nasa.gov.rss.dyn.breaking.news.rss
    Server=nginx/1.4.1
    Vary=Accept-Encoding
    Via=1.0 0fc90446797ad54b123cb72bd8f4f142.cloudfront.net (CloudFront)
    X-Amz-Cf-Id=83X6Bc8fZjAU98-991DN8rnruZDRNw9O_N_W0eXI7iHv7Z7N5aurZg==
    X-Cache=Miss from cloudfront
    X-Powered-By=PHP/5.3.10-1ubuntu3.6
    http.headers={Content-Language=en, ETag="1373472468", Content-Length=2118, Expires=Sun, 19 Nov 1978 05:00:00 GMT, Last-Modified=Wed, 10 Jul 2013 16:07:48 +0000, X-Amz-Cf-Id=83X6Bc8fZjAU98-991DN8rnruZDRNw9O_N_W0eXI7iHv7Z7N5aurZg==, Connection=true, X-Powered-By=PHP/5.3.10-1ubuntu3.6, Server=nginx/1.4.1, X-Cache=Miss from cloudfront, Cache-Control=no-cache, must-revalidate, post-check=0, pre-check=0, Date=Wed, 10 Jul 2013 16:09:13 GMT, Vary=Accept-Encoding, Content-Encoding=gzip, Keep-Alive=true, Via=1.0 0fc90446797ad54b123cb72bd8f4f142.cloudfront.net (CloudFront), Content-Type=application/rss+xml; charset=utf-8}
    http.method=GET
    http.query.params={}
    http.query.string=
    http.request=http://www.nasa.gov/rss/dyn/breaking_news.rss
    http.status=200
    http.version=HTTP/1.0
  OUTBOUND scoped properties:
    MULE_ENCODING=utf-8

org.mule.exception.DefaultMessagingExceptionStrategy: 
********************************************************************************
Message               : Invalid XML: Error on line 1: Content is not allowed in prolog. (com.sun.syndication.io.ParsingFeedException)
Code                  : MULE_ERROR--2
--------------------------------------------------------------------------------
Exception stack is:
1. Content is not allowed in prolog. (org.xml.sax.SAXParseException)
  org.apache.xerces.util.ErrorHandlerWrapper:-1 (null)
2. Error on line 1: Content is not allowed in prolog. (org.jdom.input.JDOMParseException)
  org.jdom.input.SAXBuilder:533 (null)
3. Invalid XML: Error on line 1: Content is not allowed in prolog. (com.sun.syndication.io.ParsingFeedException)
  com.sun.syndication.io.WireFeedInput:182 (null)
4. Invalid XML: Error on line 1: Content is not allowed in prolog. (com.sun.syndication.io.ParsingFeedException) (org.mule.api.transformer.TransformerException)
  org.mule.module.rss.transformers.ObjectToRssFeed:85 (http://www.mulesoft.org/docs/site/current3/apidocs/org/mule/api/transformer/TransformerException.html)
--------------------------------------------------------------------------------
Root Exception stack trace:
org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    + 3 more (set debug level logging or '-Dmule.verbose.exceptions=true' for everything)

ESPN feed (not working)... from http://search.espn.go.com/rss/bill-simmons/

org.mule.api.processor.LoggerMessageProcessor: 
org.mule.DefaultMuleMessage
{
  id=32be248e-e97a-11e2-9817-fb41b80baff3
  payload=org.mule.transport.http.ReleasingInputStream
  correlationId=<not set>
  correlationGroup=-1
  correlationSeq=-1
  encoding=iso-8859-1
  exceptionPayload=<not set>

Message properties:
  INVOCATION scoped properties:
  INBOUND scoped properties:
    Cache-Control=max-age=600
    Connection=true
    Content-Type=text/xml; charset=iso-8859-1
    Date=Wed, 10 Jul 2013 16:03:03 GMT
    Keep-Alive=true
    MULE_ORIGINATING_ENDPOINT=endpoint.http.search.espn.go.com.rss.bill.simmons
    P3P=CP="CAO DSP COR CURa ADMa DEVa TAIa PSAa PSDa IVAi IVDi CONi OUR SAMo OTRo BUS PHY ONL UNI PUR COM NAV INT DEM CNT STA PRE"
    Set-Cookie=[Lorg.apache.commons.httpclient.Cookie;@b07c615
    Transfer-Encoding=chunked
    http.headers={Transfer-Encoding=chunked, Date=Wed, 10 Jul 2013 16:03:03 GMT, P3P=CP="CAO DSP COR CURa ADMa DEVa TAIa PSAa PSDa IVAi IVDi CONi OUR SAMo OTRo BUS PHY ONL UNI PUR COM NAV INT DEM CNT STA PRE", Keep-Alive=true, Set-Cookie=[Lorg.apache.commons.httpclient.Cookie;@b07c615, Connection=true, Content-Type=text/xml; charset=iso-8859-1, Cache-Control=max-age=600}
    http.method=GET
    http.query.params={}
    http.query.string=
    http.request=http://search.espn.go.com/rss/bill-simmons/
    http.status=200
    http.version=HTTP/1.1
  OUTBOUND scoped properties:
    MULE_ENCODING=iso-8859-1

org.mule.exception.DefaultMessagingExceptionStrategy: 
********************************************************************************
Message               : null (java.lang.NullPointerException). Message payload is of type: ReleasingInputStream
Code                  : MULE_ERROR--2
--------------------------------------------------------------------------------
Exception stack is:
1. null (java.lang.NullPointerException)
  org.mule.module.rss.routing.FeedSplitter$EntryComparator:121 (null)
2. null (java.lang.NullPointerException). Message payload is of type: ReleasingInputStream (org.mule.api.MessagingException)
  org.mule.execution.ExceptionToMessagingExceptionExecutionInterceptor:35 (http://www.mulesoft.org/docs/site/current3/apidocs/org/mule/api/MessagingException.html)
--------------------------------------------------------------------------------
Root Exception stack trace:
java.lang.NullPointerException
    at org.mule.module.rss.routing.FeedSplitter$EntryComparator.compare(FeedSplitter.java:121)
    at org.mule.module.rss.routing.FeedSplitter$EntryComparator.compare(FeedSplitter.java:117)
    at java.util.TreeMap.put(TreeMap.java:530)
    + 3 more (set debug level logging or '-Dmule.verbose.exceptions=true' for everything)
********************************************************************************

EntryReceiver class snippet...

import com.mongodb.BasicDBObject;
import com.sun.syndication.feed.synd.SyndEntry;
import org.mule.api.annotations.param.Payload;

public class EntryReceiver {

    public String readEntry(@Payload SyndEntry entry) throws Exception
    {
        // Take at most the first 150 characters of the description, then strip HTML tags.
        // (summary is not used further in this snippet.)
        String description = entry.getDescription().getValue();
        String summary = description.substring(0, Math.min(150, description.length()));
        summary = summary.replaceAll("\\<[^>]*>", "");

        BasicDBObject activity = new BasicDBObject();
        activity.put("priority", Double.valueOf(0.1));

        // ACTOR
        BasicDBObject actor = new BasicDBObject();
        actor.put("objectType", "person");
        actor.put("id", entry.getAuthor());
        actor.put("displayName", entry.getTitle());

        activity.put("actor", actor);
        activity.put("verb", "post");

        // OBJECT
        BasicDBObject object = new BasicDBObject();
        object.put("url", entry.getUri());
        object.put("objectType", "notification");
        object.put("author", entry.getAuthor());

        activity.put("object", object);

        activity.put("published", entry.getPublishedDate());
        activity.put("target", "public");

        // Serialize the activity document as a JSON string.
        return activity.toString();
    }

}
1 Answer

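This happens because the latter two feeds do not follow the date format that the RSS standard specifies for its date elements (for example pubDate), namely RFC 822. As a result, Mule's FeedSplitter cannot parse these dates.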
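To check whether a feed's date text follows the RFC 822 form, something like the sketch below may help. It is not what Mule or ROME run internally; the class name `Rfc822DateCheck`, the method `parseRfc822`, and the choice of the four-digit-year pattern are assumptions made for illustration, and the sample strings are not taken from the failing feeds.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class Rfc822DateCheck {

    // Four-digit-year form of an RFC 822 date, e.g. "Wed, 10 Jul 2013 16:07:07 +0000".
    private static final String RFC_822_PATTERN = "EEE, dd MMM yyyy HH:mm:ss Z";

    /** Returns the parsed date, or null if the text does not match the pattern. */
    public static Date parseRfc822(String text) {
        if (text == null) {
            return null;
        }
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat(RFC_822_PATTERN, Locale.US);
        try {
            return format.parse(text.trim());
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // An RFC 822 style date parses.
        System.out.println(parseRfc822("Wed, 10 Jul 2013 16:07:07 +0000") != null); // true
        // An ISO 8601 style date does not match the pattern.
        System.out.println(parseRfc822("2013-07-10T16:07:07Z") != null); // false
    }
}
```

Running this against the pubDate values copied from a failing feed should show quickly whether the date format is the culprit.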

For the standard RSS schema, see http://cyber.law.harvard.edu/rss/rss.html

RFC 822 is here: http://www.ietf.org/rfc/rfc0822.txt
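The stack traces in the question fail inside a comparator called from TreeMap.put, which would be consistent with entries whose date came back null after parsing failed. The sketch below only illustrates that mechanism with plain java.util.Date values. It is not Mule's FeedSplitter code: the class `NullSafeDateOrdering` and both comparators are hypothetical, and it assumes Java 8 or later for `Comparator.nullsLast`.

```java
import java.util.Comparator;
import java.util.Date;
import java.util.TreeMap;

public class NullSafeDateOrdering {

    // Dereferences the dates directly, so a null date throws NullPointerException.
    public static final Comparator<Date> NAIVE =
            (a, b) -> a.before(b) ? -1 : (a.equals(b) ? 0 : 1);

    // Orders undated values after dated ones instead of throwing.
    public static final Comparator<Date> NULL_SAFE =
            Comparator.nullsLast(Comparator.<Date>naturalOrder());

    /** Returns true if every date can be inserted into a TreeMap sorted by the given order. */
    public static boolean insertsWithoutError(Comparator<Date> order, Date... dates) {
        TreeMap<Date, String> sorted = new TreeMap<>(order);
        try {
            for (Date date : dates) {
                sorted.put(date, "entry");
            }
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Date parsed = new Date(0L);
        Date unparsed = null; // stands in for a date that could not be parsed

        System.out.println(insertsWithoutError(NAIVE, parsed, unparsed));     // false
        System.out.println(insertsWithoutError(NULL_SAFE, parsed, unparsed)); // true
    }
}
```

A null-tolerant ordering like this only helps in code you control; whether the splitter itself can be configured or replaced is a separate question.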

answered 2013-09-02T08:02:37.570