
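I'm new to iOS development. I have already implemented an NSXMLParser delegate, but I really don't know how to handle tags that share the same name but have different content, such as <description>. In some feeds this tag contains only the summary, while in others it also contains an "img src" that I want to extract (with or without CDATA).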

Examples of description tags from which I need to grab the images and then pass them to my UIImageView:

<description><![CDATA[ <p>Roger Craig Smith and Troy Baker to play Batman and the Joker respectively in upcoming action game; Deathstroke confirmed as playable character. </p><p><img src="http://image.com.com/gamespot/images/2013/139/ArkhamOrigins_29971_thumb.jpg"

<description>&lt;img src=&quot;http://cdn.gsmarena.com/vv/newsimg/13/05/samsung-galaxy-s4-active-photos/thumb.jpg&quot; width=&quot;70&quot; height=&quot;92&quot; hspace=&quot;3&quot; alt=&quot;&quot; border=&quot;0&quot; align=left style="background:#333333;padding:0px;margin:0px 4px 0px 0px;border-style:solid;border-color:#aaaaaa;border-width:1px" /&gt; &lt;p&gt;

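I think @Rob's example solves my case, but I don't know how to incorporate it into my NSXMLParser (shown below) so that the data and the images are separated. With this parser I can only get the data (the summary).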

My NSXMLParser:

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict
{
element = [elementName copy];


if ([elementName isEqualToString:@"item"])
{
    elements = [[NSMutableDictionary alloc] init];
    title = [[NSMutableString alloc] init];
    date = [[NSMutableString alloc] init];
    summary = [[NSMutableString alloc] init];
    link = [[NSMutableString alloc] init];
    img = [[NSMutableString alloc] init];
    imageLink = [[NSMutableString alloc]init];

}

if([elementName isEqualToString:@"media:thumbnail"]) {
    NSLog(@"thumbnails media:thumbnail: %@", attributeDict);
    imageLink = [attributeDict objectForKey:@"url"];
}

if([elementName isEqualToString:@"media:content"]) {
    NSLog(@"thumbnails media:content: %@", attributeDict);
    imageLink = [attributeDict objectForKey:@"url"];

}

if([elementName isEqualToString:@"enclosure"]) {
    NSLog(@"thumbnails Enclosure %@", attributeDict);
    imageLink = [attributeDict objectForKey:@"url"];
}
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
if ([element isEqualToString:@"title"])
{
    [title appendString:string];
}
else if ([element isEqualToString:@"pubDate"])
{
    [date appendString:string];
}
else if ([element isEqualToString:@"description"])
{
    [summary appendString:string];

}
   else if ([element isEqualToString:@"media:description"])
{
    [summary appendString:string];

}
else if ([element isEqualToString:@"link"])
{
    [link appendString:string];
}
else if ([element isEqualToString:@"url"]) {

    [imageLink appendString:string];
}
else if ([element isEqualToString:@"src"]) {

    [imageLink appendString:string];
}
else if ([element isEqualToString:@"content:encoded"]){
    NSString *imgString = [self getImage:string];
    if (imgString != nil) {
        [img appendString:imgString];
        NSLog(@"Content of img:%@", img);
    }

}
}

-(NSString *) getImage:(NSString *)htmlString {
NSString *url = nil;

NSScanner *theScanner = [NSScanner scannerWithString:htmlString];

[theScanner scanUpToString:@"<img" intoString:nil];
if (![theScanner isAtEnd]) {
    [theScanner scanUpToString:@"src" intoString:nil];
    NSCharacterSet *charset = [NSCharacterSet characterSetWithCharactersInString:@"\"'"];
    [theScanner scanUpToCharactersFromSet:charset intoString:nil];
    [theScanner scanCharactersFromSet:charset intoString:nil];
    [theScanner scanUpToCharactersFromSet:charset intoString:&url];

}
return url;
}

@end

1 Answer


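In your example you just have two description elements, each with an img tag embedded in it. You simply parse the description as normal and then pull out the img tags (using a regular expression, as in my retrieveImageSourceTagsViaRegex below, or using a scanner).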

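Note that you do not have to handle the CDATA and non-CDATA renditions differently if you don't want to. While NSXMLParserDelegate provides a foundCDATA routine, I would actually be inclined not to implement it. In the absence of a foundCDATA, the standard foundCharacters routine of NSXMLParser will handle both renditions of your description tag (with and without CDATA) seamlessly.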
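For comparison, here is a minimal sketch of what implementing foundCDATA would involve, should you decide you want it anyway. The class name DescriptionCollector and its properties are hypothetical and exist only for this illustration. The delegate receives the CDATA block as UTF-8 encoded NSData, so it has to be decoded before it can be appended:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate (not from the question): collects the text of every
// <description> element, whether or not it is wrapped in CDATA.
@interface DescriptionCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *descriptions;
@property (nonatomic, strong) NSMutableString *currentText;
@end

@implementation DescriptionCollector

- (void)parserDidStartDocument:(NSXMLParser *)parser
{
    self.descriptions = [NSMutableArray array];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    if ([elementName isEqualToString:@"description"])
        self.currentText = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // Messaging nil is a no-op, so text outside <description> is ignored.
    [self.currentText appendString:string];
}

// Optional. If this method is removed, the parser reports CDATA content
// through parser:foundCharacters: instead and the result is the same.
- (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock
{
    NSString *text = [[NSString alloc] initWithData:CDATABlock
                                           encoding:NSUTF8StringEncoding];
    if (text)
        [self.currentText appendString:text];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementName isEqualToString:@"description"]) {
        [self.descriptions addObject:[self.currentText copy]];
        self.currentText = nil;
    }
}

@end
```

Since the extra method only reproduces what foundCharacters already does, leaving it out is the simpler choice unless you need to treat CDATA blocks differently.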

Consider the following hypothetical XML:

<xml>
    <descriptions>
        <description><![CDATA[ <p>Roger Craig Smith and Troy Baker to play Batman and the Joker respectively in upcoming action game; Deathstroke confirmed as playable character. </p><p><img src="http://image.com.com/gamespot/images/2013/139/ArkhamOrigins_29971_thumb.jpg">]]></description>
        <description>&lt;img src=&quot;http://cdn.gsmarena.com/vv/newsimg/13/05/samsung-galaxy-s4-active-photos/thumb.jpg&quot; width=&quot;70&quot; height=&quot;92&quot; hspace=&quot;3&quot; alt=&quot;&quot; border=&quot;0&quot; align=left style="background:#333333;padding:0px;margin:0px 4px 0px 0px;border-style:solid;border-color:#aaaaaa;border-width:1px" /&gt; &lt;p&gt;</description>
    </descriptions>
</xml>

The following parser will parse both description entries and grab the image URLs from them. As you will see, no special handling for CDATA is needed.

@interface ViewController () <NSXMLParserDelegate>

@property (nonatomic, strong) NSMutableString *description;
@property (nonatomic, strong) NSMutableArray *results;

@end

@implementation ViewController

- (void)viewDidLoad
{
    [super viewDidLoad];
    // Do any additional setup after loading the view, typically from a nib.

    NSURL *filename = [[NSBundle mainBundle] URLForResource:@"test" withExtension:@"xml"];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithContentsOfURL:filename];
    parser.delegate = self;
    [parser parse];

    // full array of dictionary entries

    NSLog(@"results = %@", self.results);
}

- (NSMutableArray *)retrieveImageSourceTagsViaRegex:(NSString *)string
{
    NSError *error = NULL;
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(<img\\s[\\s\\S]*?src\\s*?=\\s*?['\"](.*?)['\"][\\s\\S]*?>)+?"
                                                                           options:NSRegularExpressionCaseInsensitive
                                                                             error:&error];

    NSMutableArray *results = [NSMutableArray array];

    [regex enumerateMatchesInString:string
                            options:0
                              range:NSMakeRange(0, [string length])
                         usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {

                             [results addObject:[string substringWithRange:[result rangeAtIndex:2]]];
                         }];

    return results;
}

#pragma mark - NSXMLParserDelegate

- (void)parserDidStartDocument:(NSXMLParser *)parser
{
    self.results = [NSMutableArray array];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    if ([elementName isEqualToString:@"description"])
        self.description = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    if (self.description)
        [self.description appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementName isEqualToString:@"description"])
    {
        NSArray *imgTags = [self retrieveImageSourceTagsViaRegex:self.description];
        NSDictionary *result = @{@"description": self.description, @"imgs" : imgTags};
        [self.results addObject:result];
        self.description = nil;
    }
}

@end

This yields the following results (note, no CDATA):

results = (
        {
        description = " <p>Roger Craig Smith and Troy Baker to play Batman and the Joker respectively in upcoming action game; Deathstroke confirmed as playable character. </p><p><img src=\"http://image.com.com/gamespot/images/2013/139/ArkhamOrigins_29971_thumb.jpg\">";
        imgs =         (
            "http://image.com.com/gamespot/images/2013/139/ArkhamOrigins_29971_thumb.jpg"
        );
    },
        {
        description = "<img src=\"http://cdn.gsmarena.com/vv/newsimg/13/05/samsung-galaxy-s4-active-photos/thumb.jpg\" width=\"70\" height=\"92\" hspace=\"3\" alt=\"\" border=\"0\" align=left style=\"background:#333333;padding:0px;margin:0px 4px 0px 0px;border-style:solid;border-color:#aaaaaa;border-width:1px\" /> <p>";
        imgs =         (
            "http://cdn.gsmarena.com/vv/newsimg/13/05/samsung-galaxy-s4-active-photos/thumb.jpg"
        );
    }
)
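Applied to a feed made up of item elements, like the one in the question, the same idea might look roughly like the sketch below. The class name FeedItemCollector, its properties, and the dictionary keys are assumptions made for illustration, not the asker's actual code. It prefers an image URL supplied in a media:thumbnail, media:content, or enclosure attribute and falls back to the first img tag found in the description:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate: builds one dictionary per <item> with the keys
// "title", "summary" and, when an image can be found, "imageLink".
@interface FeedItemCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *items;
@property (nonatomic, strong) NSMutableDictionary *currentItem;
@property (nonatomic, strong) NSMutableString *currentText;
@end

@implementation FeedItemCollector

// Returns the src value of the first <img> tag in html, or nil if none.
- (NSString *)firstImageSourceInString:(NSString *)html
{
    if (html.length == 0)
        return nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<img\\s[^>]*?src\\s*=\\s*['\"]([^'\"]*)['\"]"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSTextCheckingResult *match = [regex firstMatchInString:html
                                                    options:0
                                                      range:NSMakeRange(0, html.length)];
    return match ? [html substringWithRange:[match rangeAtIndex:1]] : nil;
}

- (void)parserDidStartDocument:(NSXMLParser *)parser
{
    self.items = [NSMutableArray array];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    if ([elementName isEqualToString:@"item"]) {
        self.currentItem = [NSMutableDictionary dictionary];
    } else if (self.currentItem) {
        // Some feeds carry the image URL in an attribute; no HTML parsing needed.
        NSString *url = attributeDict[@"url"];
        if (url && ([elementName isEqualToString:@"media:thumbnail"] ||
                    [elementName isEqualToString:@"media:content"] ||
                    [elementName isEqualToString:@"enclosure"]))
            self.currentItem[@"imageLink"] = url;
    }
    self.currentText = [NSMutableString string];   // start fresh for each element
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    [self.currentText appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    NSString *raw = self.currentText ? self.currentText : @"";
    NSString *text = [raw stringByTrimmingCharactersInSet:
                         [NSCharacterSet whitespaceAndNewlineCharacterSet]];

    if (self.currentItem && [elementName isEqualToString:@"title"]) {
        self.currentItem[@"title"] = text;
    } else if (self.currentItem && [elementName isEqualToString:@"description"]) {
        self.currentItem[@"summary"] = text;
        // Fall back to an embedded <img> only if no attribute supplied a URL.
        NSString *src = [self firstImageSourceInString:text];
        if (src && !self.currentItem[@"imageLink"])
            self.currentItem[@"imageLink"] = src;
    } else if (self.currentItem && [elementName isEqualToString:@"item"]) {
        [self.items addObject:self.currentItem];
        self.currentItem = nil;
    }
    self.currentText = nil;
}

@end
```

The work happens in didEndElement rather than in foundCharacters because foundCharacters may deliver an element's text in several pieces, so the img tag is only guaranteed to be complete once the element has ended.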

So, bottom line, just parse the XML as you normally would, don't worry about CDATA, and parse out the image URL using NSScanner or NSRegularExpression, whichever you see fit.
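If you prefer the scanner route, a minimal sketch could look like the following. The function name ImageSourcesViaScanner is hypothetical. It assumes that src is the first attribute in the tag whose text contains "src", that the value is quoted, and that no attribute value contains ">", so treat it as a starting point rather than a general HTML parser:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the src value of every <img> tag found in an
// HTML fragment, in document order.
static NSArray *ImageSourcesViaScanner(NSString *html)
{
    NSMutableArray *sources = [NSMutableArray array];
    if (html.length == 0)
        return sources;

    NSCharacterSet *quotes = [NSCharacterSet characterSetWithCharactersInString:@"\"'"];
    NSScanner *scanner = [NSScanner scannerWithString:html];
    scanner.charactersToBeSkipped = nil;   // do not silently skip whitespace
    scanner.caseSensitive = NO;            // match <IMG as well as <img

    while (![scanner isAtEnd]) {
        // Advance to the next "<img" and step over it.
        [scanner scanUpToString:@"<img" intoString:NULL];
        if (![scanner scanString:@"<img" intoString:NULL])
            break;

        // Everything up to the closing ">" is the tag's attribute text.
        NSString *attributes = nil;
        [scanner scanUpToString:@">" intoString:&attributes];
        if (attributes.length == 0)
            continue;

        // A second scanner (which skips whitespace by default) reads src="...".
        NSScanner *tagScanner = [NSScanner scannerWithString:attributes];
        tagScanner.caseSensitive = NO;
        [tagScanner scanUpToString:@"src" intoString:NULL];
        if (![tagScanner scanString:@"src" intoString:NULL] ||
            ![tagScanner scanString:@"=" intoString:NULL])
            continue;

        NSString *quote = nil;
        if (![tagScanner scanCharactersFromSet:quotes intoString:&quote] ||
            quote.length != 1)
            continue;                      // unquoted or empty src value

        NSString *url = nil;
        if ([tagScanner scanUpToString:quote intoString:&url] && url.length > 0)
            [sources addObject:url];
    }
    return sources;
}
```

Calling it on the accumulated description text, for example from didEndElement, returns an array of URLs in the same way the regular-expression method does.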

Answered 2013-05-21T00:12:22.920