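I have an HTML web page that contains a lot of images and live content. I need to parse the data from the web page (HTML) and display it in an iPhone application. I am using the code below to parse the HTML content, but I don't know how to parse a child tag nested inside another tag.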

{
    NSURL *url = [NSURL URLWithString:@"http://www.samplewebpage.com/vd/t/1/830.html"];

    // Synchronous download; this blocks the calling thread
    NSData *data = [[NSData alloc] initWithContentsOfURL:url];
    NSString *responseString = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
    NSLog(@"Response : %@", responseString);

    NSMutableArray *imageURLArray = [[NSMutableArray alloc] init];
    NSMutableArray *divClassArray = [[NSMutableArray alloc] init];
    //NSString *regexStr = @"<A HREF=\"([^>]*)\">";

    // For image : 1. img src=\"([^>]*)\"  2. <img src=\"([^>]*)\">
    // For getting the div class; [^\"]* stops at the closing quote of the attribute
    NSString *regularExpressString = @"div class=\"([^\"]*)\"";
    NSError *error = nil;

    // Compile the regex once, outside the loop
    NSRegularExpression *testRegularExpress = [NSRegularExpression regularExpressionWithPattern:regularExpressString options:NSRegularExpressionCaseInsensitive error:&error];
    if (testRegularExpress == nil)
    {
        NSLog(@"Error making regex: %@", error);
    }

    NSUInteger i = 0;
    while (i < [responseString length])
    {
        NSTextCheckingResult *textCheckingResult = [testRegularExpress firstMatchInString:responseString options:0 range:NSMakeRange(i, [responseString length] - i)];
        if (textCheckingResult == nil)
        {
            // No more matches (also reached if the regex failed to compile)
            break;
        }
        NSRange range = [textCheckingResult rangeAtIndex:1];
        NSString *classNameString = [responseString substringWithRange:range];
        NSLog(@"Div Class Name : %@", classNameString);

        [divClassArray addObject:classNameString];

        // Continue searching after the end of the whole match
        i = NSMaxRange([textCheckingResult range]);
    }

    NSLog(@"divClass Array : %@, Count : %lu", divClassArray, (unsigned long)[divClassArray count]);
}

Response:

<div class="phoneModelItems" style="width:30%;margin-right:4px;"><a href="javascript:noxLatestChart.navToLatestChart('latest');" style="font-weight:italic;">Nokia Model</a></div>

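I want to get the text "Nokia Model" from the div with the phoneModelItems class. Can you tell me how to retrieve the text "Nokia Model"? Thanks in advance.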

1 Answer

Here is my regular expression for your problem:

<div\sclass=\"phoneModelItems\".*?><a\shref.*?>(.*?)<\/a><\/div>

You can test it on Rubular.
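For what it's worth, here is a minimal sketch of how that pattern might be used with NSRegularExpression. The function name PhoneModelNamesInHTML is hypothetical (it is not from the question), the backslashes are doubled because the pattern sits inside an Objective-C string literal, and the `\/` escapes are dropped since `/` needs no escaping here. It assumes each div and its link sit on a single line, as in the sample response, because `.` does not match newlines by default.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns the link text of
// every <div class="phoneModelItems"> block found in `html`.
static NSArray *PhoneModelNamesInHTML(NSString *html)
{
    // Same pattern as above, written as an Objective-C string literal.
    NSString *pattern = @"<div\\sclass=\"phoneModelItems\".*?><a\\shref.*?>(.*?)</a></div>";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    NSMutableArray *names = [NSMutableArray array];
    if (regex == nil)
    {
        NSLog(@"Error making regex: %@", error);
        return names;
    }

    // Collect every match in one call instead of advancing an index by hand.
    NSArray *matches = [regex matchesInString:html
                                      options:0
                                        range:NSMakeRange(0, [html length])];
    for (NSTextCheckingResult *match in matches)
    {
        // Capture group 1 is the text between <a ...> and </a>.
        NSRange textRange = [match rangeAtIndex:1];
        if (textRange.location != NSNotFound)
        {
            [names addObject:[html substringWithRange:textRange]];
        }
    }
    return names;
}
```

Passing the responseString from the question to this function should give an array with one entry per phoneModelItems div. A regex like this is tied to the exact markup, so it will stop matching if the page adds attributes before class or nests extra tags inside the link.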

Answered 2012-06-12T12:56:48.343