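This is a question about Objective-C. I wrote a program that fetches the whole HTML of a page and searches it with a regular expression. I have uploaded the program to GitHub. However, an exception occurs.
The purpose of this program is to get the "og:image" value by regular-expression matching. This is the image that is displayed when a URL is posted on Facebook. To set this image, the HTML is written as follows: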
<meta property="og:image"
content="http://business.nikkeibp.co.jp/article/NBD/20120727/235043/zu1.jpg">
So I wrote a program that fetches the whole HTML and finds the og:image part. The code is as follows:
// Web page address
NSURL *url = [NSURL URLWithString:textField.text];
// Get the web page HTML
NSString *string =
    [NSString stringWithContentsOfURL:url encoding:NSUTF8StringEncoding error:nil];
// Prepare the regular expression used to find the tag
NSError *error = nil;
NSRegularExpression *regexp =
    [NSRegularExpression regularExpressionWithPattern:
        @"<meta property=\"og:image\" content=\".+\""
                                              options:0
                                                error:&error];
@try {
    // Find by regular expression
    NSTextCheckingResult *match =
        [regexp firstMatchInString:string options:0 range:NSMakeRange(0, string.length)];
    // Get the first result
    NSRange resultRange = [match rangeAtIndex:0];
    NSLog(@"match=%@", [string substringWithRange:resultRange]);
    if (match) {
        // Get the og:image URL from the result: skip the 35-character
        // prefix <meta property="og:image" content=" and the closing quote
        NSRange urlRange = NSMakeRange(resultRange.location + 35, resultRange.length - 35 - 1);
        NSURL *urlOgImage = [NSURL URLWithString:[string substringWithRange:urlRange]];
        imageView.image = [UIImage imageWithData:[NSData dataWithContentsOfURL:urlOgImage]];
    }
}
@catch (NSException *exception) {
    // Handler abbreviated in this excerpt; it only logs the exception here
    NSLog(@"exception=%@", exception);
}
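For comparison, here is a minimal sketch of a more defensive version of the extraction step. It is not a confirmed fix for the exception. It rests on two assumptions that are not established above: that the downloaded string can be nil (for example when the page is not valid UTF-8), and that the two attributes of the meta tag may be separated by a line break, as in the HTML sample. The helper name OgImageURLStringFromHTML is hypothetical and is not part of the program on GitHub.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not in the original program): returns the og:image
// URL string found in html, or nil when html is nil or no tag matches.
static NSString *OgImageURLStringFromHTML(NSString *html) {
    if (html == nil) {
        // Assumed case: the download failed or the data could not be decoded
        return nil;
    }
    NSError *error = nil;
    // \s+ allows whitespace or a line break between the attributes;
    // ([^"]+) captures only the URL, so no hard-coded offsets are needed.
    NSRegularExpression *regexp =
        [NSRegularExpression regularExpressionWithPattern:
            @"<meta\\s+property=\"og:image\"\\s+content=\"([^\"]+)\""
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regexp == nil) {
        NSLog(@"regex error: %@", error);
        return nil;
    }
    NSTextCheckingResult *match =
        [regexp firstMatchInString:html options:0 range:NSMakeRange(0, html.length)];
    if (match == nil) {
        // No og:image tag in the expected form
        return nil;
    }
    // Capture group 1 is the content attribute value
    return [html substringWithRange:[match rangeAtIndex:1]];
}
```

The result is checked for nil before it is passed to NSURL or used in any range arithmetic, so a missing match or an undecodable page ends in a nil return rather than in further string operations.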
The complete code is on GitHub:
https://github.com/weed/p120728_GetOgImage/blob/master/GetOgImage/ViewController.m
However, this program sometimes throws an exception.
Success case: http://www.nicovideo.jp/watch/1343369790
Failure case: http://business.nikkeibp.co.jp/article/NBD/20120727/235043/?ST=pc
Screenshots are here: https://github.com/weed/p120728_GetOgImage/blob/master/readme.md
Why does the exception occur? Please let me know. Thank you for your help.