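Here is an example that uses regular expressions to find a substring. It looks for `href="` and then for the first quotation mark (") after it. Once those two indices are found, the string between them is returned.
Regular expressions aren't really needed in my example; you could use plain NSString methods to find the substrings instead.
This is just a hard-coded example that fits your specific case. In practice, you'd be better off using a DOM/XML parser for this kind of task.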
Also, I'm assuming you want to extract the actual URL and don't care about the surrounding tag.
Also note that this function doesn't handle the case where the string contains no href match.
    - (NSString *)stringByExtractingAnchorTagURLFromString:(NSString *)dom {
        NSError *error = nil;

        // Find the opening `href="` of the first href attribute (case-insensitive)
        NSRegularExpression *firstRegexp = [NSRegularExpression regularExpressionWithPattern:@"href=\"" options:NSRegularExpressionCaseInsensitive error:&error];
        NSTextCheckingResult *firstResult = [firstRegexp firstMatchInString:dom options:0 range:NSMakeRange(0, [dom length])];
        // The URL starts immediately after the matched `href="`
        NSUInteger startIndex = firstResult.range.location + firstResult.range.length;

        // Find the first quote (") character after the href=
        NSRegularExpression *secondRegexp = [NSRegularExpression regularExpressionWithPattern:@"\"" options:0 error:&error];
        NSTextCheckingResult *secondResult = [secondRegexp firstMatchInString:dom options:0 range:NSMakeRange(startIndex, [dom length] - startIndex)];
        NSUInteger endIndex = secondResult.range.location;

        // The URL is the string between these two found locations
        return [dom substringWithRange:NSMakeRange(startIndex, endIndex - startIndex)];
    }
This is how I tested it:
    NSString *dom = @"<div style=\"clear:both;\"></div><div style=\"float:left;\"><div style=\"float:left; height:27px; font-size:13px; padding-top:2px;\"><div style=\"float:left;\"><a href=\"http://www.hulkshare.com/ap-nxy2n2wn7ke8.mp3\" rel=\"nofollow\" target=\"_blank\" style=\"color:green;\">Download</a></div>";
    NSString *result = [self stringByExtractingAnchorTagURLFromString:dom];
    NSLog(@"Result: %@", result);
The test prints:

    Result: http://www.hulkshare.com/ap-nxy2n2wn7ke8.mp3
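As mentioned at the top, the regular expressions aren't strictly needed here. A minimal sketch of the same lookup using only NSString range searches follows; it also returns nil when no href (or no closing quote) is found, which the method above does not do. The function name `FirstHrefURLInString` is a hypothetical helper made up for this illustration, and it assumes double-quoted attribute values, as in the sample input.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original answer): returns the value of
// the first double-quoted href attribute in `dom`, or nil if there is none.
static NSString *FirstHrefURLInString(NSString *dom) {
    // Locate the opening `href="`, ignoring case.
    NSRange start = [dom rangeOfString:@"href=\"" options:NSCaseInsensitiveSearch];
    if (start.location == NSNotFound) {
        return nil; // no href attribute in the string
    }

    // The URL begins right after the matched `href="`.
    NSUInteger startIndex = NSMaxRange(start);

    // Look for the closing quote only in the remainder of the string.
    NSRange rest = NSMakeRange(startIndex, dom.length - startIndex);
    NSRange end = [dom rangeOfString:@"\"" options:0 range:rest];
    if (end.location == NSNotFound) {
        return nil; // attribute value is never terminated
    }

    return [dom substringWithRange:NSMakeRange(startIndex, end.location - startIndex)];
}
```

Calling `FirstHrefURLInString(dom)` with the test string above should give the same hulkshare URL, while a string without any href yields nil instead of raising a range exception.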
Update: multiple HREFs
For multiple hrefs, use this function instead. It returns an NSArray of NSString objects containing the URLs:
    - (NSArray *)anchorTagURLsFromString:(NSString *)dom {
        NSError *error = nil;
        NSMutableArray *urls = [NSMutableArray array];

        // First find all matching hrefs in the dom
        NSRegularExpression *firstRegexp = [NSRegularExpression regularExpressionWithPattern:@"href=\"" options:NSRegularExpressionCaseInsensitive error:&error];
        NSArray *matches = [firstRegexp matchesInString:dom options:0 range:NSMakeRange(0, [dom length])];

        // The closing-quote pattern is the same for every match, so build it once
        NSRegularExpression *secondRegexp = [NSRegularExpression regularExpressionWithPattern:@"\"" options:0 error:&error];

        // Go through all matches and extract the URL
        for (NSTextCheckingResult *match in matches) {
            NSUInteger startIndex = match.range.location + match.range.length;

            // Find the first quote (") character after the href=
            NSTextCheckingResult *secondResult = [secondRegexp firstMatchInString:dom options:0 range:NSMakeRange(startIndex, [dom length] - startIndex)];
            if (secondResult == nil) {
                continue; // no closing quote, so skip this unterminated href
            }
            NSUInteger endIndex = secondResult.range.location;

            [urls addObject:[dom substringWithRange:NSMakeRange(startIndex, endIndex - startIndex)]];
        }

        return urls;
    }
This is how I tested it:
    NSString *dom2 = @"<div style=\"clear:both;\"></div><div style=\"float:left;\"><div style=\"float:left; height:27px; font-size:13px; padding-top:2px;\"><div style=\"float:left;\"><a href=\"http://www.hulkshare.com/ap-nxy2n2wn7ke8.mp3\" rel=\"nofollow\" target=\"_blank\" style=\"color:green;\">Download</a><a href=\"http://www.google.com/blabla\" rel=\"nofollow\" target=\"_blank\" style=\"color:green;\">Download</a></div>";
    NSArray *urls = [self anchorTagURLsFromString:dom2];
    for (NSString *url in urls) {
        NSLog(@"URL: %@", url);
    }
This is the output of the test:

    URL: http://www.hulkshare.com/ap-nxy2n2wn7ke8.mp3
    URL: http://www.google.com/blabla
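For what it's worth, the two-step search (find `href="`, then find the closing quote) can probably be collapsed into a single pattern with a capture group. The sketch below is one way to do that; `HrefURLsInString` is a hypothetical name chosen for this example, and like the code above it assumes double-quoted attribute values. It is still not a substitute for a real DOM/XML parser.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original answer): collects the values
// of all double-quoted href attributes in `dom` using one regular expression.
static NSArray *HrefURLsInString(NSString *dom) {
    NSError *error = nil;
    // Group 1 captures everything between `href="` and the next quote.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"href=\"([^\"]*)\""
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        return @[]; // the pattern failed to compile; `error` describes why
    }

    NSMutableArray *urls = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:dom
                                      options:0
                                        range:NSMakeRange(0, dom.length)];
    for (NSTextCheckingResult *match in matches) {
        // rangeAtIndex:1 is the range of the first capture group (the URL).
        [urls addObject:[dom substringWithRange:[match rangeAtIndex:1]]];
    }
    return urls;
}
```

Because the closing quote is part of the pattern, an href that is never terminated simply doesn't match, and a string without any href yields an empty array.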