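You may want to take into account that the search input string isn't always as clean as you expect, and may contain punctuation, brackets, and so on.
You also want to be lenient about accents.
I like to use regular expressions for this kind of problem, and since you are looking for a solution that allows arbitrary ordering of the search terms, the search string needs to be reworked first. A regular expression can do that too, so the pattern is constructed by a regex substitution, mostly on principle. You may want to document it thoroughly.
So here is a code snippet that does these things: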
// Use the Posix locale as the lowest common denominator of locales to
// remove accents.
NSLocale *enLoc = [[NSLocale alloc] initWithLocaleIdentifier: @"en_US_POSIX"];
// Mixed bag of genres, but for testing purposes we get all the accents we need
NSString *orgString = @"Beyoncé Motörhead Händel";
// Clean the string by folding away accents, case and character width,
// using the POSIX locale
NSString *string = [orgString stringByFoldingWithOptions: NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch | NSWidthInsensitiveSearch
    locale: enLoc];
// What the user has typed in, with misplaced umlaut and all
NSString *orgSearchString = @"handel, mötorhead, beyonce";
// Clean the search string, too, with the same folding options
NSString *searchString = [orgSearchString stringByFoldingWithOptions: NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch | NSWidthInsensitiveSearch
    locale: enLoc];
// Turn the search string into a regex pattern.
// Create a pattern that looks like: "(?=.*handel)(?=.*motorhead)(?=.*beyonce)"
// This pattern uses positive lookahead to create an AND logic that will
// accept arbitrary ordering of the words in the pattern.
// The \b expression matches a word boundary, so gets rid of punctuation, etc.
// We use a regex to create the regex pattern.
NSString *regexifyPattern = @"(?w)(\\W*)(\\b.+?\\b)(\\W*)";
NSString *pattern = [searchString stringByReplacingOccurrencesOfString: regexifyPattern
    withString: @"(?=.*$2)"
    options: NSRegularExpressionSearch
    range: NSMakeRange(0, searchString.length)];
NSError *error = nil;
NSRegularExpression *anyOrderRegEx = [NSRegularExpression regularExpressionWithPattern: pattern
    options: 0
    error: &error];
if ( !anyOrderRegEx ) {
    // Regex patterns are tricky, programmatically constructed ones even more.
    // So we check if it went well and do something intelligent if it didn't
    // ...
}
// Match the constructed pattern with the string
NSUInteger numberOfMatches = [anyOrderRegEx numberOfMatchesInString: string
    options: 0
    range: NSMakeRange(0, string.length)];
BOOL found = (numberOfMatches > 0);
The use of the POSIX locale identifier is discussed in this tech note from Apple.
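In theory there is an edge case here if the user types characters that have a special meaning in regular expressions, but since the first regex strips out non-word characters, that should be taken care of along the way. It is a somewhat unplanned positive side effect, so it may be worth verifying.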
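If you would rather not depend on that side effect, one option is to split the folded search string into terms yourself and escape each term before wrapping it in a lookahead. The following is only a minimal sketch: `AnyOrderPattern` is a hypothetical helper name, and splitting on whitespace, commas and semicolons is an assumption about what the input looks like. The escaping itself uses `+[NSRegularExpression escapedPatternForString:]`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the snippet above): builds a pattern of the
// form "(?=.*term1)(?=.*term2)..." from an already folded search string.
// Every term is escaped, so regex metacharacters typed by the user are
// matched literally instead of being interpreted as part of the pattern.
static NSString *AnyOrderPattern(NSString *foldedSearchString) {
    // Assumption: terms are separated by whitespace, commas or semicolons.
    NSMutableCharacterSet *separators = [NSMutableCharacterSet whitespaceAndNewlineCharacterSet];
    [separators addCharactersInString: @",;"];

    NSMutableString *pattern = [NSMutableString string];
    for (NSString *term in [foldedSearchString componentsSeparatedByCharactersInSet: separators]) {
        if (term.length == 0) {
            continue; // empty pieces appear between adjacent separators
        }
        NSString *escaped = [NSRegularExpression escapedPatternForString: term];
        [pattern appendFormat: @"(?=.*%@)", escaped];
    }
    return pattern;
}
```

The result can be passed to `regularExpressionWithPattern:options:error:` in the same way as the pattern built by substitution. Note that an empty search string yields an empty pattern, which matches everything, so you may want to handle that case separately.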
If you aren't interested in a regex-based solution, the folding code may still be useful for "normal" NSString-based searching.
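As a rough illustration of that non-regex route, the sketch below folds both strings the same way and then checks each search term with a plain substring search. `ContainsAllTerms` is a hypothetical helper name, and treating every non-alphanumeric character as a term separator is an assumption, not something the question prescribes.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns YES when every term of `search` occurs
// somewhere in `text`, ignoring case, accents and character width.
static BOOL ContainsAllTerms(NSString *text, NSString *search) {
    NSLocale *posix = [[NSLocale alloc] initWithLocaleIdentifier: @"en_US_POSIX"];
    NSStringCompareOptions folding = NSCaseInsensitiveSearch
                                   | NSDiacriticInsensitiveSearch
                                   | NSWidthInsensitiveSearch;
    NSString *foldedText = [text stringByFoldingWithOptions: folding locale: posix];
    NSString *foldedSearch = [search stringByFoldingWithOptions: folding locale: posix];

    // Assumption: anything that is not a letter or digit separates terms.
    NSCharacterSet *separators = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
    for (NSString *term in [foldedSearch componentsSeparatedByCharactersInSet: separators]) {
        if (term.length == 0) {
            continue; // skip empty pieces between adjacent separators
        }
        if ([foldedText rangeOfString: term].location == NSNotFound) {
            return NO; // one missing term is enough to fail the AND search
        }
    }
    return YES;
}
```

With the test data from the snippet, `ContainsAllTerms(@"Beyoncé Motörhead Händel", @"handel, mötorhead, beyonce")` would be the equivalent call. Unlike the regex version, this matches terms as substrings, and an empty search string is treated as a match.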