6 Answers
You can make use of the Pattern and Matcher classes here. Put all the filtered characters in a character class, and use the Matcher#find() method to check whether the pattern occurs in the string.
You can do it like this:
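import java.util.regex.Matcher;
import java.util.regex.Pattern;

public boolean containsIllegals(String toExamine) {
    // Character class listing every illegal character
    Pattern pattern = Pattern.compile("[~#@*+%{}<>\\[\\]|\"\\_^]");
    Matcher matcher = pattern.matcher(toExamine);
    // find() succeeds if the pattern occurs anywhere in the input
    return matcher.find();
}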
The find() method returns true if the given pattern is found in the string, even once.
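To make the find() behaviour concrete, here is a minimal sketch. The class name FindDemo, the helper firstIllegalIndex, and the sample strings are made up for illustration; only the character class comes from the answer above. It compiles the pattern once and also shows that Matcher#start() gives the position of the first hit.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample inputs are illustrative only.
class FindDemo {
    // Same character class as in the answer above, compiled once.
    private static final Pattern ILLEGAL =
            Pattern.compile("[~#@*+%{}<>\\[\\]|\"\\_^]");

    static boolean containsIllegals(String toExamine) {
        // find() looks for the pattern anywhere in the input.
        return ILLEGAL.matcher(toExamine).find();
    }

    // Returns the index of the first illegal character, or -1 if there is none.
    static int firstIllegalIndex(String toExamine) {
        Matcher m = ILLEGAL.matcher(toExamine);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(containsIllegals("hello world"));   // false
        System.out.println(containsIllegals("user@example"));  // true
        System.out.println(firstIllegalIndex("user@example")); // 4
    }
}
```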
Another way that has not yet been pointed out is using String#split(regex). We can split the string on the given pattern and check the length of the resulting array. If the length is 1, then the pattern was not in the string.
public boolean containsIllegals(String toExamine) {
    String[] arr = toExamine.split("[~#@*+%{}<>\\[\\]|\"\\_^]", 2);
    return arr.length > 1;
}
If arr.length > 1, the string contained at least one of the characters in the pattern, which is why it was split. I passed limit = 2 as the second parameter to split because a single split is all we need.
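A short sketch of what split returns with limit = 2 may help. The class name SplitLimitDemo and the sample inputs are hypothetical. It also shows a side effect of the positive limit: trailing empty strings are kept, so an illegal character at the very end of the input still yields two elements, whereas split without a limit would drop the trailing empty string.

```java
import java.util.Arrays;

// Hypothetical demo class; names and sample inputs are illustrative only.
class SplitLimitDemo {
    private static final String ILLEGAL_REGEX = "[~#@*+%{}<>\\[\\]|\"\\_^]";

    // Splits at most once, exactly as in the answer above.
    static String[] splitOnce(String s) {
        return s.split(ILLEGAL_REGEX, 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnce("abc")));   // [abc]
        System.out.println(Arrays.toString(splitOnce("a#b#c"))); // [a, b#c]
        System.out.println(Arrays.toString(splitOnce("abc#")));  // [abc, ]
        // Without a limit, the trailing empty string is removed:
        System.out.println("abc#".split(ILLEGAL_REGEX).length);  // 1
    }
}
```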
I need the method to scan every character in the string
If you must do it character by character, a regexp is probably not a good way to go. However, since all characters on your "blacklist" have codes less than 128, you can do it with a small boolean array:
static final boolean[] blacklist = new boolean[128];
static {
    // Unassigned elements of the array are set to false
    blacklist[(int)'~'] = true;
    blacklist[(int)'#'] = true;
    blacklist[(int)'@'] = true;
    blacklist[(int)'*'] = true;
    blacklist[(int)'+'] = true;
    ...
}
// Returns true if ch is blacklisted; codes of 128 and above are never blacklisted
static boolean isBad(char ch) {
    return (ch < 128) && blacklist[(int)ch];
}
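Put together, the lookup-table approach could look like the sketch below. The class name BlacklistScan and the method containsIllegals are hypothetical, and the table is filled from the same characters that appear in the regex of the other answers, which is an assumption about the full blacklist.

```java
// Hypothetical demo class; a sketch of the lookup-table idea, not a definitive implementation.
class BlacklistScan {
    // Lookup table indexed by character code; entries default to false.
    private static final boolean[] BLACKLIST = new boolean[128];

    static {
        // Assumption: same characters as the regex used in the other answers.
        for (char c : "~#@*+%{}<>[]|\"_^".toCharArray()) {
            BLACKLIST[c] = true;
        }
    }

    static boolean isBad(char ch) {
        // The range check keeps codes >= 128 from indexing past the table.
        return ch < 128 && BLACKLIST[ch];
    }

    // Scans every character and stops at the first blacklisted one.
    static boolean containsIllegals(String toExamine) {
        for (int i = 0; i < toExamine.length(); i++) {
            if (isBad(toExamine.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsIllegals("plain text")); // false
        System.out.println(containsIllegals("a~b"));        // true
    }
}
```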
Use a constant to avoid recompiling the regex on every validation.
private static final Pattern INVALID_CHARS_PATTERN =
        Pattern.compile("^.*[~#@*+%{}<>\\[\\]|\"\\_].*$");
And change your code to:
public boolean containsIllegals(String toExamine) {
    return INVALID_CHARS_PATTERN.matcher(toExamine).matches();
}
Compiling the pattern once avoids the cost of recompiling it on every call, which is usually the main saving when validating with a regex.
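One caveat with the matches() form: by default `.` does not match line terminators, so the whole-input pattern above cannot match when a newline sits between the start of the string and the illegal character. The sketch below compares it with a bare character class used with find(); the class name PrecompiledPatternDemo, both method names, and the find() variant are assumptions of this example, not part of the answer.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample inputs are illustrative only.
class PrecompiledPatternDemo {
    // Pattern from the answer above: matches() needs the whole input to match.
    private static final Pattern WHOLE_INPUT =
            Pattern.compile("^.*[~#@*+%{}<>\\[\\]|\"\\_].*$");

    // Alternative for comparison: the same character class, used with find().
    private static final Pattern CHAR_CLASS =
            Pattern.compile("[~#@*+%{}<>\\[\\]|\"\\_]");

    static boolean viaMatches(String s) {
        return WHOLE_INPUT.matcher(s).matches();
    }

    static boolean viaFind(String s) {
        return CHAR_CLASS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(viaMatches("a#b"));      // true
        System.out.println(viaFind("a#b"));         // true
        // '.' stops at the newline, so the whole-input match fails here.
        System.out.println(viaMatches("line1\n#")); // false
        System.out.println(viaFind("line1\n#"));    // true
    }
}
```

If multi-line input is possible, either the find() form or compiling the original pattern with Pattern.DOTALL should avoid the mismatch.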
If you can't use a matcher, then you can do something like this, which is cleaner than a bunch of different if statements or a byte array.
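public boolean containsIllegals(String toExamine) {
    for (int i = 0; i < toExamine.length(); i++) {
        char c = toExamine.charAt(i);
        // String.contains() takes a CharSequence, so use indexOf() for a single char
        if ("~#@*+%{}<>[]|\"_^".indexOf(c) >= 0) {
            return true;
        }
    }
    return false;
}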
Try the negation of a character class containing all the blacklisted characters:
public boolean containsIllegals(String toExamine) {
    // matches() is true only when every character is legal, so negate it
    return !toExamine.matches("[^~#@*+%{}<>\\[\\]|\"\\_^]*");
}
This will return true if the string contains illegals (your original function seemed to return false in that case).
The caret ^ just to the right of the opening bracket [ negates the character class. Note that in String.matches() you don't need the anchors ^ and $ because it automatically matches the whole string.
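As a quick illustration of the whole-string behaviour, here is a minimal sketch with hypothetical names (NegatedClassDemo, onlyLegal). Matching against the negated class is true exactly when every character is legal, so a containsIllegals method has to negate that result.

```java
// Hypothetical demo class; names and sample inputs are illustrative only.
class NegatedClassDemo {
    // Zero or more characters, none of which is in the blacklist.
    private static final String LEGAL_ONLY = "[^~#@*+%{}<>\\[\\]|\"\\_^]*";

    // True when the whole string consists of legal characters only.
    static boolean onlyLegal(String s) {
        return s.matches(LEGAL_ONLY);
    }

    static boolean containsIllegals(String s) {
        return !onlyLegal(s);
    }

    public static void main(String[] args) {
        System.out.println(onlyLegal("abc"));        // true
        System.out.println(onlyLegal("a#c"));        // false
        System.out.println(containsIllegals("a#c")); // true
        // matches() is implicitly anchored: a legal prefix is not enough.
        System.out.println("abc#".matches("[a-z]*")); // false
    }
}
```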
A pretty compact way of doing this would be to rely on the String.replaceAll method:
public boolean containsIllegal(final String toExamine) {
    return toExamine.length() != toExamine.replaceAll(
            "[~#@*+%{}<>\\[\\]|\"\\_^]", "").length();
}
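The same length comparison can also tell you how many illegal characters were present. A minimal sketch, with a hypothetical class name (ReplaceAllDemo) and helper (countIllegals):

```java
// Hypothetical demo class; names and sample inputs are illustrative only.
class ReplaceAllDemo {
    private static final String ILLEGAL_REGEX = "[~#@*+%{}<>\\[\\]|\"\\_^]";

    // Each illegal character is replaced by nothing, so the length
    // difference equals the number of illegal characters.
    static int countIllegals(String toExamine) {
        return toExamine.length() - toExamine.replaceAll(ILLEGAL_REGEX, "").length();
    }

    static boolean containsIllegal(String toExamine) {
        return countIllegals(toExamine) > 0;
    }

    public static void main(String[] args) {
        System.out.println(countIllegals("a#b@c"));  // 2
        System.out.println(containsIllegal("abc"));  // false
    }
}
```

Note that replaceAll builds a new string and scans the whole input, so this trades some efficiency for brevity compared with stopping at the first hit.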