By default, the regex \p{Punct} matches only US-ASCII punctuation; it covers Unicode punctuation only if you enable Unicode character classes (Pattern.UNICODE_CHARACTER_CLASS, or the embedded flag (?U)). That means that your code, as written, would only remove these characters:
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
If you want to match everything the Unicode Consortium classifies as punctuation, try \p{IsPunctuation} instead, which always checks Unicode character properties and matches all the punctuation in your example (and more!).
To replace whitespace as well as punctuation, like in your example, you would use:
line = line.replaceAll("\\p{IsPunctuation}|\\p{IsWhite_Space}", "");
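To see the difference between the two classes, here is a minimal sketch. The class name, method names, and sample string are made up for illustration and are not taken from the question. One caveat worth checking: ASCII characters that Unicode classifies as symbols rather than punctuation (such as + or $) are matched by the default \p{Punct} but not by \p{IsPunctuation}.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample text are illustrative only.
public class PunctuationDemo {

    // Default \p{Punct}: POSIX class, US-ASCII punctuation only.
    static String stripAsciiPunct(String s) {
        return s.replaceAll("\\p{Punct}", "");
    }

    // Same \p{Punct}, but compiled with Unicode character classes enabled.
    static String stripPunctWithUnicodeFlag(String s) {
        return Pattern.compile("\\p{Punct}", Pattern.UNICODE_CHARACTER_CLASS)
                      .matcher(s)
                      .replaceAll("");
    }

    // \p{IsPunctuation}: always uses the Unicode punctuation property.
    static String stripUnicodePunct(String s) {
        return s.replaceAll("\\p{IsPunctuation}", "");
    }

    public static void main(String[] args) {
        // \u201C and \u201D are curly double quotes, \u2026 is an ellipsis.
        String sample = "Hello, \u201Cworld\u201D\u2026 (test)!";

        // Curly quotes and the ellipsis survive the ASCII-only class.
        System.out.println(stripAsciiPunct(sample));

        // Both Unicode-aware variants remove them as well.
        System.out.println(stripPunctWithUnicodeFlag(sample));
        System.out.println(stripUnicodePunct(sample));

        // '+' is a Unicode math symbol, not punctuation, so it is kept here.
        System.out.println(stripUnicodePunct("a+b"));
    }
}
```

If you also need to drop symbols such as + or $, you would have to add a symbol class to the pattern yourself; the sketch above deliberately leaves that out.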
Answered 2017-11-18T13:52:42.597