I am unable to strip the Unicode control character \u0085 in Java. How can this be done?

    String teststr = "\u0000\u001f\u0085 hi \n";
    PrintStream out = new PrintStream(System.out, true, "UTF-8");
    out.println(teststr);
    String st = teststr.replaceAll("\\p{Cntrl}", "");
    out.println(st);

The character \u0085 is printed as ? and does not appear to be replaced.

1 Answer

    import java.util.regex.Pattern;

    public class TrimUtf16 {
        // Matches every character outside the 7-bit ASCII range, including U+0085.
        private static final Pattern NON_ASCII = Pattern.compile("[^\\x00-\\x7F]");

        // Replaces each non-ASCII character with a space. ASCII control
        // characters such as U+0000 and U+001F are left untouched.
        public static String trimUtf16(String test) {
            return NON_ASCII.matcher(test).replaceAll(" ");
        }

        public static void main(String[] args) {
            // U+0085 is replaced by a space
            System.out.println(trimUtf16("\u0000\u001f\u0085 hi \n"));
        }
    }
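
If the goal is to remove control characters only, rather than everything non-ASCII, a narrower pattern may fit better. In `java.util.regex.Pattern`, the POSIX class `\p{Cntrl}` is documented as `[\x00-\x1F\x7F]`, so by default it never matches \u0085. The Unicode general category `\p{Cc}` also covers the C1 range (U+0080 to U+009F). The following is a minimal sketch of that idea; the class name `ControlCharStripper` and the method name `stripControlChars` are made up for illustration and do not come from the question.

```java
import java.util.regex.Pattern;

class ControlCharStripper {
    // \p{Cc} is the Unicode general category "Control": U+0000-U+001F and
    // U+007F-U+009F. Unlike the ASCII-only \p{Cntrl}, it includes U+0085.
    private static final Pattern CONTROL = Pattern.compile("\\p{Cc}");

    // Hypothetical helper: removes every control character from the input.
    static String stripControlChars(String input) {
        return CONTROL.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String cleaned = stripControlChars("\u0000\u001f\u0085 hi \n");
        // Brackets make the remaining leading and trailing spaces visible.
        System.out.println("[" + cleaned + "]"); // prints "[ hi ]"
    }
}
```

Note that the trailing `\n` is a control character too, so it is removed along with the others. If line breaks should survive, a character class with an exclusion such as `[\\p{Cc}&&[^\\r\\n]]` would be one way to keep them.
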
answered 2013-05-07T09:49:37.650