32

I have a string that contains a � character, and I can't get it replaced correctly.

String.replace("�", "");

doesn't work. Does anyone know how to remove/replace the � in the string?

4

10 Answers

40

That's the Unicode replacement character, \uFFFD. (info)

Something like this should work:

String strImport = "For some reason my �double quotes� were lost.";
strImport = strImport.replaceAll("\uFFFD", "\"");
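For a literal character like this, `String.replace` does the same job without going through the regex engine. Here is a minimal sketch of that variant; the class and method names are made up for illustration. Strings are immutable, so the result has to be assigned or returned, which is one reason a bare `replace(...)` call can look like it "does nothing".

```java
// Illustrative sketch: class and method names are hypothetical.
final class ReplacementCharDemo {
    // U+FFFD is the Unicode replacement character.
    static final char REPLACEMENT = '\uFFFD';

    // Returns a copy of the input with every U+FFFD swapped for the given text.
    static String replaceReplacementChar(String input, String substitute) {
        // String.replace treats both arguments as literal text, not as a regex.
        return input.replace(String.valueOf(REPLACEMENT), substitute);
    }

    public static void main(String[] args) {
        String imported = "my \uFFFDdouble quotes\uFFFD were lost";
        // Swap the replacement character for a double quote.
        System.out.println(replaceReplacementChar(imported, "\""));
        // Or drop it entirely.
        System.out.println(replaceReplacementChar(imported, ""));
    }
}
```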
answered 2009-09-28T21:49:34.747
17

Character issues like this are difficult to diagnose because information is easily lost through misinterpretation of characters via application bugs, misconfiguration, cut'n'paste, etc.

As I (and apparently others) see it, you've pasted three characters:

codepoint   glyph   escaped    windows-1252    info
=======================================================================
U+00ef      ï       \u00ef     ef,             LATIN_1_SUPPLEMENT, LOWERCASE_LETTER
U+00bf      ¿       \u00bf     bf,             LATIN_1_SUPPLEMENT, OTHER_PUNCTUATION
U+00bd      ½       \u00bd     bd,             LATIN_1_SUPPLEMENT, OTHER_NUMBER

To identify the character, download and run the program from this page. Paste your character into the text field and select the glyph mode; paste the report into your question. It'll help people identify the problematic character.
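As a rough illustration of why one character can show up as those three: the UTF-8 encoding of U+FFFD is the bytes EF BF BD, and reading those bytes with a single-byte charset yields exactly the code points in the table. The sketch below uses ISO-8859-1 as a stand-in for windows-1252 (the two agree on these three bytes); the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch: class and method names are hypothetical.
final class MojibakeDemo {
    // Encodes text as UTF-8, then (incorrectly) decodes the bytes as ISO-8859-1.
    static String misdecodeUtf8AsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String garbled = misdecodeUtf8AsLatin1("\uFFFD");
        // Prints one line per character: U+00EF, U+00BF, U+00BD.
        for (int i = 0; i < garbled.length(); i++) {
            System.out.printf("U+%04X%n", (int) garbled.charAt(i));
        }
    }
}
```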

answered 2009-09-28T21:08:00.353
11

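You asked to replace the character "�", but to me it shows up as three characters: "ï", "¿" and "½". That might be your problem... If you're using a Java version prior to Java 1.5, you only get UCS-2, which covers just the first 65,536 Unicode code points (the Basic Multilingual Plane). Based on the other comments, the character you're looking for is most likely "�", the Unicode replacement character. This is the character "used to replace an incoming character whose value is unknown or unrepresentable in Unicode".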

Actually, looking at Kathy's comment, another problem you might be having is that javac isn't interpreting your .java file as UTF-8, assuming you wrote it in UTF-8. Try using:

javac -encoding UTF-8 xx.java

Or, modify your source code to do:

String.replaceAll("\uFFFD", "");
answered 2009-09-28T19:30:14.420
6

As others have said, you posted 3 characters instead of one. I suggest you run this little snippet of code to see what's actually in your string:

public static void dumpString(String text)
{
    for (int i=0; i < text.length(); i++)
    {
        System.out.println("U+" + Integer.toString(text.charAt(i), 16) 
                           + " " + text.charAt(i));
    }
}

If you post the results, it'll be easier to work out what's going on. (I haven't bothered padding the string - we can do that by inspection...)
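If padded output is wanted after all, a variant along these lines might help. It is only a sketch, assuming Java 8 or later for `String.codePoints()`; it walks code points rather than `char` values, so a character outside the BMP is reported once instead of as two surrogates. The class and method names are hypothetical.

```java
// Illustrative sketch: class and method names are hypothetical.
final class CodePointDump {
    // Formats every code point as a zero-padded "U+XXXX", separated by spaces.
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // A plain letter followed by the replacement character.
        System.out.println(describe("a\uFFFD"));
    }
}
```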

answered 2009-09-28T19:38:42.810
1

profilage bas� sur l'analyse de l'esprit (French)

should read:

profilage basé sur l'analyse de l'esprit

So, in this case, � = é
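One plausible way to end up with exactly this (an assumption, since the answer doesn't say where the text came from) is that the bytes were ISO-8859-1 but got decoded as UTF-8: the lone byte E9 is not valid UTF-8, so the decoder substitutes U+FFFD. A small sketch, with hypothetical class and method names:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch: class and method names are hypothetical.
final class WrongCharsetDemo {
    // Decodes raw bytes with the given charset; malformed input becomes U+FFFD.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "é" is the single byte E9 in ISO-8859-1.
        byte[] latin1 = "profilage bas\u00E9 sur l'analyse de l'esprit"
                .getBytes(StandardCharsets.ISO_8859_1);

        String wrong = decode(latin1, StandardCharsets.UTF_8);
        String right = decode(latin1, StandardCharsets.ISO_8859_1);

        // The wrong charset produces a replacement character; the right one does not.
        System.out.println(wrong.indexOf('\uFFFD') >= 0);
        System.out.println(right.indexOf('\uFFFD') >= 0);
    }
}
```

Once the text has been decoded this way the original byte is gone, so the � cannot be turned back into é after the fact; the fix has to happen at the point where the bytes are decoded.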

answered 2019-08-18T15:29:43.283
1

Change the encoding to UTF-8 when parsing. That will get rid of the special characters.
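In case it helps, here is a sketch of what "use UTF-8 when parsing" could look like for a stream, assuming the incoming bytes really are UTF-8 (if they are not, this produces � rather than removing it). The charset is passed explicitly instead of relying on the platform default. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustrative sketch: class and method names are hypothetical.
final class Utf8ReadDemo {
    // Reads the whole stream as text, decoding the bytes explicitly as UTF-8.
    static String readUtf8(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a downloaded document: UTF-8 bytes held in memory.
        byte[] utf8 = "bas\u00E9".getBytes(StandardCharsets.UTF_8);
        String text = readUtf8(new ByteArrayInputStream(utf8));
        System.out.println(text.equals("bas\u00E9"));
    }
}
```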

answered 2015-08-18T05:58:56.860
0

Details:

import java.io.UnsupportedEncodingException;

/**
 * File: BOM.java
 * 
 * Checks whether the UTF-8 BOM is present in the given string, prints the
 * string after skipping the BOM, and prints the string decoded as UTF-8 on a
 * UTF-8 console.
 */

public class BOM
{
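    // The three leading escapes are the UTF-8 BOM bytes EF BB BF as seen through ISO-8859-1.
    // Assumption: the original literal presumably began with them; without them nothing is printed.
    private final static String BOM_STRING = "\u00EF\u00BB\u00BFHello World";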
    private final static String ISO_ENCODING = "ISO-8859-1";
    private final static String UTF8_ENCODING = "UTF-8";
    private final static int UTF8_BOM_LENGTH = 3;

    public static void main(String[] args) throws UnsupportedEncodingException {
        final byte[] bytes = BOM_STRING.getBytes(ISO_ENCODING);
        if (isUTF8(bytes)) {
            printSkippedBomString(bytes);
            printUTF8String(bytes);
        }
    }

    private static void printSkippedBomString(final byte[] bytes) throws UnsupportedEncodingException {
        int length = bytes.length - UTF8_BOM_LENGTH;
        byte[] barray = new byte[length];
        System.arraycopy(bytes, UTF8_BOM_LENGTH, barray, 0, barray.length);
        System.out.println(new String(barray, ISO_ENCODING));
    }

    private static void printUTF8String(final byte[] bytes) throws UnsupportedEncodingException {
        System.out.println(new String(bytes, UTF8_ENCODING));
    }

    // Returns true when the bytes start with the UTF-8 byte order mark (EF BB BF).
    private static boolean isUTF8(byte[] bytes) {
        return bytes.length >= UTF8_BOM_LENGTH &&
            (bytes[0] & 0xFF) == 0xEF &&
            (bytes[1] & 0xFF) == 0xBB &&
            (bytes[2] & 0xFF) == 0xBF;
    }
}
answered 2015-01-07T07:24:06.080
0

Use the Unicode escape sequence. First you have to find the code point of the character you want to replace (let's say it's ABCD in hex):

str = str.replaceAll("\uABCD", "");
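If the code point is only known at run time, or lies above U+FFFF where a single `\uXXXX` escape is not enough, something like the following sketch might do. It builds the target string from the numeric code point and removes it with a literal `String.replace`. The class and method names are hypothetical.

```java
// Illustrative sketch: class and method names are hypothetical.
final class RemoveByCodePoint {
    // Removes every occurrence of the given Unicode code point from the text.
    static String remove(String text, int codePoint) {
        // Character.toChars yields one char, or a surrogate pair above U+FFFF.
        String target = new String(Character.toChars(codePoint));
        return text.replace(target, "");
    }

    public static void main(String[] args) {
        String cleaned = remove("a\uFFFDb", 0xFFFD);
        System.out.println(cleaned);
    }
}
```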
answered 2009-09-28T19:40:15.857
0

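Dissect the URL encoding and the Unicode error. This symbol also shows up in Google Translate output for Armenian text, and sometimes for broken Burmese text.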

answered 2016-05-06T20:11:21.700
-2

None of the answers above solved my problem. When I download the XML, it adds <xml to my XML. I simply did:

xml = parser.getXmlFromUrl(url);

xml = xml.substring(3); // removes the first three characters from the string

Now it works correctly.
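A blind `substring(3)` also chops three characters off input that has no such prefix. If the unwanted prefix is a byte order mark (an assumption; the answer doesn't say exactly what was prepended), a guarded version might look like this sketch. The class and method names are hypothetical.

```java
// Illustrative sketch: class and method names are hypothetical.
final class BomStripper {
    // Strips a leading BOM if present; otherwise returns the text unchanged.
    static String stripBom(String text) {
        // BOM decoded correctly: a single U+FEFF character.
        if (text.startsWith("\uFEFF")) {
            return text.substring(1);
        }
        // BOM bytes EF BB BF decoded as a single-byte charset: three characters.
        if (text.startsWith("\u00EF\u00BB\u00BF")) {
            return text.substring(3);
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(stripBom("\uFEFF<root/>"));
        System.out.println(stripBom("<root/>"));
    }
}
```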

answered 2015-01-07T07:54:41.687