I have a file in Spanish, so it is full of characters like these:

 á é í ó ú ñ Ñ Á É Í Ó Ú 

I have to read the file, so I do this:

fr = new FileReader(ficheroEntrada);
BufferedReader rEntrada = new BufferedReader(fr);

String linea = rEntrada.readLine();
if (linea == null) {
    logger.error("ERROR: Empty file.");
    return null;
}
String delimitador = "[;]";
String[] tokens = null;

List<String> token = new ArrayList<String>();
while ((linea = rEntrada.readLine()) != null) {
    // Some parsing specific to my file. 
    tokens = linea.split(delimitador);
    token.add(tokens[0]);
    token.add(tokens[1]);
}
logger.info("List of tokens: " + token);
return token;

When I read the list of tokens, all the special characters are gone and have been replaced by characters like these:

Ó = Ó
Ñ = Ñ

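and so on...

What is happening? I have never had charset problems before (I assume this is a charset problem). Is it because of this computer? What can I do?

Any additional advice would be much appreciated, I am learning! Thanks!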

5 Answers

5

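You need to specify the character encoding explicitly. FileReader always decodes with the platform default encoding, which may not match the file.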

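// Open the file itself (not the FileReader) as a byte stream and decode it as UTF-8.
BufferedReader rEntrada = new BufferedReader(
    new InputStreamReader(new FileInputStream(ficheroEntrada), "UTF-8"));
answered 2012-11-21T15:00:13.300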
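
To see why the characters get mangled, here is a minimal sketch. It assumes the file is really UTF-8 and is being decoded with a single-byte charset such as ISO-8859-1; the question does not confirm which encodings are involved. The class and method names (MojibakeDemo, misread) are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {

    /** Encodes text as UTF-8, then decodes those bytes with the given charset. */
    public static String misread(String text, Charset decodeWith) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "\u00e9"; // é, written as an escape so the source file's encoding does not matter

        // é is two bytes in UTF-8 (0xC3 0xA9); a single-byte charset turns each byte into its own character.
        System.out.println(misread(original, StandardCharsets.ISO_8859_1)); // two characters: Ã©

        // Decoding with the charset the bytes were written in gives the original back.
        System.out.println(misread(original, StandardCharsets.UTF_8)); // é
    }
}
```

If the output you see has two odd characters per accented letter, that pattern is a good hint that a UTF-8 file is being decoded as a single-byte charset.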
4
answered 2012-11-21T15:42:28.720
2

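Your default encoding does not match the file. You probably need to read it as UTF-8 or Latin-1. See this snippet for setting the encoding of a stream. See also Java, default encoding.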

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

public class Program {

    public static void main(String... args) {

        if (args.length != 2) {
            System.err.println("Usage: java Program <input file> <output file>");
            return;
        }

        // try-with-resources closes both streams even if an exception is thrown.
        // Closing a BufferedReader/BufferedWriter also closes its underlying stream.
        try (BufferedReader fin = new BufferedReader(new InputStreamReader(
                     new FileInputStream(args[0]), StandardCharsets.UTF_8));
             BufferedWriter fout = new BufferedWriter(new OutputStreamWriter(
                     new FileOutputStream(args[1]), StandardCharsets.UTF_8))) {
            String s;
            while ((s = fin.readLine()) != null) {
                fout.write(s);
                fout.newLine();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
answered 2012-11-21T14:59:23.317
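
If you want to check which encoding your JVM falls back to when none is given, a small sketch like the following prints it. The class name DefaultCharsetCheck is made up for illustration. As far as I know, JDK 18 and later default to UTF-8 regardless of the operating system, while older versions take the default from the OS locale, which would explain why the same code behaves differently on another computer.

```java
import java.nio.charset.Charset;

class DefaultCharsetCheck {

    /** Describes the charset the JVM uses when code does not name one. */
    public static String describeDefault() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // FileReader(String) and InputStreamReader(InputStream) decode with this charset.
        System.out.println(describeDefault());
    }
}
```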
2

In my experience, text files like this should be read and written using the Western European encoding, ISO-8859-1.

BufferedReader rEntrada = new BufferedReader(new InputStreamReader(new FileInputStream(ficheroEntrada), "ISO-8859-1"));

answered 2012-11-21T15:07:45.537
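
If you are not sure whether the file is UTF-8 or ISO-8859-1, one rough heuristic is to try a strict UTF-8 decode and fall back to ISO-8859-1 when it fails. This is only a sketch under that assumption, not a reliable detector: ISO-8859-1 text can occasionally happen to be valid UTF-8, and other encodings are not considered. The names EncodingGuess, isValidUtf8 and decodeGuessing are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class EncodingGuess {

    /** Returns true if the bytes are well-formed UTF-8. */
    public static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Decodes as UTF-8 when the bytes are valid UTF-8, otherwise as ISO-8859-1. */
    public static String decodeGuessing(byte[] bytes) {
        return isValidUtf8(bytes)
                ? new String(bytes, StandardCharsets.UTF_8)
                : new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xD1};            // Ñ in ISO-8859-1
        byte[] utf8 = {(byte) 0xC3, (byte) 0x91}; // Ñ in UTF-8
        System.out.println(isValidUtf8(latin1));  // false: 0xD1 alone is an incomplete UTF-8 sequence
        System.out.println(isValidUtf8(utf8));    // true
        System.out.println(decodeGuessing(latin1).equals(decodeGuessing(utf8))); // true
    }
}
```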
0

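The other answers point you in the right direction. I just wanted to add that Guava, with its Files.newReader(File, Charset) helper method, makes creating such a BufferedReader very readable (pardon the pun):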

BufferedReader rEntrada = Files.newReader(new File(ficheroEntrada), Charsets.UTF_8);
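
If you would rather not add a library, the JDK (Java 7 and later) has a similar helper, java.nio.file.Files.newBufferedReader(Path, Charset). Below is a minimal sketch that applies it to the parsing from the question. It assumes the first line is a header to skip (the question's code discards it too) and that the file is UTF-8; the class name TokenReader and the method readTokens are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class TokenReader {

    /** Reads a semicolon-delimited file and collects the first two fields of each data line. */
    public static List<String> readTokens(Path file, Charset charset) throws IOException {
        List<String> tokens = new ArrayList<>();
        // The reader is closed automatically, even if parsing throws.
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            if (reader.readLine() == null) {
                return tokens; // empty file: nothing to parse
            }
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(";");
                if (fields.length >= 2) { // skip lines that lack a second field
                    tokens.add(fields[0]);
                    tokens.add(fields[1]);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java TokenReader <input file>");
            return;
        }
        System.out.println("List of tokens: " + readTokens(Paths.get(args[0]), StandardCharsets.UTF_8));
    }
}
```

One difference worth knowing: this reader throws a MalformedInputException when the bytes are not valid in the given charset, instead of quietly producing replacement characters, so a wrong charset guess tends to fail loudly.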
answered 2012-11-21T15:07:50.133