I wrote a small program that reads a file, finds a certain string, replaces it, and writes the result to a new file. Here is my code:
    public static void main(String[] args) {
        String line;
        try {
            FileInputStream fstream = new FileInputStream("a.xml");
            BufferedInputStream bis = new BufferedInputStream(fstream);
            DataInputStream in = new DataInputStream(bis);
            Pattern p = Pattern.compile("someregex");
            StringBuilder content = new StringBuilder();
            while (in.available() != 0) {
                line = in.readLine();
                Matcher matcher = p.matcher(line);
                if (matcher.find()) {
                    String filtered = matcher.group();
                    int len = filtered.length() - 8;
                    String city = filtered.substring(7, len);
                    line = line.replaceAll("someregex", city);
                    content.append(line).append("\n");
                } else {
                    content.append(line).append("\n");
                }
            }
            in.close();
            BufferedWriter out = new BufferedWriter(new FileWriter("b.xml"));
            out.write(content.toString());
            out.close();
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
        }
    }
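The problem is that the file contains some Unicode characters, and Java does not preserve them. I have this sentence: “可爱的槟城东方和东方之旅”. Java writes it as “可爱的槟城东方和东方之旅”. How can I preserve the Unicode characters?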
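
One likely cause is that `DataInputStream.readLine()` is deprecated precisely because it does not convert bytes to characters correctly, and `new FileWriter("b.xml")` encodes with the platform default charset rather than one chosen for the file. Below is a minimal sketch of the same read, replace, and write flow with an explicit charset on both sides. It assumes the file is UTF-8. The class name `Utf8Rewrite`, the `rewrite` helper, and the `<title>(.*?)</title>` pattern are hypothetical stand-ins for illustration (the real regex is not shown above), and the capture group replaces the `substring(7, len)` arithmetic.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Utf8Rewrite {

    // Copies source to target line by line, decoding and encoding with the
    // given charset. Every match of the pattern is replaced by the text
    // captured in its group 1; lines without a match are copied unchanged.
    static void rewrite(Path source, Path target, Pattern pattern, Charset charset)
            throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(source, charset);
             BufferedWriter writer = Files.newBufferedWriter(target, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher matcher = pattern.matcher(line);
                writer.write(matcher.replaceAll("$1"));
                writer.write('\n');
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical pattern: the asker's actual regex is unknown.
        Pattern pattern = Pattern.compile("<title>(.*?)</title>");
        // Assumption: a.xml is encoded as UTF-8.
        rewrite(Paths.get("a.xml"), Paths.get("b.xml"), pattern, StandardCharsets.UTF_8);
    }
}
```

If the file is in a different encoding, pass that `Charset` instead. `Files.newBufferedReader` throws an `IOException` on bytes that are invalid for the chosen charset, so a wrong guess tends to fail loudly rather than silently corrupting the text.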