maven - Maven file.encoding and Charset.defaultCharset()

Question

My Maven parent POM contains:

<file.encoding>UTF-8</file.encoding>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>

I have a JUnit test containing the following code:

byte[] bytes;
System.out.println("------------------" + System.getProperty("file.encoding"));
try {
    bytes = "ü".getBytes(); // german umlaut u - two bytes in utf-8 one byte in latin-1
    System.out.println("Byte count: " + bytes.length);
    for (int i = 0; i < bytes.length; i++) {
        System.out.println(String.format("%02x", bytes[i]));
    }
} catch (Exception e) {
    e.printStackTrace();
}
System.out.println("------------------" + Charset.defaultCharset());

When I run mvn clean test (on my Windows machine, where the default charset is Cp1252), the output is:

------------------Cp1252
Byte count: 1
fc
------------------windows-1252

When I run mvn -Dfile.encoding=UTF-8 clean test, the output is:

------------------UTF-8
Byte count: 1
fc
------------------windows-1252

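Now I have two questions:

1) What is the <file.encoding> property in my POM good for?

2) When I specify -Dfile.encoding=UTF-8, why does the default charset not change to UTF-8 (so that getBytes() still uses 'cp1252' and returns 1 byte), and how can I change it?

Thanks in advance,

Ronald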

score 0 · Accepted Answer

The editor has to be set to the same encoding as well. Apparently you saved the file in Cp1252. Check with JEdit or Notepad++.

getBytes("UTF-8"); // 2
getBytes("Cp1252"); // 1
getBytes(); // Depending on platform, System.getProperty("file.encoding")
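To make the byte counts above reproducible regardless of editor or platform settings, here is a minimal sketch. The class name EncodingDemo and the helper encode are made up for illustration. The character is written as a Unicode escape so the source file encoding does not matter, and ISO-8859-1 stands in for Cp1252 because both encode ü as the single byte 0xFC.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
public class EncodingDemo {
    // U+00FC (u with umlaut) as an escape, so the saved file encoding is irrelevant.
    static final String U_UMLAUT = "\u00fc";

    // Encode with an explicitly named charset instead of the platform default.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 bytes:      " + encode(U_UMLAUT, StandardCharsets.UTF_8).length);
        System.out.println("ISO-8859-1 bytes: " + encode(U_UMLAUT, StandardCharsets.ISO_8859_1).length);
        // The no-argument getBytes() depends on this value:
        System.out.println("Default charset:  " + Charset.defaultCharset());
    }
}
```

Passing the charset explicitly means the test no longer depends on file.encoding at all, which is usually the more robust choice.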

What Maven does with these properties, file.encoding in particular, I am not entirely sure.

score 0 · Accepted Answer

If you want Charset.defaultCharset() to return UTF-8, you also need to set it in the plugin's argLine, because specifying it only in the properties takes effect too late.

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <version>2.19.1</version>
    <configuration>
        <skipTests>${skip.unit.tests}</skipTests>
        <enableAssertions>true</enableAssertions>
        <argLine>${surefireArgLine} -Dfile.encoding=UTF-8</argLine>
    </configuration>
</plugin>
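The "too late" point can be illustrated in plain Java. As far as I know, the JVM determines the default charset during startup, so changing the file.encoding system property in an already running JVM changes what System.getProperty reports but not what Charset.defaultCharset() returns. That matches the output in the question. The class and method names below are made up for illustration.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: shows that setting file.encoding at runtime
// does not change the already-initialised default charset.
public class DefaultCharsetDemo {

    static boolean unchangedAfterSetProperty() {
        Charset before = Charset.defaultCharset();
        String original = System.getProperty("file.encoding");
        // Pick a value that differs from the current default.
        String other = "UTF-8".equalsIgnoreCase(before.name()) ? "ISO-8859-1" : "UTF-8";
        try {
            System.setProperty("file.encoding", other);
            // The property now reports the new value, but the charset was fixed earlier.
            return Charset.defaultCharset().equals(before);
        } finally {
            // Restore the property so the demo has no lasting side effect.
            if (original != null) {
                System.setProperty("file.encoding", original);
            } else {
                System.clearProperty("file.encoding");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("Default charset unchanged: " + unchangedAfterSetProperty());
    }
}
```

This is why the option has to be on the command line of the forked test JVM, which is what argLine does.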
