java - Why am I getting \r\r\n as the line separator instead of \r\n on Windows?

Question

I have the following readfile() Java function to read a .htm file:

private String readfile(String inputDoc) throws IOException {
    FileInputStream fis = null;
    InputStreamReader isr = null;
    String text = null;
    //open input stream to file
    fis = new FileInputStream(inputDoc);
    isr = new InputStreamReader(fis, "UTF-8");
    StringBuffer buffer = new StringBuffer();
    int c;
    while( (c = isr.read()) != -1 ) {
        buffer.append((char)c);
    }
    text = buffer.toString();
    isr.close();
    return text;
}

Here is a sample snippet of the input document:

<?xml version="1.0" encoding="utf-8"?><html>

<head>

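For some reason, the text string returned from readfile() is <?xml version="1.0" encoding="utf-8"?><html>\r\r\n<head>

But I expected it to be <?xml version="1.0" encoding="utf-8"?><html>\r\n<head>

As outlined here, the line separator on Windows is \r\n.

I ran the above function in IntelliJ IDEA on Windows 7. (IDEA's default encoding is set to UTF-8.)

Does anyone know why I am getting this strange line-break result from the readfile(String inputDoc) function?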

score 7 · Accepted Answer

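When you write \n, it is expanded to \r\n on Windows for portability. That way you get the right result without extra code no matter which OS you run on: \r\n on Windows, or just \n on Unix. It looks like you are reading the input in binary mode, so you see the \r. (In text mode the same expansion happens in reverse: any \r\n in the input becomes just \n, so again you don't have to worry about the OS.) Then, when you write the \n, it is expanded to \r\n, leaving two \r characters.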
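To check the reading half of this in Java, here is a minimal sketch. It assumes the input really contains the bytes \r\r\n and feeds them through the same kind of InputStreamReader that readfile() uses. The class and helper names (ReaderPassThrough, readAll, escape) are made up for illustration. The reader only decodes bytes into characters and leaves line endings alone, so whatever sequence is in the data shows up unchanged in the returned string.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
public class ReaderPassThrough {

    // Reads every character from the stream one at a time, like readfile() does.
    public static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    // Makes carriage returns and line feeds visible when printed.
    public static String escape(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) throws IOException {
        // Assumed input: a line that ends with CR CR LF.
        byte[] raw = "<html>\r\r\n<head>".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(raw));
        System.out.println(escape(text)); // prints <html>\r\r\n<head>
    }
}
```

If the string comes back with \r\r\n here, the doubled \r was already present in what was read, which points at whatever wrote the file rather than at the reading loop.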

score 3 · Accepted Answer

You are getting this because that is how it is in the input file. Try opening the input file in a hex editor to verify.
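If no hex editor is at hand, the same check can be sketched in Java. This is only an illustration: the HexPeek class and its toHex helper are hypothetical names, and the 64-byte limit is an arbitrary choice that you may need to raise to reach the first line break. In the output, 0D is \r and 0A is \n, so a 0D 0D 0A run means the file itself contains \r\r\n.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class for inspecting the raw bytes of a file.
public class HexPeek {

    // Formats up to 'limit' bytes as space-separated two-digit hex values.
    public static String toHex(byte[] data, int limit) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(limit, data.length);
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Usage: java HexPeek path/to/input.htm
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(toHex(bytes, 64));
    }
}
```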
