Questions tagged [file-encodings]

131 questions

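0 votes

1 answer

796 views

c# - C#: "Swedish" characters in XPath when parsing Latin1-encoded documents

I have a set of HTML documents that I need to parse. They are encoded in Latin-1. I am using the HtmlAgilityPack for the "parsing".

I have an XPath query (containing Swedish characters) that I cannot get to work, because the encoding of the documents differs from the encoding VS stores the XPath query in?

XPath query:

The XPath query works fine in the Firefox extension XPath Checker.

c# xpath latin1 file-encodings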

2009-05-12T07:22:22.633

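0 votes

1 answer

6076 views

java - How do I set file.encoding for JUnit tests in Ant?

I am not done with file.encoding and Ant yet. How do I set file.encoding for JUnit tests in Ant? The junit Ant task does not support an encoding attribute the way the javac task does.

I tried running «ant -Dfile.encoding=UTF-8» and «ANT_OPTS="-Dfile.encoding=UTF-8" ant» without success. System.getProperty("file.encoding") in the tests still returns MacRoman.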

java ant junit file-encodings

2009-11-06T08:46:15.600

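0 votes

5 answers

9556 views

delphi - How do I set the default file format in the Delphi IDE to UTF8?

Delphi 2009 sets the default file format for new source code files to ANSI, which makes the source code platform-dependent.

Even for a new XSD file created in the IDE, which by default starts with the line

Delphi sets the file format to ANSI (this looks like a bug; for new XML and XSLT documents, UTF8 is selected by default).

Is there a hidden option to set the default file format for source code files?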

delphi utf-8 delphi-2009 file-encodings

2009-11-15T00:21:56.927

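0 votes

3 answers

30813 views

java - Java charset problem on Linux

Problem: I have a string containing special characters, which I convert to bytes and back again. The conversion works correctly on Windows, but on Linux the special characters are not converted properly. The default charset on Linux is UTF-8, as reported by Charset.defaultCharset().displayName().

However, if I run on Linux with the option -Dfile.encoding=ISO-8859-1, it works fine.

How can I make it work with the UTF-8 default charset, without setting the -D option in a Unix environment?

Edit: I am using jdk1.6.13

Edit: the code snippet works with cs = "ISO-8859-1"; or cs = "UTF-8"; on Windows but not on Linux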
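
The asker's snippet is not shown, so the following is only a minimal sketch of the round trip described above, written in Python rather than Java. The sample string is hypothetical. It illustrates the general point that encoding and decoding with the same explicitly named charset is lossless on any platform, while a mismatch between the two steps garbles non-ASCII characters.

```python
# Hypothetical sample text containing non-ASCII ("special") characters.
text = "åäö"

# Round trip with the SAME explicit charset on both sides: lossless.
round_trips = {}
for cs in ("ISO-8859-1", "UTF-8"):
    round_trips[cs] = text.encode(cs).decode(cs)

# Mismatch: bytes written as UTF-8 but read back as ISO-8859-1.
# Each two-byte UTF-8 sequence is misread as two separate Latin-1 characters.
mojibake = text.encode("UTF-8").decode("ISO-8859-1")

print(round_trips)
print(repr(mojibake))
```

If one of the two steps silently falls back to the platform default, the outcome depends on the machine, which would match the Windows versus Linux difference reported here. That diagnosis is an assumption, since the original code is missing.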

~Regards, daed

java character-encoding file-encodings

2010-01-30T15:22:56.493

0 votes

3 answers

232 views

perl - Perl and reading files with different encodings

I am using a perl script to read in a file, but I'm not sure what encoding the file is in. Basically, my file is a list of book titles, but each book has other info associated with it (author, publication date, etc). So each book title is within a discrete chunk of data for the book. So I iterate through the file line by line until I find the regular expression '/Book Title: (.*)/' and take what's in the paren. Then, I create a separate .txt file with the name of the text file being my book. However, in my unix server, when I look at the name of the file, it's actually not, for example, 'LordOfTheFlies.txt' but rather 'LordOfTheFlies^M.txt'

What is this '^M'? Is that a weird end of line encoding I'm not taking into account? I tried chomp but it doesn't seem to be working. What is the best file encoding for working with perl?
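
The ^M shown by the Unix shell is a carriage return (\r), which is what remains when a line from a file with Windows CRLF line endings has only its \n removed. A minimal sketch of the effect in Python (not the asker's Perl; the sample line is hypothetical):

```python
import re

# Hypothetical line from a file saved with Windows (CRLF) line endings.
raw_line = "Book Title: LordOfTheFlies\r\n"

# Removing only the trailing "\n" leaves the "\r" in place,
# and "." in the regular expression happily matches it.
newline_only = raw_line.rstrip("\n")
bad_name = re.search(r"Book Title: (.*)", newline_only).group(1) + ".txt"

# Stripping both "\r" and "\n" gives the intended file name.
clean_line = raw_line.rstrip("\r\n")
good_name = re.search(r"Book Title: (.*)", clean_line).group(1) + ".txt"

print(repr(bad_name))
print(repr(good_name))
```

So this looks like a line-ending issue rather than a character-encoding one; removing the trailing \r as well as the \n before building the file name should avoid it.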

perl input file-encodings

2010-03-01T07:35:27.103

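0 votes

1 answer

823 views

ruby - File encoding with Ruby

I am having some problems with file encodings.

I receive a URL-encoded string such as "sometext%C3%B3+more+%26+andmore", unescape it, process the data, and save it with windows-1252 encoding.

The conversion is as follows:

The result should be sometextó more & andmore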
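
The asker's Ruby conversion code is not shown. As a hedged illustration of the same pipeline in Python, the sketch below unescapes the string from the question as UTF-8 and then re-encodes it as windows-1252 (codec name cp1252 in Python). The variable names are made up for the example.

```python
from urllib.parse import unquote_plus

# The URL-encoded input quoted in the question.
encoded = "sometext%C3%B3+more+%26+andmore"

# %C3%B3 is the UTF-8 byte sequence for "ó"; "+" stands for a space.
text = unquote_plus(encoded, encoding="utf-8")

# Re-encode the decoded text as windows-1252 before saving.
cp1252_bytes = text.encode("cp1252")

print(text)
print(cp1252_bytes)
```

The key step is decoding the percent-escaped bytes as UTF-8 first, then transcoding; writing the UTF-8 bytes out unchanged and labelling them windows-1252 would yield two garbled characters in place of ó.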

ruby file-encodings

2010-05-28T12:17:48.647

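0 votes

6 answers

10583 views

java - File.listFiles() mangles Unicode names with JDK 6 (Unicode normalization issue)

I am running into a strange file name encoding problem when listing directory contents in Java 6 on both OS X and Linux: File.listFiles() and related methods seem to return file names in a different encoding than the rest of the system.

Note that it is not merely the display of these file names that is causing me problems. I am mainly interested in comparing file names against a remote file storage system, so I care more about the content of the name strings than about the character encoding used to print output.

Here is a demonstration program. It creates a file with a Unicode name, then prints URL-encoded versions of the file name obtained from the directly created File and from the same file listed under its parent directory (you should run this code in an empty directory). The results show the different encoding returned by the File.listFiles() method.

Here is what I get when I run this test code on my systems. Note the %CC versus %C3 character representations.

OS X Snow Leopard:

KUbuntu Linux (running in a VM on the same OS X system):

I have tried various hacks to get the strings to agree, including setting the file.encoding system property and various LC_CTYPE and LANG environment variables. Nothing helps, and I would rather not resort to such hacks anyway.

Unlike this (somewhat related?) question, I am still able to read data from the listed files despite the odd names.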
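
The %CC versus %C3 difference is what one would expect if the two names differ in Unicode normalization form, as the title suggests. A minimal Python sketch follows, using a hypothetical one-character name because the asker's actual file name and output are not shown:

```python
import unicodedata
from urllib.parse import quote

# Hypothetical file name: a single precomposed "é" (U+00E9).
name = "\u00e9"

# NFC keeps the precomposed character; NFD splits it into "e" + U+0301.
nfc = unicodedata.normalize("NFC", name)
nfd = unicodedata.normalize("NFD", name)

# URL-encoding the UTF-8 bytes makes the difference visible.
nfc_url = quote(nfc)   # precomposed form, starts with %C3
nfd_url = quote(nfd)   # decomposed form, contains %CC

# The raw strings differ, but agree once both are brought to one form.
equal_raw = (nfc == nfd)
equal_normalized = (unicodedata.normalize("NFC", nfd) == nfc)

print(nfc_url, nfd_url, equal_raw, equal_normalized)
```

Normalizing both sides to the same form before comparing makes the strings agree. In Java, java.text.Normalizer offers the equivalent operation; whether that fits the asker's setup is an assumption.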

java unicode normalization unicode-normalization file-encodings

2010-08-31T14:32:36.740

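0 votes

1 answer

724 views

ruby - File encoding produces a blank character in Ruby: why?

I am using a little bit of Ruby:

And I have a sample file, which I feed to the script, that contains only three periods and a newline.

When I save this file with a file encoding of utf-8 (in vim: set fileencoding=utf-8) and run this script on it, I get the following output:

Then, if I change the file encoding to latin1 (in vim: set fileencoding=latin1) and run the script, I do not get that first blank character:

What is going on here? I understand that the utf8 encoding places some bytes at the start of the file to mark it as utf8-encoded, but I thought they were supposed to be invisible when processing text (i.e. the Ruby runtime should handle them). What am I missing?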
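
The leading character described here is consistent with a UTF-8 byte order mark (bytes EF BB BF) being decoded as ordinary text. A small Python sketch of that behaviour, with the file content built in memory as an assumption rather than read from the asker's file:

```python
import codecs

# Assumed file content: a UTF-8 BOM followed by three periods and a newline.
data = codecs.BOM_UTF8 + b"...\n"

# A plain UTF-8 decode keeps the BOM as an invisible first character, U+FEFF.
with_bom = data.decode("utf-8")

# The "utf-8-sig" codec drops a leading BOM if one is present.
without_bom = data.decode("utf-8-sig")

print(repr(with_bom))
print(repr(without_bom))
```

Many runtimes treat the BOM as data unless told otherwise, so the reader has to opt in to skipping it, as the second decode does here.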

By the way:

Thanks!

Update:

Hex dump of the file with the extra characters (BOM):

ruby vim utf-8 file-encodings

2010-10-01T06:04:24.457

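0 votes

10 answers

68055 views

c# - StreamWriter and the UTF-8 byte order mark

I am having a problem with StreamWriter and byte order marks. The documentation seems to state that the Encoding.UTF8 encoding has byte order marks enabled, but when files are written, some have the marks while others do not.

I am creating the stream writer in the following way:

Any ideas on what might be happening would be appreciated.

c# file-encodings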

2011-03-10T21:21:11.987

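0 votes

1 answer

21101 views

powershell - Powershell: Get default system encoding

The PowerShell cmdlet out-file has an -encoding parameter that you can set to default. This default uses the encoding of the system's current ANSI code page.
My question is: how do I get the name of the default encoding that out-file will use from PowerShell?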

powershell file-encodings

2011-03-16T13:52:24.057
