unicode 字符在文本文件中具有不同的表示形式(没有 \u)。
用于评估
int main()
{
// Write
{
std::string s = "\u00C1 M\u00F3ti S\u00F3l";
std::ofstream out("/tmp/test.txt");
out << s;
}
// Read Text
{
std::string s;
std::ifstream in("/tmp/test.txt");
std::getline(in, s);
std::cout << "Result: " << s << std::endl;
}
// Read Binary
{
std::ifstream in("/tmp/test.txt");
in.unsetf(std::ios_base::skipws);
std::istream_iterator<unsigned char> first(in);
std::istream_iterator<unsigned char> last;
std::vector<unsigned char> v(first, last);
std::cout << "Result: ";
for(unsigned c: v) std::cout << std::hex << c << ' ';
std::cout << std::endl;
}
return 0;
}
在带有 UTF8 的 Linux 上: 结果:Á Móti Sól 结果:c3 81 20 4d c3 b3 74 69 20 53 c3 b3 6c