I want a single 'to lower' function (for a word) that works correctly for two languages, for example English and Russian. What should I do? Should I use std::wstring for it, or can I get by with std::string? I also want it to be cross-platform, and I don't want to reinvent the wheel.

2 Answers

The canonical library for this sort of thing is ICU:

http://site.icu-project.org/

There is also a Boost wrapper (Boost.Locale):

http://www.boost.org/doc/libs/1_55_0/libs/locale/doc/html/index.html

See also this question: Is there an STL and UTF-8 friendly C++ Wrapper for ICU, or other strong Unicode library

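First, make sure you understand the concept of a locale and have a firm grasp of what Unicode, and encoding systems more generally, are all about.

Some good quick-start reading: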

http://joelonsoftware.com/articles/Unicode.html

http://en.wikipedia.org/wiki/Locale
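
To make the encoding point concrete, here is a minimal sketch of why lowercasing a UTF-8 std::string one byte at a time handles English but not Russian. The helper names byteWiseLower and utf8CodePoints are hypothetical, made up for this illustration, and the sketch assumes the input is valid UTF-8.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: lowercases each byte on its own, the way a naive
// std::string-based approach would.
std::string byteWiseLower(const std::string& s) {
    std::string out = s;
    for (char& c : out) {
        // Cast to unsigned char first: passing a negative char to
        // std::tolower is undefined behavior.
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return out;
}

// Hypothetical helper: counts code points by skipping UTF-8
// continuation bytes (those of the form 10xxxxxx).
std::size_t utf8CodePoints(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

Each Russian letter takes two bytes in UTF-8, so a six-letter Russian word occupies 12 bytes. In the default "C" locale none of those bytes is an uppercase letter by itself, so byteWiseLower returns Cyrillic input unchanged while still lowercasing ASCII. This is why the text has to be decoded into code points (or handed to ICU or Boost.Locale) before case conversion.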

Answered 2014-04-24T19:16:05.993

I think this solution is OK. I'm not sure it works in every case, but it quite likely does.

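#include <locale>
#include <codecvt>
#include <string>

// Lowercases a UTF-8 encoded word by converting it to wide characters first.
std::string toLowerCase (const std::string& word) {
    std::wstring_convert<std::codecvt_utf8<wchar_t> > conv;
    // Throws std::runtime_error if this locale is not installed on the system.
    std::locale loc("en_US.UTF-8");
    std::wstring wword = conv.from_bytes(word);
    for (std::wstring::size_type i = 0; i < wword.length(); ++i) {
        // Lowercase the decoded wide character, not the raw byte word[i].
        wword[i] = std::tolower(wword[i], loc);
    }
    return conv.to_bytes(wword);
}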
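
The function above depends on a locale named "en_US.UTF-8" being installed on the machine; the std::locale constructor throws std::runtime_error otherwise. If only English and Russian matter, a locale-free variant can be sketched as below. This is an illustration under stated assumptions, not a replacement for ICU or Boost.Locale: the name lowerEnRu is hypothetical, the input is assumed to be UTF-8, and only A-Z, the Cyrillic letters U+0410 to U+042F, and U+0401 (Ё) are converted. Everything else is copied through unchanged.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: lowercases Basic Latin and basic Russian Cyrillic
// letters in a UTF-8 string. All other bytes are copied unchanged.
std::string lowerEnRu(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        const unsigned char b1 =
            (i + 1 < in.size()) ? static_cast<unsigned char>(in[i + 1]) : 0;
        if (b0 >= 'A' && b0 <= 'Z') {
            // ASCII uppercase and lowercase differ by 32.
            out += static_cast<char>(b0 + 32);
        } else if ((b0 & 0xE0) == 0xC0 && (b1 & 0xC0) == 0x80) {
            // Two-byte sequence: decode to a code point.
            unsigned cp = ((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu);
            if (cp >= 0x0410 && cp <= 0x042F) {
                cp += 0x20;   // U+0410..U+042F map to U+0430..U+044F
            } else if (cp == 0x0401) {
                cp = 0x0451;  // Ё maps to ё
            }
            // Re-encode as two UTF-8 bytes and skip the continuation byte.
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
            ++i;
        } else {
            out += in[i];
        }
    }
    return out;
}
```

The trade-off is coverage: letters outside those two ranges (accented Latin, Ukrainian Ґ, and so on) pass through untouched, which is exactly the kind of gap a full Unicode library closes.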
Answered 2014-04-26T13:56:15.600