61

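I'm having some trouble getting unicode to work in git-bash (on Windows 7). I have tried many things without success. I'm not sure what is actually responsible for the problem, though, so I may be working in the wrong direction.

It seems this should be possible, since cmd.exe's encoding can be changed to unicode with 'chcp 65001'.

Here are some of the things I have tried (besides looking at the obvious configuration options in the GUI).

  1. Setting environment variables in '.bashrc'. I suppose it makes sense that this had no effect, since I believe these variables are a Linux thing, and the 'locale' command does not exist.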

    export LC_ALL=en_US.UTF-8
    export LANG=en_US.UTF-8
    export LANGUAGE=en_US.UTF-8
    
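  2. Starting from cmd.exe, changing the encoding to unicode with 'chcp 65001', and then launching git-bash. This gives me "Permission denied" when I try to cat my unicode test file. However, cat'ing a file without unicode works fine. As shown below, back in cmd.exe I can still print the file with 'type'. With my default encoding (437), I can cat the file in bash (no "Permission denied", but the output is garbled).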

    S:\>chcp 65001
    Active code page: 65001
    S:\>"C:\Program Files (x86)\Git\bin\sh.exe" --login -i
    zarac@TOWELIE /z
    cat /s/unicode.txt
    cat: write error: Permission denied
    zarac@TOWELIE /z
    cat /s/nounicode.txt
    abc
    zarac@TOWELIE /z
    L /s/unicode.txt
    -rw-r--r--    1 zarac    Administ        7 May 18 10:30 /s/unicode.txt
    zarac@TOWELIE /z
    whoami
    towelie\zarac
    zarac@TOWELIE /z
    exit
    Z:\>type S:\unicode.txt
    abc£
    
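  3. Using the /U flag when starting the shell (it makes sense that this doesn't work, since, if I understand correctly, it isn't quite meant for this, but it has to do with unicode, so I tried it).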

    C:\Windows\SysWOW64\cmd.exe /U /C "C:\Program Files (x86)\Git\bin\sh.exe" --login -i
    
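  4. Since I prefer using Console2, I tried adding a dword value named CodePage with the value 65001 (decimal) in the Windows registry under [HKEY_CURRENT_USER\Console] and [HKEY_CURRENT_USER\Console\Git Bash]. This seems to have the same effect as running 'chcp 65001', except that it is "automatic". (http://stackoverflow.com/questions/379240/is-there-a-windows-command-shell-that-will-display-unicode-characters)

  5. JPSoft's TCC/LE

  6. PowerCMD

  7. Stack Overflow

  8. DuckDuckGo

  9. ixquick / Google

So approach 2 looks like it could work if the permission problem can be solved. However, I'm open to almost any solution, although I'd prefer to keep using Console2 (mostly for its nice tab feature). Maybe one solution would be to set up an SSH server and then connect to it with Putty/Kitty, but that's just wrong! ; )

PS. Is there any official documentation for git-bash?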

4

8 Answers

55

I faced the same issue in MSYS Git 2.8.0, and as it turned out it just needed a configuration change.

    $ git --version
    git version 2.8.0.windows.1

The default configuration of the Git Bash console on my system did not show Greek filenames.

    $ cd ~
    $ ls
    AppData/
    'Application Data'@
    Contacts/
    Cookies@
    Desktop/
    Documents/
    Downloads/
    Favorites/
    Links/
    'Local Settings'@
    NTUSER.DAT
    .
    .
    .
    ''$'\316\244\316\261'' '$'\316\255\316\263\316\263\317\201\316\261\317\206\316\254'' '$'\316\274\316\277\317\205'@

The last line should display "Τα έγγραφά μου", the Greek translation of "My Documents". To fix it, I followed the steps below:

  1. Check your existing locale configuration:

    $ locale
    LANG=en
    LC_CTYPE="C"
    LC_NUMERIC="C"
    LC_TIME="C"
    LC_COLLATE="C"
    LC_MONETARY="C"
    LC_MESSAGES="C"
    LC_ALL=

    As shown above, in my case it was not UTF-8.

  2. Change the locale to a UTF-8 encoding. Click the icon on the left side of the MINGW title bar, select "Options", and in the "Text" category choose "UTF-8" as the character set. You should also choose a Unicode font, such as the default "Lucida Console". My configuration looks as follows: (screenshot: MinGW locale configuration)

  3. Change the language for the current window (there is no need to do this in future windows, as they will be created with the settings from step 2):

    $ LANG='C.UTF-8'

  4. The ls command should now display the name properly:

    AppData/
    'Application Data'@
    Contacts/
    Cookies@
    Desktop/
    Documents/
    Downloads/
    Favorites/
    Links/
    'Local Settings'@
    NTUSER.DAT
    .
    .
    .
    'Τα έγγραφά μου'@
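
Steps 1 and 3 can also be combined into a small check, for example in `~/.bashrc`. This is only a sketch: the helper names `effective_ctype` and `is_utf8_locale` are made up for illustration and are not part of Git Bash. It assumes the usual POSIX precedence (`LC_ALL`, then `LC_CTYPE`, then `LANG`) and sets `LANG` only when no UTF-8 locale is already in effect.

```shell
# Hypothetical helper: print the value that decides the character-type
# locale, following the POSIX precedence LC_ALL > LC_CTYPE > LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    else
        printf '%s\n' "${LANG:-C}"
    fi
}

# Hypothetical helper: succeed if that value names a UTF-8 character set.
is_utf8_locale() {
    case "$(effective_ctype)" in
        *.UTF-8*|*.utf8*|*.UTF8*|*.utf-8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Fall back to C.UTF-8 (the value from step 3) only when needed.
# Note: if LC_ALL or LC_CTYPE is set to a non-UTF-8 value, it still takes
# precedence over LANG and would have to be changed or unset separately.
if ! is_utf8_locale; then
    export LANG='C.UTF-8'
fi
```

With the `locale` output from step 1 (`LANG=en`, nothing else set), the check fails and `LANG` is switched to `C.UTF-8`, which matches what step 3 does by hand.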
    
answered 2016-04-18T11:22:49.150
17

Found this answer elsewhere ("Git bash chcp windows7 encoding issue"):

    chcp.com 65001

That's what actually solved it for me.
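
If you want this to run for every new Git Bash session, one option is a guarded call in `~/.bashrc`. The following is a minimal sketch, not something from the linked answer: the function name `set_utf8_codepage` is hypothetical, and the guard simply skips the call on systems where `chcp.com` is not on the `PATH`. It only changes the console code page; it does not touch the locale or font settings.

```shell
# Hypothetical ~/.bashrc fragment (an assumption, not from the answer).
set_utf8_codepage() {
    # chcp.com is a Windows program; do nothing where it is unavailable.
    if command -v chcp.com >/dev/null 2>&1; then
        # 65001 is the Windows code page identifier for UTF-8.
        chcp.com 65001 >/dev/null
    fi
}

set_utf8_codepage
```

Running `chcp.com` without arguments afterwards prints the active code page, which is a quick way to confirm the change took effect.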

answered 2018-05-22T21:08:27.373
10

As CharlesB said in a comment, msysgit 1.7.10 handles unicode correctly. There are still some issues, but I can confirm that updating did solve the problem I was having.

See: https://github.com/msysgit/msysgit/wiki/Git-for-Windows-Unicode-Support

answered 2012-05-18T15:58:30.057
6

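Check whether the problem persists with Git 2.1 (August 2014).
See commit 617ce96 and commit 1c950a5 by Karsten Blees (kblees).

Win32: support Unicode console output

WriteConsoleW seems to be the only way to reliably print unicode to the console (without weird code page conversions).

Also redirects vfprintf to the winansi.c version.

Win32: add Unicode conversion functions

Add Unicode conversion functions to convert between the Windows native UTF-16LE encoding and UTF-8.

To support repositories with legacy-encoded file names, the UTF-8 to UTF-16 conversion function tries to create valid, unique file names even for invalid UTF-8 byte sequences, so that these repositories can be checked out without error.

This is most likely a port of something already integrated into msysgit, but at least it means that the Windows version of Git no longer has to be forked or patched away from the main Git repository sources in order to include these improvements.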
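
As a rough illustration of the UTF-8 to UTF-16LE conversion that the second commit message describes, here is a small sketch using `iconv`. It is not Git's conversion code, and it assumes `iconv` is available in your shell. It converts the string "abc£" from the question's test file and compares byte counts.

```shell
# "abc£" as UTF-8 bytes: a, b, c, then 0xC2 0xA3 for the pound sign.
utf8_len=$(printf 'abc\302\243' | wc -c | tr -d '[:space:]')

# The same four characters in UTF-16LE take two bytes each.
utf16_len=$(printf 'abc\302\243' | iconv -f UTF-8 -t UTF-16LE | wc -c | tr -d '[:space:]')

echo "UTF-8 bytes: $utf8_len, UTF-16LE bytes: $utf16_len"

# Unlike the tolerant conversion described in the commit message, a plain
# iconv call rejects byte sequences that are not valid UTF-8 (0xFF here).
if ! printf '\377' | iconv -f UTF-8 -t UTF-16LE >/dev/null 2>&1; then
    echo "iconv rejected the invalid UTF-8 byte"
fi
```

The strict failure in the last step is the kind of case the commit message says Git's own conversion handles differently, so that repositories with legacy-encoded file names can still be checked out.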

answered 2014-08-02T19:27:58.157
5
answered 2014-08-24T21:53:00.953
2

For me the solution was just to enable unicode support.
Docs: https://github.com/msysgit/msysgit/wiki/Git-for-Windows-Unicode-Support

    git config --global core.quotepath off
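
For context: with `core.quotepath` at its default, Git prints path bytes above 0x80 as backslash-escaped octal numbers inside double quotes, and turning the setting off makes it print those bytes verbatim. The sketch below does not call Git at all. It only shows that such octal escapes are the UTF-8 bytes of the name, using the escapes for "Τα" (the first word of the Greek folder name in an earlier answer) as sample input.

```shell
# POSIX printf understands \ddd octal escapes in its format string, so the
# four bytes \316\244 \316\261 come out as the two Greek letters "Τα".
decoded=$(printf '\316\244\316\261')
printf '%s\n' "$decoded"

# Two characters, but four bytes in UTF-8.
printf '%s' "$decoded" | wc -c
```

Whether the decoded letters actually render correctly still depends on the terminal's character set and font.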

answered 2021-11-02T11:39:23.227
1

I found the following steps helpful:

  1. Run Git Bash.
  2. Right-click and select Options...
  3. Select the Text group on the left.
  4. Change Font to Consolas.
  5. Select C as Locale and UTF-8 as Character set.
  6. Apply and Save. (screenshot: Git Bash Options)
  7. In the terminal, execute:

    git config --global core.quotepath false

  8. In rare cases, execute this in the terminal as well:

    export LANG='C.UTF-8'

answered 2021-10-30T05:03:42.270
0

The problem with chcp 65001 is that there are bugs in the C runtime (MSVCRT) that make stdio calls return inconsistent results when run under code page 65001.

That should be better with Git 2.23 (Q3 2019)

See commit 090d1e8 (03 Jul 2019) by Karsten Blees (kblees).
(Merged by Junio C Hamano -- gitster -- in commit 0328db0, 11 Jul 2019)

gettext: always use UTF-8 on native Windows

On native Windows, Git exclusively uses UTF-8 for console output (both with MinTTY and native Win32 Console).

Gettext uses setlocale() to determine the output encoding for translated text; however, MSVCRT's setlocale() does not support UTF-8. As a result, translated text is encoded in the system encoding (as per GetACP()), and non-ASCII chars are mangled in console output.

Side note: There is actually a code page for UTF-8: 65001.
In practice, it does not work as expected at least on Windows 7, though, so we cannot use it in Git. Besides, if we overrode the code page, any process spawned from Git would inherit that code page (as opposed to the code page configured for the current user), which would quite possibly break e.g. diff or merge helpers. So we really cannot override the code page.

In init_gettext_charset(), Git calls gettext's bind_textdomain_codeset() with the character set obtained via locale_charset(); Let's override that latter function to force the encoding to UTF-8 on native Windows.

In Git for Windows' SDK, there is a libcharset.h and therefore we define HAVE_LIBCHARSET_H in the MINGW-specific section in config.mak.uname, therefore we need to add the override before that conditionally-compiled code block.

Rather than simply defining locale_charset() to return the string "UTF-8", though, we are careful not to break LC_ALL=C: the ab/no-kwset patch series, for example, needs to have a way to prevent Git from expecting UTF-8-encoded input.

And:

See commit 697bdd2 (04 Jul 2019), and commit 9423885, commit 39a98e9 (27 Jun 2019) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 0a2ff7c, 11 Jul 2019)

mingw: use Unicode functions explicitly

Many Win32 API functions actually exist in two variants: one with the A suffix that takes ANSI parameters (char * or const char *) and one with the W suffix that takes Unicode parameters (wchar_t * or const wchar_t *).

The ANSI variant assumes that the strings are encoded according to whatever is the current locale.
This is not what Git wants to use on Windows: we assume that char * variables point to strings encoded in UTF-8.

There is a pseudo UTF-8 locale on Windows, but it does not work as one might expect. In addition, if we overrode the user's locale, that would modify the behavior of programs spawned by Git (such as editors, difftools, etc), therefore we cannot use that pseudo locale.

Further, it is actually highly encouraged to use the Unicode versions instead of the ANSI versions, so let's do precisely that.

Note: when calling the Win32 API functions without any suffix, it depends whether the UNICODE constant is defined before the relevant headers are #include'd.
Without that constant, the ANSI variants are used.
Let's be explicit and avoid that ambiguity.

answered 2019-07-18T20:26:13.050