emacs - How do I set the encoding of shell-command-on-region output?

Question

I have a small elisp script that applies Perl::Tidy to the region or to the whole file. For reference, here is the script (borrowed from EmacsWiki):

(defun perltidy-command (start end)
  "The perltidy command we pass markers to."
  (shell-command-on-region start
                           end
                           "perltidy"
                           t
                           t
                           (get-buffer-create "*Perltidy Output*")))

(defun perltidy-dwim (arg)
  "Perltidy a region or the entire buffer."
  (interactive "P")
  (let ((point (point)) (start) (end))
    ;; Use the active region if there is one, otherwise the whole buffer.
    (if (and mark-active transient-mark-mode)
        (setq start (region-beginning)
              end (region-end))
      (setq start (point-min)
            end (point-max)))
    (perltidy-command start end)
    (goto-char point)))

(global-set-key "\C-ct" 'perltidy-dwim)

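I'm using the current Emacs 23.1 for Windows (EmacsW32). The problem is that when I apply the script to a UTF-8 encoded file ("U(Unix)" in the status bar), the output comes back decoded as Latin-1, i.e. two or more characters appear for each non-ASCII source character.

Is there any way to fix this?

Edit: the problem seems to be solved by putting (set-terminal-coding-system 'utf-8-unix) in my init.el. If anyone has other solutions, please go ahead and post them!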

score 3 · Accepted Answer

The following is from the shell-command-on-region documentation:

To specify a coding system for converting non-ASCII characters
in the input and output to the shell command, use C-x RET c
before this command.  By default, the input (from the current buffer)
is encoded using coding-system specified by `process-coding-system-alist',
falling back to `default-process-coding-system' if no match for COMMAND
is found in `process-coding-system-alist'.

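When the command runs, the coding system is first looked up in process-coding-system-alist; if nothing matches there (or the list is nil), default-process-coding-system is used.

If you want to change the encoding, you can add a conversion entry to process-coding-system-alist. Its contents look like this: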

Value: (("\\.dz\\'" no-conversion . no-conversion)
 ...
("\\.elc\\'" . utf-8-emacs)
("\\.utf\\(-8\\)?\\'" . utf-8)
("\\.xml\\'" . xml-find-file-coding-system)
 ...
("" undecided))
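As a minimal sketch of adding such an entry, the snippet below registers a pattern and then asks Emacs which coding systems it would pick. The "perltidy" pattern and the utf-8 choice are assumptions for illustration. Whether a given Emacs version matches the pattern against the command string or against the shell executable when running shell-command-on-region is not confirmed here; the lookup at the end only shows what a direct call-process of a program named perltidy would select.

```elisp
;; Sketch only: the "perltidy" pattern and the utf-8 choice are assumptions.
;; Each entry is (PATTERN . VAL), where VAL is either a single coding system
;; or a cons (DECODING . ENCODING).
(add-to-list 'process-coding-system-alist '("perltidy" . (utf-8 . utf-8)))

;; Ask Emacs which coding systems a direct `call-process' of a program
;; named "perltidy" would use after the entry above was added.
(find-operation-coding-system 'call-process "perltidy")
```

Evaluating the last form in *scratch* is a quick way to check that the entry is actually being matched before relying on it.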

Alternatively, if you have not set process-coding-system-alist (so it is nil), you can assign your encoding choice to default-process-coding-system,

for example:

(setq default-process-coding-system '(utf-8 . utf-8))

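(here both directions use utf-8: the car of the pair decodes the output read from the process, and the cdr encodes the input sent to it)

or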

(setq default-process-coding-system '(undecided-unix . iso-latin-1-unix))
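If changing the default globally is too broad, the same variable can be bound only around one call. The following is a sketch under that assumption; my-call-with-utf8-process-coding is a hypothetical helper name, not part of Emacs or of the original script.

```elisp
(defun my-call-with-utf8-process-coding (thunk)
  "Call THUNK with `default-process-coding-system' bound to UTF-8 both ways.
Hypothetical helper.  The car of the pair decodes process output and the
cdr encodes text sent to the process."
  (let ((default-process-coding-system '(utf-8 . utf-8)))
    (funcall thunk)))

;; Usage sketch (assumes `perltidy-command' from the question is loaded):
;; (my-call-with-utf8-process-coding
;;  (lambda () (perltidy-command (point-min) (point-max))))
```

Because the binding is dynamic, it ends when the helper returns, so other subprocesses keep whatever default was configured before.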

I have also written a post about this, if you want the details.

score 2 · Accepted Answer

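Quoting the documentation of shell-command-on-region (C-h f shell-command-on-region RET):

To specify a coding system for converting non-ASCII characters in the input and output to the shell command, use C-x RET c before this command. By default, the input (from the current buffer) is encoded in the same coding system that will be used to save the file, `buffer-file-coding-system'. If the output is going to replace the region, then it is decoded from that same coding system.

The noninteractive arguments are START, END, COMMAND, OUTPUT-BUFFER, REPLACE, ERROR-BUFFER, and DISPLAY-ERROR-BUFFER. Noninteractive callers can specify coding systems by binding `coding-system-for-read' and `coding-system-for-write'.

In other words, you'd do something like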

(let ((coding-system-for-read 'utf-8-unix))
  (shell-command-on-region ...) )

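That's untested, and I'm not sure what the value of coding-system-for-read (or maybe -write instead? or as well?) should be in your case. I guess you could also make use of the OUTPUT-BUFFER argument and direct the output to a buffer whose coding system is set to what you need.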
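To make the binding idea concrete, here is a sketch that binds both variables around the call, so the region is encoded as UTF-8 on the way out and the result is decoded as UTF-8 on the way back. The function names my-shell-region-utf8 and my-perltidy-command-utf8 are hypothetical, and utf-8-unix is an assumed choice that fits a UTF-8 buffer with Unix line endings; it has not been tried on the asker's EmacsW32 setup.

```elisp
(defun my-shell-region-utf8 (start end command)
  "Replace the region from START to END with the output of COMMAND.
Hypothetical helper: forces UTF-8 in both directions for this one call."
  (let ((coding-system-for-read 'utf-8-unix)    ; decode the command's output
        (coding-system-for-write 'utf-8-unix))  ; encode the region sent to it
    (shell-command-on-region start end command
                             t   ; OUTPUT-BUFFER: insert into current buffer
                             t   ; REPLACE: replace the region with the output
                             (get-buffer-create "*Perltidy Output*"))))

(defun my-perltidy-command-utf8 (start end)
  "Variant of `perltidy-command' that pins the coding systems to UTF-8."
  (my-shell-region-utf8 start end "perltidy"))
```

Binding both variables sidesteps the question of which direction was going wrong, at the cost of hard-coding UTF-8 for every buffer the command is used on.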

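Another option might be to adjust the locale in the perltidy invocation, but again, without more information about what you are using now, and with no way to experiment on a system similar to yours, I can only hint.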
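One way the locale idea could be tried from Emacs is to bind process-environment around the call, which changes the environment seen by child processes only. This is a sketch: my-call-with-locale is a hypothetical helper, the locale name in the usage comment is an assumption, and whether perltidy (or Perl on Windows) pays attention to LC_ALL is not confirmed here.

```elisp
(defun my-call-with-locale (locale thunk)
  "Call THUNK with LC_ALL set to LOCALE for child processes only.
Hypothetical helper; valid LOCALE names are system-specific."
  (let ((process-environment
         ;; Earlier entries win, so prepending overrides any existing LC_ALL.
         (cons (concat "LC_ALL=" locale) process-environment)))
    (funcall thunk)))

;; Usage sketch (assumes `perltidy-command' from the question is loaded and
;; that "en_US.UTF-8" exists on the system):
;; (my-call-with-locale "en_US.UTF-8"
;;  (lambda () (perltidy-command (point-min) (point-max))))
```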
