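tikzDevice does not write code containing umlauts as UTF-8 on Windows

I am writing a report with RMarkdown and use tikzDevice for the plots. When I use German umlauts (äöüÖÄÜ), RStudio throws the following error:

pandoc.exe: Cannot decode byte '\xd6': Data.Text.Internal.Encoding.streamDecodeUtf8With: Invalid UTF-8 stream

Here is a minimal example: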

---
title: "test"
author: "test"
date: "Today"
output: 
  pdf_document: 
    keep_tex: true
header-includes:
   - \usepackage{tikz}
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(tikzDevice)
options(tikzDefaultEngine = "xetex")
```
```{r plot, dev="tikz", external=FALSE}
x <- rnorm(50)
y <- rnorm(50)

plot(x, y, xlab = "ÖÄÜ", ylab = "öäü")
```

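With this code, tikzDevice writes the TeX file (the plot) in CP1252 encoding, and that file does not work when it is included in the main LaTeX document. As a result, Pandoc throws an error. I tried the same code on Ubuntu, and it works there. I suspect the Windows encoding is the cause, but I could not find a solution.

The source file (Rmd) is UTF-8 encoded. The TeX file generated by tikzDevice is not UTF-8 encoded.

Session info (Windows):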
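
The mismatch can be reproduced without tikzDevice. The following is a minimal sketch, assuming only base R (`validUTF8` needs R 3.3.0 or later): it builds the CP1252 bytes for "ÖÄÜ" by hand, shows that they are not a valid UTF-8 stream (the first byte is the `\xd6` from the Pandoc message), and converts them with `iconv`. The variable names are illustrative only.

```r
# "ÖÄÜ" as single bytes, the way a CP1252 writer stores them
cp1252_bytes <- as.raw(c(0xD6, 0xC4, 0xDC))
as_string <- rawToChar(cp1252_bytes)

# These bytes do not form a valid UTF-8 sequence
is_valid_before <- validUTF8(as_string)

# Re-encode from CP1252 to UTF-8: each umlaut becomes two bytes
utf8 <- iconv(as_string, from = "CP1252", to = "UTF-8")
is_valid_after <- validUTF8(utf8)
utf8_bytes <- charToRaw(utf8)

print(is_valid_before)
print(is_valid_after)
print(utf8_bytes)
```

Running `validUTF8(readLines(f, warn = FALSE))` on a generated `.tex` file should show in the same way whether that file is the one Pandoc cannot decode.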

version  R version 3.6.1 (2019-07-05)
os       Windows 10 x64
system   x86_64, mingw32
ui       RStudio
language (EN)
collate  German_Germany.1252
ctype    German_Germany.1252
tz       Europe/Berlin
date     2019-09-04 

Session info (Ubuntu):

version  R version 3.4.4 (2018-03-15)
os       Ubuntu 18.04.3 LTS
system   x86_64, linux-gnu
ui       X11
language (EN)
collate  C.UTF-8
ctype    C.UTF-8
tz       Europe/Berlin
date     2019-09-04

3 Answers

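Another workaround is to convert all tikz/tex files in the figure folder. With iconv, the file contents are converted from CP1252 to UTF-8. When this is placed as the last chunk in the document, "hard-coding" the umlauts is not necessary: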

# directory of the Rmd file
path <- getwd()
# subfolder that holds the cache and figures
subfolder <- paste0(tools::file_path_sans_ext(knitr::current_input()), "_files")
# figure folder for LaTeX or beamer output, whichever exists
candidates <- file.path(path, subfolder, c("figure-latex", "figure-beamer"))
folder <- candidates[dir.exists(candidates)]
# find all tex/tikz files in the figure folder
for (tex_file in list.files(folder, pattern = "\\.tex$", full.names = TRUE)) {
  # read the lines as written by tikzDevice (CP1252 bytes)
  input <- readLines(tex_file, warn = FALSE)
  # convert the contents from CP1252 to UTF-8
  output <- iconv(input, from = "CP1252", to = "UTF-8")
  # overwrite the original file with the converted lines;
  # useBytes = TRUE writes the UTF-8 bytes without re-encoding them
  writeLines(output, tex_file, useBytes = TRUE)
}

Edit: This now also works with beamer.
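
One caveat with a conversion chunk is that running it twice on the same file (for example with cached figures) would re-encode text that is already UTF-8. The following is a sketch of one way to guard against that, not part of the original workaround: `convert_to_utf8` is a hypothetical helper that skips files whose lines are already valid UTF-8. The check is a heuristic, and the demo uses a temporary file rather than real tikzDevice output.

```r
# Hypothetical helper: convert one file from CP1252 to UTF-8,
# unless every line in it is already valid UTF-8
convert_to_utf8 <- function(tex_file, from = "CP1252") {
  lines <- readLines(tex_file, warn = FALSE)
  if (all(validUTF8(lines))) {
    return(invisible(FALSE))  # nothing to do
  }
  converted <- iconv(lines, from = from, to = "UTF-8")
  # useBytes = TRUE writes the UTF-8 bytes as they are
  writeLines(converted, tex_file, useBytes = TRUE)
  invisible(TRUE)
}

# Demo: a temporary file holding the CP1252 bytes for "ÖÄÜ" plus a newline
tmp <- tempfile(fileext = ".tex")
writeBin(as.raw(c(0xD6, 0xC4, 0xDC, 0x0A)), tmp)

first_run <- convert_to_utf8(tmp)   # converts the file
second_run <- convert_to_utf8(tmp)  # already UTF-8, so the file is left alone
result_bytes <- charToRaw(readLines(tmp, warn = FALSE))
print(result_bytes)
```

In the chunk above, the body of the loop could call such a helper for each file instead of converting unconditionally.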

answered 2019-09-04T14:44:05.667

I can reproduce this behavior. Please open an issue at https://github.com/daqana/tikzDevice/issues. As a workaround, you can use

---
title: "test"
author: "test"
date: "Today"
output: 
  pdf_document: 
    keep_tex: true
header-includes:
   - \usepackage{tikz}
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(tikzDevice)
options(tikzDefaultEngine = "xetex")
```

```{r plot, dev="tikz", external=FALSE}
x <- rnorm(50)
y <- rnorm(50)

plot(x, y, xlab = '\\"O\\"A\\"U', ylab = '\\"o\\"a\\"u')
```
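
Writing the accent macros by hand gets tedious with many labels. As a sketch only, the substitution could be wrapped in a small function; `escape_umlauts` is a hypothetical helper, not part of tikzDevice, and it covers just the six umlauts from the question.

```r
# Hypothetical helper: replace German umlauts with LaTeX accent macros
escape_umlauts <- function(x) {
  umlauts <- c("\u00e4", "\u00f6", "\u00fc", "\u00c4", "\u00d6", "\u00dc")
  macros  <- c('\\"a', '\\"o', '\\"u', '\\"A', '\\"O', '\\"U')
  for (i in seq_along(umlauts)) {
    # fixed = TRUE treats pattern and replacement as literal text
    x <- gsub(umlauts[i], macros[i], x, fixed = TRUE)
  }
  x
}

# "\u00d6\u00c4\u00dc" is "ÖÄÜ", written with escapes to keep this snippet ASCII
label <- escape_umlauts("\u00d6\u00c4\u00dc")
cat(label, "\n")
```

A label could then be passed as `xlab = escape_umlauts("ÖÄÜ")`, so that only ASCII reaches the generated TeX file.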
answered 2019-09-04T14:06:02.327

In R or Python, use a raw string (r'') when reading a CSV or text file, for example r'c:\hem\dow\train.csv'. We have to declare the path as r'' to read the file.

answered 2019-09-04T10:03:55.750