r - Calling comm from system() in R using process substitution

Question

For efficiency reasons, I would like to call comm from within R via system(). I am used to using syntax like the following:

comm -13 <(hadoop fs -cat /path/to/file | gunzip | awk -vFPAT='([^,]*)|("[^"]+")' -vOFS=, '{if($7 ~ /^".*"$/ && $9 ~ /^".*"$/) {print toupper($7),toupper($9)} else if($7 ~ /^[^"]/ && $9 ~ /^["]/) {print "\""toupper($7)"\"",toupper($9)} else if($7 ~ /^[^"]/ && $9 ~ /^[^"]/) {print "\""toupper($7)"\"","\""toupper($9)"\""}}' | sort) <(awk -vFPAT='([^,]*)|("[^"]+")' -vOFS=, '{if($1 ~ /^".*"$/ && $2 ~ /^".*"$/) {print toupper($1),toupper($2)} else if($1 ~ /^[^"]/ && $2 ~ /^["]/) {print "\""toupper($1)"\"",toupper($2)} else if($1 ~ /^[^"]/ && $2 ~ /^[^"]/) {print "\""toupper($1)"\"","\""toupper($2)"\""}}' /path/to/file | sort)

However, when I use this syntax from system(), as in

system("comm -13 <(filea) <(fileb)")

I get the familiar error:

sh: -c: line 0: syntax error near unexpected token `('

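From this it is clear that system() runs the command with sh rather than bash, and sh does not support process substitution. After reading other posts, I tried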

system("bash -c 'comm -13 <(hadoop fs -cat /path/to/file | gunzip | awk -vFPAT='([^,]*)|(\"[^\"]+\")' -vOFS=, '{if($7 ~ /^\".*\"$/ && $9 ~ /^\".*\"$/) {print toupper($7),toupper($9)} else if($7 ~ /^[^\"]/ && $9 ~ /^[\"]/) {print \"\\\"\"toupper($7)\"\\\"\",toupper($9)} else if($7 ~ /^[^\"]/ && $9 ~ /^[^\"]/) {print \"\\\"\"toupper($7)\"\\\"\",\"\\\"\"toupper($9)\"\\\"\"}}' | sort) <(awk -vFPAT='([^,]*)|(\"[^\"]+\")' -vOFS=, '{if($1 ~ /^\".*\"$/ && $2 ~ /^\".*\"$/) {print toupper($1),toupper($2)} else if($1 ~ /^[^\"]/ && $2 ~ /^[\"]/) {print \"\\\"\"toupper($1)\"\\\"\",toupper($2)} else if($1 ~ /^[^\"]/ && $2 ~ /^[^\"]/) {print \"\\\"\"toupper($1)\"\\\"\",\"\\\"\"toupper($2)\"\\\"\"}}' /path/to/file | sort)")

That is, I escaped the double quotes and backslashes as needed. However, this returns the same error:

sh: -c: line 0: syntax error near unexpected token `('

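I suspect this has to do with the single quotes inside bash -c, which in turn sits inside the double-quoted string passed to system(). I am somewhat confused about how to handle single quotes inside bash -c within a double-quoted string in system(). How should I navigate all this escaping?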

score 0 · Accepted Answer

To solve this, I just had to escape everything inside [within] in:

bash -c "[within]"

using bash's escaping rules for double-quoted strings ( https://www.gnu.org/software/bash/manual/html_node/Double-Quotes.html ), and then everything inside [within2] in:

system("[within2]")

using R's escaping rules.

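The end result is backslashes and double quotes that are escaped twice (once for bash, once for R), and $ that is escaped once (for bash only).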
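As a rough illustration of the two escaping layers, here is a minimal sketch. It assumes bash, comm, awk and printf are available on the PATH, and it uses toy printf pipelines as hypothetical stand-ins for the real hadoop/awk commands. The second variant uses base R's shQuote() for the shell-level quoting, which is an alternative to the manual escaping described above, not the method from this answer.

```r
# Command as bash itself should finally see it (toy data, not from the question):
#   comm -13 <(printf 'a x\nb y\n' | awk '{print $1}') <(printf 'b\nc\n')

# Variant 1: manual escaping in two layers.
#   Shell layer: wrap the command in bash -c "..." and write $ as \$ so that
#                the outer sh does not expand $1 inside the double quotes.
#   R layer:     escape every " and every \ once more for the R string literal.
cmd_manual <- "bash -c \"comm -13 <(printf 'a x\\nb y\\n' | awk '{print \\$1}') <(printf 'b\\nc\\n')\""
out_manual <- system(cmd_manual, intern = TRUE)

# Variant 2 (alternative): only apply the R layer by hand and let shQuote()
# single-quote the whole command for the outer sh, so $ needs no escaping.
inner <- "comm -13 <(printf 'a x\\nb y\\n' | awk '{print $1}') <(printf 'b\\nc\\n')"
cmd_quoted <- paste("bash -c", shQuote(inner))
out_quoted <- system(cmd_quoted, intern = TRUE)

print(out_manual)
print(out_quoted)
```

Calling cat(cmd_manual) or cat(cmd_quoted) before running shows the exact string that sh receives, which makes it easier to check one layer at a time. Literal double quotes inside the awk program, as in the original command, would additionally need \" at the shell layer and therefore \\\" in the R string.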
