python - Problem running a program (R) from within Python to perform an operation (execute a script)

Question

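I would like to execute an R script from Python, ideally displaying and saving the results. Using rpy2 has been a bit of a struggle, so I thought I'd just call R directly. I have a feeling that I'll need something like "os.system" or "subprocess.call", but I'm having difficulty deciphering the module guides.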

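Here's the R script "MantelScript", which uses a particular statistical test to compare two distance matrices at a time (distmatA1 and distmatB1). This works in R, though I haven't yet put in the iterating bits to read through and compare a bunch of files in a pairwise fashion (I could really use some help with that too, by the way!):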

library(ade4)

M1<-read.table("C:\\pythonscripts\\distmatA1.csv", header = FALSE, sep = ",")
M2<-read.table("C:\\pythonscripts\\distmatB1.csv", header = FALSE, sep = ",")

mantel.rtest(dist(matrix(M1, 14, 14)), dist(matrix(M2, 14, 14)), nrepet = 999)

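Here's the relevant bit of my Python script, which reads through some previously built lists and pulls out matrices to compare via this Mantel test (it should pull the first matrix from identityA and sequentially compare it to every matrix in identityB, then repeat with the second matrix from identityB, etc.). I want to save these files and then call the R program to compare them: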

# windownA and windownB are lists containing ascending sequences of integers
# identityA and identityB are lists where each field is a distance matrix.

z = 0
v = 0

import csv
import subprocess
import os

for i in windownA:                              

    M1 = identityA[i]                          

    z += 1
    filename = "C:/pythonscripts/distmatA"+str(z)+".csv"
    file = csv.writer(open(filename, 'w'))
    file.writerow(M1)


    for j in windownB:                          

        M2 = identityB[j]                     

        v += 1
        filename2 = "C:/pythonscripts/distmatB"+str(v)+".csv"
        file = csv.writer(open(filename2, 'w'))
        file.writerow(M2)

        ## result = os.system('R CMD BATCH C:/R/library/MantelScript.R') - maybe something like this??

        ## result = subprocess.call(['C:/R/library/MantelScript.txt'])  - or maybe this??

        print result
        print ' '

score 5 · Accepted Answer

If your R script only has side effects, that's fine, but if you want to process the results further in Python, you'll still be better off using rpy2.

import rpy2.robjects

# Read the whole R script into one string.
with open("C:/R/library/MantelScript.R") as f:
    code = f.read()

# Evaluate the script in the embedded R session.
result = rpy2.robjects.r(code)
# assume that MantelScript creates a variable "X" in the R GlobalEnv workspace
X = rpy2.robjects.globalenv['X']

score 2 · Accepted Answer

Stick with this.

process = subprocess.Popen(['R', 'CMD', 'BATCH', 'C:/R/library/MantelScript.R'])
process.wait()

Once wait() returns, the .R script has finished running.

Note that you should write your .R script so that it produces a file your Python program can read.

with open( 'the_output_from_mantelscript', 'r' ) as result:
    for line in result:
        print( line )
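
The two steps above (spawn the script, then read the file it wrote) can be folded into one helper. This is only a minimal sketch: the name `run_and_read` is made up for illustration, and it assumes Python 3.5+ for `subprocess.run` and that the child process writes its results to a path Python knows in advance.

```python
import subprocess
from pathlib import Path


def run_and_read(command, output_path):
    """Run `command`, wait for it to finish, then return the text found at `output_path`.

    `run_and_read` is a hypothetical helper name, not part of any library.
    """
    # check=True raises CalledProcessError if the child exits with a non-zero status,
    # so a failed R run is not silently followed by reading a stale or missing file.
    subprocess.run(command, check=True)
    return Path(output_path).read_text()
```

For example, `run_and_read(["Rscript", "C:/R/library/MantelScript.R"], "mantel_result.txt")` would work only if the R script itself creates `mantel_result.txt` (an invented file name here), for instance with `sink()` or `write.table()`.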

Don't waste a lot of time trying to hook up pipes.

Spend the time getting a basic "Python spawns R" process working.

You can add to it later.

score 2 · Accepted Answer

In case you're interested in invoking an R subprocess from Python in general:

#!/usr/bin/env python3

from io import StringIO
from subprocess import PIPE, Popen

def rnorm(n):
    rscript = Popen(["Rscript", "-"], stdin=PIPE, stdout=PIPE, stderr=PIPE)
    with StringIO() as s:
        s.write("x <- rnorm({})\n".format(n))
        s.write("cat(x, \"\\n\")\n")
        return rscript.communicate(s.getvalue().encode())

if __name__ == '__main__':
    output, errmsg = rnorm(5)
    print("stdout:")
    print(output.decode('utf-8').strip())
    print("stderr:")
    print(errmsg.decode('utf-8').strip())

This is best done through Rscript.
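
On Python 3.7 or later, the same idea can be written more compactly with `subprocess.run`, which accepts the script text through `input=` and collects both output streams. This is a sketch rather than a drop-in replacement: `run_via_stdin` and its `interpreter` parameter are hypothetical names, and the default assumes `Rscript` is on the PATH.

```python
import subprocess


def run_via_stdin(source, interpreter=("Rscript", "-")):
    """Feed `source` to an interpreter on stdin and return (stdout, stderr) as text.

    `run_via_stdin` and `interpreter` are illustrative names; the default
    assumes an `Rscript` executable can be found on the PATH.
    """
    completed = subprocess.run(
        list(interpreter),
        input=source,          # sent to the child's stdin
        capture_output=True,   # collect stdout and stderr instead of printing them
        text=True,             # work with str rather than bytes
    )
    return completed.stdout, completed.stderr
```

Calling `run_via_stdin('x <- rnorm(5)\ncat(x, "\\n")\n')` should then hand back the five random numbers on stdout, with any R warnings or errors on stderr.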

score 0 · Accepted Answer

Given what you're trying to do, a pure R solution may be neater:

file.pairs <- combn(dir(pattern="\\.csv$"), 2) # get every pair of csv files in the current dir (pattern is a regular expression, not a glob)

The pairs are the columns of a 2xN matrix:

file.pairs[,1]
[1] "distmatrix1.csv" "distmatrix2.csv"

You can run a function on these columns using apply (with the option "2", meaning "operate over columns"):

my.func <- function(v) paste(v[1], v[2], sep="::")
apply(file.pairs, 2, my.func)

In this example my.func just glues the two file names together; you could replace it with a function that does the Mantel test, something like (untested):

my.func <- function(v){
  M1<-read.table(v[1], header = FALSE, sep = ",")
  M2<-read.table(v[2], header = FALSE, sep = ",")
  mantel.rtest(dist(matrix(M1, 14, 14)), dist(matrix(M2, 14, 14)), nrepet = 999)
}
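
If the pairing stays on the Python side instead, the standard library covers both cases. The sketch below uses hypothetical helper names: `itertools.combinations` mirrors what `combn(..., 2)` does above, while `itertools.product` gives the "every matrix in A against every matrix in B" pairing described in the question.

```python
from itertools import combinations, product


def all_pairs(names):
    """Every unordered pair of distinct names, like R's combn(names, 2)."""
    return list(combinations(names, 2))


def cross_pairs(names_a, names_b):
    """Every (a, b) pair with a taken from names_a and b from names_b."""
    return list(product(names_a, names_b))
```

Each resulting tuple can then be passed to whatever code launches R for that pair of files.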
