
I would like to run a lasso model in R from Stata, and then bring the resulting character list (the names of the subset coefficients) back into Stata as a macro (e.g. a global).

Currently I know of two options:

  1. I save a dta file and run an R script from Stata using shell:

    shell $Rloc --vanilla <"${LOC}/Lasso.R"
    

    This works with the saved dta file and lets me run the lasso model I want, but it is not interactive, so I cannot bring the relevant character list (with the names of the subset variables) back into Stata.

  2. I use rcall. However, rcall does not allow me to load a large enough matrix, even at maximum Stata memory. My prediction matrix Z (to be subset by the lasso) is 1000 by 100, but when I run the command:

    rcall: X <- st.matrix(Z) 
    

    I get the error message:

    macro substitution results in line too long: The line resulting from substituting macros would be longer than allowed. The maximum allowed length is 645,216 characters, which is calculated on the basis of set maxvar.

Is there a way to run R interactively from Stata that allows for large matrices, so that I can bring the character list from R back into Stata as a macro?

Thanks in advance.


1 Answer


Below I will try to consolidate the comments into one, hopefully useful, answer.

Unfortunately, rcall does not appear to handle large matrices of the size you need. I think it is better to call R to run a script using the shell command and save the strings as variables in a dta file. This requires more work, but it is certainly programmable.
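The script-plus-dta route could be sketched as follows. This is only an illustration, not the poster's actual script: it assumes the glmnet and haven packages, and the file names (data.dta, selected_vars.txt) and data layout are hypothetical:

```r
# Sketch: run a lasso in R, extract the names of the selected
# predictors, and write them to a file Stata can read back.
library(glmnet)   # assumed: lasso via cv.glmnet
library(haven)    # assumed: to read the dta file exported from Stata

dat <- read_dta("data.dta")      # hypothetical file name
y   <- dat[[1]]                  # hypothetical layout: outcome first
Z   <- as.matrix(dat[, -1])      # remaining columns are predictors

fit   <- cv.glmnet(Z, y, alpha = 1)        # alpha = 1 -> lasso
coefs <- coef(fit, s = "lambda.min")       # sparse coefficient vector
keep  <- rownames(coefs)[as.vector(coefs != 0)]
keep  <- setdiff(keep, "(Intercept)")

# one name per line; Stata can read this with import delimited
writeLines(keep, "selected_vars.txt")
```

Back in Stata, the file can then be read in as a string variable and processed as described below.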

You can then read these variables into Stata and easily manipulate them with built-in functions. For example, you can save the strings in separate variables, or in a single variable and then use levelsof as @Dimitriy suggested.

Consider the following toy example:

clear

input str50 string
"this is a string"
"A longer string is this"
"A string that is even longer is this one"
"How many strings do you have?"
end

levelsof string, local(newstr) 
`"A longer string is this"' `"A string that is even longer is this one"' `"How many strings do you have?"' `"this is a string"'

tokenize `"`newstr'"'

forvalues i = 1 / `: word count `newstr'' {
    display "``i''"
}

A longer string is this
A string that is even longer is this one
How many strings do you have?
this is a string
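Since the question asks for the names as a global macro, the local built by levelsof can simply be promoted to one. A minimal sketch continuing the toy example (the global name selected is arbitrary):

```
levelsof string, local(newstr)
global selected `"`newstr'"'
display `"${selected}"'
```

The compound quotes preserve strings that contain embedded spaces or quotes when they are carried over into the global.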

In my experience, programs like rcall and rsource are useful for simple tasks. However, for more complicated work they can become a real hassle, in which case I personally just resort to the real thing, i.e. using the other software directly.

As @Dimitriy also pointed out, there are now some community-contributed commands available for lasso, which may well fit your needs so you do not have to fiddle with R:

search lasso

5 packages found (Stata Journal and STB listed first)
-----------------------------------------------------

elasticregress from http://fmwww.bc.edu/RePEc/bocode/e
    'ELASTICREGRESS': module to perform elastic net regression, lasso
    regression, ridge regression / elasticregress calculates an elastic
    net-regularized / regression: an estimator of a linear model in which
    larger / parameters are discouraged.  This estimator nests the LASSO / and

lars from http://fmwww.bc.edu/RePEc/bocode/l
    'LARS': module to perform least angle regression / Least Angle Regression
    is a model-building algorithm that / considers parsimony as well as
    prediction accuracy.  This / method is covered in detail by the paper
    Efron, Hastie, Johnstone / and Tibshirani (2004), published in The Annals

lassopack from http://fmwww.bc.edu/RePEc/bocode/l
    'LASSOPACK': module for lasso, square-root lasso, elastic net, ridge,
    adaptive lasso estimation and cross-validation / lassopack is a suite of
    programs for penalized regression / methods suitable for the
    high-dimensional setting where the / number of predictors p may be large

pdslasso from http://fmwww.bc.edu/RePEc/bocode/p
    'PDSLASSO': module for post-selection and post-regularization OLS or IV
    estimation and inference / pdslasso and ivlasso are routines for
    estimating structural / parameters in linear models with many controls
    and/or / instruments. The routines use methods for estimating sparse /

sivreg from http://fmwww.bc.edu/RePEc/bocode/s
    'SIVREG': module to perform adaptive Lasso with some invalid instruments /
    sivreg estimates a linear instrumental variables regression / where some
    of the instruments fail the exclusion restriction / and are thus invalid.
    The LARS algorithm (Efron et al., 2004) is / applied as long as the Hansen
answered 2018-05-31T14:01:57.597