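As an R user, I am now learning Stata using this resource, and I am confused by the merge command.

In R, I don't have to worry about merging data incorrectly, because it merges everything anyway. If the common column contains duplicates, I need not worry, because data frame Y will be merged to every duplicate row in data frame X (using all=FALSE in merge).

But with Stata, I need to remove the duplicate rows from X before proceeding with the merge.

Does Stata assume that, for merge to proceed, the common column in the master table must be unique?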

2 Answers

The answer to your question is No. I will try to explain why.

The link you mention covers only one type of merge that is possible with Stata, namely the one-to-many merge.

merge 1:m varlist using filename

Other types of merge are possible:

One-to-one merge on specified key variables

merge 1:1 varlist using filename

Many-to-one merge on specified key variables

merge m:1 varlist using filename

Many-to-many merge on specified key variables

merge m:m varlist using filename

One-to-one merge by observation

merge 1:1 _n using filename

Details, explanations and examples can be found in help merge.
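To make the "No" concrete, here is a minimal sketch of a many-to-one merge in which the key is repeated in the master data. The variable names (id, x, y) and the toy values are made up for illustration, and the snippet is meant to be run as a do-file so that the tempfile macro stays in scope.

```stata
* Hypothetical lookup (using) dataset: one row per id
clear
input id str1 y
1 "a"
2 "b"
end
tempfile lookup
save `lookup'

* Hypothetical master dataset: id 1 appears twice
clear
input id x
1 10
1 11
2 20
end

* m:1 requires id to be unique only in the using dataset
merge m:1 id using `lookup'
list
```

Both master rows with id 1 pick up y = "a", so all three observations end up with _merge equal to 3 (matched) and no duplicates have to be dropped from the master first.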

If you do not know if observations are unique in a dataset, you can do the following check:

bysort idvar: gen N = _N

ta N

If you find values of N that are greater than 1, you know that observations are not unique with respect to idvar.
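As an alternative to generating N by hand, Stata's built-in isid and duplicates report commands should give the same information. The sketch below uses the auto dataset that ships with Stata purely as an example; substitute your own key variable.

```stata
* Example data shipped with Stata
sysuse auto, clear

* isid is silent when the variable uniquely identifies observations
isid make

* foreign takes only two values, so isid errors out here;
* capture suppresses the error and stores the return code in _rc
capture isid foreign
display _rc

* Tabulate how many observations share each value of the key
duplicates report foreign
```

A nonzero _rc after capture isid means the variable is not a unique key, which tells you to use 1:m or m:1 rather than 1:1 for that dataset.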

The merge 1:1, 1:m, m:1 and m:m forms shown above are in fact the new syntax of the merge command, introduced with Stata 11. Before Stata 11, the merge command was a bit simpler. You simply had to sort your data, and then you could do:

merge varlist using filename

By the way, you can still use this old syntax in Stata 11 or higher.

answered 2011-09-07T09:07:26.050

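joinby, unmatched(both) is the command that corresponds to R's merge.

In particular, merge m:m does not perform a many-to-many merge (i.e. a full join), contrary to what the documentation implies.

answered 2015-06-05T00:13:06.643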
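A small sketch of the difference, using made-up variables (id, x, y) and toy values, and intended to be run as a do-file. The key is duplicated on both sides, which is the case where joinby and merge m:m diverge.

```stata
* Hypothetical using dataset: id 1 appears twice
clear
input id str1 y
1 "a"
1 "b"
end
tempfile ydata
save `ydata'

* Hypothetical master dataset: id 1 also appears twice
clear
input id x
1 10
1 11
end

* joinby forms all pairwise combinations within each id
joinby id using `ydata', unmatched(both)
list
count
```

This yields 4 observations (2 x 2), which is what R's merge would produce. Running merge m:m id using the same file instead pairs the rows in order within each id and returns only 2. Dropping the unmatched(both) option keeps matched rows only, which is closer to all=FALSE in R.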