I am doing a network analysis and I have a dataset that looks like this:

**ID-code | ego  |  alter1  |alter2 |alter3 |Office**
100       | JHON |  ROCKY   |JOE    |MOLLY  |   1
101       |ROCKY |  JOE     |MOLLY  |JHON   |   1
102       | JOE  |  MOLLY   |JHON   |  .    |   1
103       | MOLLY|  ROCKY   | .     |  .    |   1 

As you can see, each ego was asked to name up to three alters from the same office.

I would like to match the ID codes with the corresponding names, to obtain new variables/columns like these:

   **ID-code ego|   ID_alter1   |ID_alter2  |ID_alter3**
    100JHON     |    101ROCKY   |102JOE     |103MOLLY
    101ROCKY    |    102JOE     |103MOLLY   |100JHON
    102JOE      |    103MOLLY   |100JHON    |    .
    103MOLLY    |    101ROCKY   |  .        |    .

I already know how to obtain the variable ID-code ego:

*egen ID-code ego= concat (ID-code ego)*

But I do not know how to match the alters with the ID codes stored in other observations.

Any suggestion is welcome.

Thanks, Amedeo

2 Answers

Kevin Crow wrote a vlookup clone that makes this very easy:

clear
input int id_code str5 ego str5 alter1 str5 alter2 str5 alter3
100 "JOHN" "ROCKY" "JOE" "MOLLY"
101 "ROCKY" "JOE" "MOLLY" "JOHN"
102 "JOE" "MOLLY" "JOHN" ""
103 "MOLLY" "ROCKY" "" ""
end
capture net install vlookup, from(http://www.stata.com/users/kcrow)
gen id_code_ego = string(id_code) + ego
forvalues i=1/3 {
    vlookup alter`i', gen(code) key(ego) value(id_code)
    gen id_alter`i' = string(code) + alter`i'
    drop alter`i' code
}
drop id_code ego

Addendum:

clear
input int id_code str5 ego str5 alter1 str5 alter2 str5 alter3 int officer
100 "JOHN" "ROCKY" "JOE" "MOLLY" 1
101 "ROCKY" "JOE" "MOLLY" "JOHN" 1
102 "JOE" "MOLLY" "JOHN" "" 1
103 "MOLLY" "ROCKY" "" "" 1
103 "JOHN" "ROCKY" "JOE" "MOLLY" 2
102 "ROCKY" "JOE" "MOLLY" "JOHN" 2
101 "JOE" "MOLLY" "JOHN" "" 2
100 "MOLLY" "ROCKY" "" "" 2
end
capture net install vlookup, from(http://www.stata.com/users/kcrow)

gen id_code_ego_officer = string(id_code) + ego + string(officer)
gen ego_officer = ego + string(officer)

forvalues i=1/3 {
    replace alter`i'= alter`i' + string(officer) 
    vlookup alter`i', gen(code) key(ego_officer) value(id_code)
    gen id_alter`i' = string(code) + alter`i'
    replace id_alter`i' = regexr(id_alter`i',"[0-9]?$","")
    drop alter`i' code  
}

drop id_code_ego_officer ego_officer
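
The addendum appears to append the office number to each name because, once a second office is present, a name by itself no longer identifies one person (JOHN is 100 in office 1 and 103 in office 2). A minimal sketch of that check, using a two-row toy dataset that is an assumption for illustration and not part of the original answer:

```stata
clear
input int id_code str5 ego int officer
100 "JOHN" 1
103 "JOHN" 2
end

* ego alone is not a unique key: isid exits with an error, which capture traps
capture isid ego
display "return code for ego alone: " _rc

* ego combined with officer is unique, so isid completes without an error
isid ego officer
```

A nonzero return code from the first `isid` means a lookup keyed on the name alone would be ambiguous, which is why the key is built from name plus office.
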
answered 2016-05-13T07:24:22.827

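To match values that come from other observations, the typical approach in Stata is to use merge. As a first step, you create a master list of the distinct ego names within each office. You then go back to the original data and merge each alter against that list by office and name. Performing the merge requires renaming some variables: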

clear
input int id_code str5 ego str5 alter1 str5 alter2 str5 alter3 int office
100 "JOHN" "ROCKY" "JOE" "MOLLY" 1
101 "ROCKY" "JOE" "MOLLY" "JOHN" 1
102 "JOE" "MOLLY" "JOHN" "" 1
103 "MOLLY" "ROCKY" "" "" 1
103 "JOHN" "ROCKY" "JOE" "MOLLY" 2
102 "ROCKY" "JOE" "MOLLY" "JOHN" 2
101 "JOE" "MOLLY" "JOHN" "" 2
100 "MOLLY" "ROCKY" "" "" 2
end

* make a master list of unique id/name per office
preserve
keep office id_code ego
isid office id_code ego, sort
rename (id_code ego) (id0 ego0)
save "match_egos.dta", replace
restore

* combine the id/ego for each observation
gen ID_ego = string(id_code) + ego

* loop over each alter and merge with the master list
forvalues i = 1/3 {
    clonevar ego0 = alter`i'
    merge m:1 office ego0 using "match_egos.dta", keep(master match) nogen
    gen ID_alter`i' = string(id0) + alter`i'
    drop ego0 id0
}

isid office id_code ego, sort
* leftalign is from SSC; to install, type in Command window: ssc install leftalign
leftalign
list ID_*, sepby(office)
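
If one row per ego/alter pair is acceptable, the same match can probably be done with a single merge instead of a loop. The following is only a sketch under that assumption; the names `slot`, `alter_id`, and `lookup` are made up for illustration, and it uses a shortened copy of the example data:

```stata
clear
input int id_code str5 ego str5 alter1 str5 alter2 str5 alter3 int office
100 "JOHN" "ROCKY" "JOE" "MOLLY" 1
101 "ROCKY" "JOE" "MOLLY" "JOHN" 1
102 "JOE" "MOLLY" "JOHN" "" 1
103 "MOLLY" "ROCKY" "" "" 1
end

* lookup table: one row per office/name, holding that person's id
preserve
keep office id_code ego
rename (id_code ego) (alter_id alter)
tempfile lookup
save `lookup'
restore

* long layout: one row per ego/alter slot, empty slots dropped
reshape long alter, i(office id_code) j(slot)
drop if alter == ""

* a single merge attaches the alter's id to every pair
merge m:1 office alter using `lookup', keep(master match) nogenerate

gen ID_ego = string(id_code) + ego
gen ID_alter = string(alter_id) + alter
sort office id_code slot
list office ID_ego slot ID_alter, sepby(id_code)
```

This yields an edge list rather than the wide layout asked for in the question, so it is an alternative to weigh rather than a drop-in replacement.
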
answered 2016-05-13T20:00:09.640