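I've rearranged my approach to make it clearer. You mention that you can't recode your variables, but I'm not sure there's a way around that (I think any solution here will recode, either explicitly or implicitly). You will, of course, need to replace "4" with "20" throughout.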
* generate some projects and members
clear
set obs 5
generate int project = _n
generate person_1 = "Tom"
generate person_2 = "Dick" if (_n >= 3)
generate person_3 = "Harry" if (_n >= 5)
replace person_1 = "Jane" if inlist(_n, 2, 4)
tempfile orig
save `orig'
* reshape to long
reshape long person_, i(project) string
drop _j
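drop if missing(person_)
sort project person_
* number the distinct names 1, 2, ... in alphabetical order
egen id = group(person_)
drop if missing(id)
* one person_# column per distinct name, matching the person_`i' loops below
reshape wide person_, i(project) j(id)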
* recode to allow easier group identification
forvalues i = 1/4 {
levelsof person_`i', local(name) clean
generate byte d_person_`i' = cond(missing(person_`i'), 0, 1)
label define d_person_`i'_lbl 1 "`name'" 0 ""
label values d_person_`i' d_person_`i'_lbl
}
* determine number of workers on project
egen gp_size = rowtotal(d_person_*)
* unique id for each group composition
* long rather than int: with 20 people the id can reach 2^20 - 1, beyond the int range
generate long id = 0
forvalues i = 1/4 {
local two_i = 2^(`i' - 1)
replace id = id + d_person_`i' * `two_i'
}
* group members
generate str mbrs = ""
forvalues i = 1/4 {
local name: label d_person_`i'_lbl 1
replace mbrs = mbrs + "/" + "`name'" if (d_person_`i' == 1)
}
* there's always a leading "/" to remove with this approach
replace mbrs = substr(mbrs, 2, .)
* merge back your orig data
merge 1:1 project using `orig', nogenerate replace update
This produces:
. list

     +---------------------------------------------------------------------------------------------------------------------------------+
     | project   person_1   person_2   person_3   person_4   d_pers~1   d_pers~2   d_pers~3   d_pers~4   gp_size   id             mbrs |
     |---------------------------------------------------------------------------------------------------------------------------------|
  1. |       1        Tom                              Tom                                         Tom         1    8              Tom |
  2. |       2       Jane                  Jane                                        Jane                    1    4             Jane |
  3. |       3        Tom       Dick                   Tom       Dick                              Tom         2    9         Dick/Tom |
  4. |       4       Jane       Dick       Jane                  Dick                  Jane                    2    5        Dick/Jane |
  5. |       5        Tom       Dick      Harry        Tom       Dick      Harry                   Tom         3   11   Dick/Harry/Tom |
     +---------------------------------------------------------------------------------------------------------------------------------+
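
Rather than editing every "4" by hand, one option is to let Stata count the distinct names and loop up to that count. What follows is only a minimal sketch on the same toy data, not a drop-in replacement for the code above; the names `slot`, `pid`, and `n_people` are my own placeholders.

```stata
* toy data, same as above
clear
set obs 5
generate int project = _n
generate person_1 = "Tom"
generate person_2 = "Dick" if (_n >= 3)
generate person_3 = "Harry" if (_n >= 5)
replace person_1 = "Jane" if inlist(_n, 2, 4)

* one row per project-person pair; slot (placeholder name) holds the old suffix
reshape long person_, i(project) j(slot)
drop if missing(person_)

* pid (placeholder name) numbers the distinct names 1, 2, ... in sort order
egen pid = group(person_)

* the largest code equals the number of distinct people
summarize pid, meanonly
local n_people = r(max)
display "distinct people: `n_people'"

* back to wide, one person_# column per distinct person
drop slot
reshape wide person_, i(project) j(pid)

* loops can then run over 1/`n_people' instead of a hard-coded 1/4
forvalues i = 1/`n_people' {
    generate byte d_person_`i' = !missing(person_`i')
}
```

On the toy data this should report four distinct people (Dick, Harry, Jane, Tom). Note that a local macro only exists within the do-file or interactive session that defines it, so the count and the loops that use it need to live in the same place.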