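I used to work with SAS and have decided to switch to R for academic reasons. My data (healthdemo) is health data that contains diagnosis codes (ICD-10), and I want to split these codes into separate columns. This is part of str(healthdemo):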
$ PATIENT_KEY : int 7391510 7404298 7390196 7381208 7401691 7381223 7383005 10188634 7384574 7398317 ...
$ ICDCODE : Factor w/ 1125 levels "","H00","H00.0",..: 654 56 654 654 665 48 90 679 654 654 ...
$ PATIENT_ID : int 39387 50244 38388 27346 49922 27901 27867 61527 33186 45309 ...
$ DATE_OF_BIRTH : Factor w/ 14801 levels "","01/01/1000",..: 7506 10250 52 73 94 6130 85 2710 95 100 ...
ICDCODE covers many diseases, from H00 to J99. First, I separated the letter in ICDCODE from the digits:
healthdemo$icd_char = substr(healthdemo$ICDCODE,1,1)
healthdemo$icd_num = substr(healthdemo$ICDCODE,2,3)
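For reference, here is a minimal sketch of this split on a few made-up codes, assuming every code is one letter followed by two digits and an optional decimal part (for example "I48.1"). The vector `codes` is hypothetical sample data, not taken from healthdemo. Because substr() returns character strings, the sketch converts the digits with as.integer() so that they can later be compared as numbers.

```r
# Hypothetical sample codes stored as a factor, like ICDCODE in healthdemo
codes <- factor(c("I48.1", "H00", "J99", "I05", ""))

# Convert explicitly so the intent is clear (substr() would also coerce a factor)
codes_chr <- as.character(codes)

# The first character is the ICD letter
icd_char <- substr(codes_chr, 1, 1)

# Characters 2 to 3 are the two-digit number; as.integer() makes them numeric
# (an empty code yields NA)
icd_num <- as.integer(substr(codes_chr, 2, 3))
```

With numeric values, a test such as `icd_num < 52` compares numbers rather than strings.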
Then I created the disease columns with these assignments:
healthdemo$cvd = 0
healthdemo$ihd = 0
healthdemo$mi = 0
healthdemo$dys = 0
healthdemo$afib = 0
healthdemo$chf = 0
Now I want to apply something similar to this SAS code, which I used before:
if icd_char = 'I' and 01 <= icd_num < 52 then cvd = 1;
if icd_char = 'I' and 20 <= icd_num <= 25 then ihd = 1;
if icd_char = 'I' and 21 <= icd_num <= 22 then mi = 1;
if icd_char = 'I' and 46 <= icd_num <= 49 then dys = 1;
if icd_char = 'I' and icd_num = 48 then afib = 1;
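This code sets the corresponding flag (for example cvd = 1) for every patient whose ICD letter and ICD number match the condition, and likewise for the other diseases.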
I tried the following in R, but it did not work for me:
healthdemo$cvd[healthdemo$icd_char == 'I' & 01 <= healthdemo$icd_num
& healthdemo$icd_num < 52 ] <- 1
And this:
if (healthdemo$icd_char == "I" & 01 < = healthdemo$icd_num < 52 )
{healthdemo$cvd <- 1}
Can anyone help me?
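One possible way to express the SAS rules in R is with vectorised logical tests, sketched here on a small made-up data frame. The data frame `demo` and its values are hypothetical, and the sketch assumes icd_num has already been converted to an integer. R has no chained comparison such as `1 <= x < 52`, so each range is written as two conditions joined with `&`. `if` is avoided because it tests a single value rather than a whole column.

```r
# Hypothetical sample data; icd_num is assumed to be integer, not character
demo <- data.frame(
  icd_char = c("I", "I", "I", "H", "J", "I"),
  icd_num  = c(48L, 21L, 5L, 25L, 10L, 60L),
  stringsAsFactors = FALSE
)

# TRUE for rows whose code starts with the letter I
is_i <- demo$icd_char == "I"

# Each comparison yields a logical vector; as.integer() turns TRUE/FALSE into 1/0
demo$cvd  <- as.integer(is_i & demo$icd_num >= 1  & demo$icd_num < 52)
demo$ihd  <- as.integer(is_i & demo$icd_num >= 20 & demo$icd_num <= 25)
demo$mi   <- as.integer(is_i & demo$icd_num >= 21 & demo$icd_num <= 22)
demo$dys  <- as.integer(is_i & demo$icd_num >= 46 & demo$icd_num <= 49)
demo$afib <- as.integer(is_i & demo$icd_num == 48)
```

In this sketch the first row (I48) ends up with cvd, dys and afib set to 1, while the H and J rows stay at 0 in every column. The same pattern should carry over to healthdemo by replacing `demo` with `healthdemo`, provided icd_num is numeric there.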