tabs - Tables of statistics with zeros

Question

I have a question about Stata and the command:

svy: tab x

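when there are no observations for a given value of x.

My problem is that when a category has no observations, Stata simply drops the corresponding row.

My task is to run several tables and to save and export the key results to a csv file. Sometimes the stored vectors have n elements, and sometimes, because of the zeros, they have only n-1, so I don't know how to combine them into one larger matrix (or at least export them to a file with regular spacing between rows and a value of 0 where there are no observations). I have also tried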

estpost svy, subpop(x0): tab x, count se format(%10.4g)

but I still have the same problem.

score 1 · Accepted Answer

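Update 3: This solution is based on estpost svy: tab, which returns more usable result vectors than svy: tab itself. Like the previous versions, it puts all of these results into a Stata data set. It adds a check on whether the data contain missing categories before resorting to a loop, and it slightly tightens the loop limits. Following Nick's suggestion, all standard-error-related statistics are set to missing in the rows for empty categories. Note that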

 estpost svy: tab rep78

by default puts the estimated cell proportions into e(b) and their standard errors into e(se), whereas

 estpost svy: tab rep78, count

将估计的计数及其 SE 放入这些矩阵中。但是，其他摘要仍然可用，在e(cell)或中e(count)。

 sysuse auto, clear
 drop if rep78==2 |rep78==5

 svyset _n [pw = turn]
 estpost svy: tab rep78,  se

 /* Number categories from 1 to max */
 local maxcat = 5
 mata:
 /* count rows, add one for totals row
   assign the category for that row as .a */
 r = (st_matrix("e(Row)"), .a)'
 b = st_matrix("e(b)")'
 serr = st_matrix("e(se)")'
 lb = st_matrix("e(lb)")'
 ub = st_matrix("e(ub)")'
 def = st_matrix("e(deff)")'
 dft = st_matrix("e(deft)")'
 ct = st_matrix("e(count)")'
 pr = st_matrix("e(cell)")'
 obs = st_matrix("e(obs)")'
 d1 =(r , b, serr, lb, ub, def, dft, obs, pr, ct)

 /*  Where there are no totals, use a standard missing value */
 d1[rows(d1),3::7] = J(1,5, .)

 /* Check if there are no missing rows.
 If so, output the original returned matrices */
 if (`e(r)' ==`maxcat') d = d1
 /* Else create a zero matrix and populate it
 with statistics for the non-missing categories*/
 else {   
     d2= J(`maxcat',10,0)
     d2[.,1] =(1::`maxcat')
     for (j = 1; j<=`e(r)'; j++) {
        for (k = 1; k<=r[j,1]; k++) {
           if (r[j,1]== k) {
                   d2[k,2] = b[j,1]
                   d2[k,3] = serr[j,1]
                   d2[k,4] = lb[j,1]
                   d2[k,5] = ub[j,1]
                   d2[k,6] = def[j,1]
                   d2[k,7] = dft[j,1]
                   d2[k,8] = obs[j,1]
                   d2[k,9] = pr[j,1]
                   d2[k,10] = ct[j,1]
            }
        }
    }
 /* If rows are missing set SE-related stats to missing*/
   for (k = 1; k<=`maxcat'; k++) {
      if (d2[k,2] == 0)  d2[k,3..7] =J(1,5,.)
    }
 /* Now add the totals row */
 d = d2 \ d1[rows(d1),.]
 }
 end
 clear
 getmata (rep78 b se lb ub deff deft nobs prop count ) = d
 format  b se lb ub deff deft prop  %5.2f
 format nobs count %10.0gc
 label define rtot  .a "Totals"
 label values rep78 rtot
 list
 save results, replace

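Original answer: Here is a way to create a matrix, new, that contains the zero categories. The logic: set up a zero matrix to hold results for all categories, then replace the zeros with the values from the non-missing categories. The macro maxcat contains the maximum number of categories of the table variable. The code assumes that the categories of the table variable are integers running from 1 to maxcat. The mata statement extracts the vector of standard errors, and the scalar e(r) holds the number of rows in the actual table.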

 sysuse auto, clear
 svyset _n
 drop if rep78== 2 | rep78==5
 svy: tab rep78, count se

 local  maxcat = 5  //max no. of categories
 matrix  oldr = e(Row)'   // category values
 matrix ct = e(Obs)  // table counts

 // serr is a vector of std. errors
 mata: st_matrix("serr", sqrt(diagonal(st_matrix("e(V)"))))

 // matrix new  will hold the expanded results
 matrix new = J(`maxcat', 3, 0)

 forvalues j = 1/`=e(r)' {
     forvalues k = 1/`maxcat' {
         matrix new[`k',1] = `k'
         if oldr[`j',1] == `k' {
             matrix new[`k',2] = ct[`j',1]
             matrix new[`k',3] = serr[`j',1]
         }
     }
 }
 matrix list new

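Update 2: Here is a version that does most of the work in Mata and then saves the estimates to a Stata data set. I have changed the names of the matrices slightly.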

 sysuse auto, clear
 svyset _n
 drop if rep78== 2 | rep78==5
 svy: tab rep78, count se
 local maxcat =5

 mata:
 r = st_matrix("e(Row)")'
 ct = st_matrix("e(Obs)")
 serr= sqrt(diagonal(st_matrix("e(V)")))
 d = J(`maxcat',3,0)
 for (j = 1; j<=`e(r)'; j++) {
     for (k = 1; k<=`maxcat'; k++) {
         d[k,1] = k
         if (r[j,1]== k) {
            d[k,2] =   ct[j,1]
            d[k,3] = serr[j,1]
         }
    }
  }
 end
 clear
 getmata (rep78 count se) = d
 replace se = . if count==0
 format se %8.2f
 list
 save results, replace
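
The double loop in the Mata version is not strictly necessary. Because the categories are assumed to be the integers 1 to maxcat, the observed category values can serve directly as row subscripts. The following is a sketch of that variation, not the method used in the answer; it assumes, as the code above does, that e(Row) is a row vector and that e(Obs) and the diagonal of e(V) have one row per observed category:

```stata
sysuse auto, clear
svyset _n
drop if rep78 == 2 | rep78 == 5
svy: tab rep78, count se
local maxcat = 5

mata:
r    = st_matrix("e(Row)")'                   // observed category values
ct   = st_matrix("e(Obs)")                    // table counts
serr = sqrt(diagonal(st_matrix("e(V)")))      // standard errors

// Column 1: category numbers; columns 2-3 start at zero
d = ((1::`maxcat'), J(`maxcat', 2, 0))

// Use the category values as row subscripts: rows for empty
// categories are never touched and keep their zeros
d[r, 2] = ct
d[r, 3] = serr
end

clear
getmata (rep78 count se) = d
replace se = . if count == 0
list
```

The result should match the looped version; the subscripted assignment only replaces the search for the matching row.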
