r - Using t.test in R to test whether one population mean is greater: how do I tell the function which group?

Question

I have a question about using t.test to check whether one population mean is greater than another.

Imagine I have 2 variables in a data frame d:

Weight: Numerical variable (weight of people).
Anykids: Categorical variable that can be yes or no.

The data frame looks like this:

Anykids Weight
yes     70
yes     84
no      66
...     ..

I want to check whether the mean weight of people with Anykids = yes is greater than the mean weight of people with Anykids = no. So I would have:

H0: m(weight_yes) = m(weight_no)
H1: m(weight_yes) > m(weight_no)

The function call would be:

t.test(Weight ~ Anykids, data = d, alternative = 'greater')

How does the function know that 'greater' refers to the group with Anykids = yes and not the group with Anykids = no?

If I wanted to test the hypothesis:

H0: m(weight_no) = m(weight_yes)
H1: m(weight_no) > m(weight_yes)

the function call would have the same arguments. How do I know whether 'greater' refers to Anykids = yes or Anykids = no?

score 0 · Accepted Answer

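As with many things involving factors, R chooses based on the order of the factor levels. The first level is treated as x and the second as y in t.test(), and alternative = "greater" tests whether the mean of x is greater than the mean of y. In your case, you can check levels(d$Anykids) beforehand to find out which group will be used as x and which as y, and change the order with relevel() if needed.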
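As a minimal sketch of that check, assuming a small made-up data frame standing in for the asker's d (the values below are hypothetical, not from the question): factor levels are sorted alphabetically by default, so "no" comes before "yes", and the original call would actually test mean(no) > mean(yes). Using relevel() to put "yes" first gives the intended direction.

```r
# Hypothetical toy data standing in for the asker's data frame `d`
d <- data.frame(
  Anykids = factor(c("yes", "yes", "no", "no", "yes", "no")),
  Weight  = c(70, 84, 66, 72, 79, 68)
)

# Default level order is alphabetical, so "no" is the first level
levels(d$Anykids)

# With "no" first, alternative = "greater" would test mean(no) > mean(yes).
# Make "yes" the first (reference) level so the test is mean(yes) > mean(no).
d$Anykids <- relevel(d$Anykids, ref = "yes")
levels(d$Anykids)

res <- t.test(Weight ~ Anykids, data = d, alternative = "greater")

# Group means are reported in level order, which confirms the direction tested
res$estimate
```

The names of res$estimate ("mean in group yes" first, then "mean in group no") show which group was treated as x.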

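The t.test() output also shows which group was taken first. Here, in the iris data set, versicolor is the first level once setosa is dropped, so the test asks whether the mean Sepal.Width of versicolor is greater than that of virginica.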

levels(iris$Species)
#> [1] "setosa"     "versicolor" "virginica"
test_data <- iris[iris$Species != 'setosa', ]
t.test(data = test_data, Sepal.Width ~ Species, alternative = "greater")
#> 
#>  Welch Two Sample t-test
#> 
#> data:  Sepal.Width by Species
#> t = -3.2058, df = 97.927, p-value = 0.9991
#> alternative hypothesis: true difference in means is greater than 0
#> 95 percent confidence interval:
#>  -0.3096707        Inf
#> sample estimates:
#> mean in group versicolor  mean in group virginica 
#>                    2.770                    2.974
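To test the opposite direction (virginica greater than versicolor), there are two equivalent routes, sketched here on the same iris subset: keep the level order and use alternative = "less", or reorder the levels and keep alternative = "greater". The object names res_less and res_flip are only illustrative.

```r
test_data <- iris[iris$Species != "setosa", ]

# Option 1: keep the level order (versicolor, virginica) and flip the alternative.
# "less" tests mean(versicolor) < mean(virginica).
res_less <- t.test(Sepal.Width ~ Species, data = test_data,
                   alternative = "less")

# Option 2: put virginica first, so "greater" tests
# mean(virginica) > mean(versicolor).
test_data$Species <- factor(test_data$Species,
                            levels = c("virginica", "versicolor"))
res_flip <- t.test(Sepal.Width ~ Species, data = test_data,
                   alternative = "greater")

# Both formulations describe the same hypothesis and give the same p-value
all.equal(res_less$p.value, res_flip$p.value)

# The estimates are listed in level order, virginica first
res_flip$estimate
```

Either way, the "sample estimates" part of the output lists the groups in the order they were compared, so it is worth reading before interpreting the p-value.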
