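Where am I going wrong in this for loop? It is meant to take a particular corpus, a sample size, and a number of samples as input, and then give the average and standard deviation of the expected number of sentiment tokens.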
# Assumption: array, average and std are the NumPy functions.
from numpy import array, average, std

def test_iterate(corpus_reader, sample_size, number_of_samples):
    for i in xrange(number_of_samples):
        tokens = corpus_reader.sample_words_by_sents(sample_size)
        sents = corpus_reader.sample_sents(sample_size)
        print expected_sentiment_tokens(tokens)
        s = []
        s.append(expected_sentiment_tokens(tokens))
    s = array(s)
    print "Average expected no of sentiment tokens: %s" % average(s)
    print "Standard deviation of sentiment tokens: %s" % std(s)

test_iterate(rcr, 500, 3)
This returns:
181.166666667
186.277777778
185.5
Average expected no of sentiment tokens: 185.5
Standard deviation of sentiment tokens: 0.0
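For some reason, the average is set to the last sample instead of being computed over all the samples, and the standard deviation comes out as 0.0 for the same reason.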
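One likely cause, offered as a sketch and not a confirmed diagnosis: the list `s` is created inside the loop body, so it is emptied on every iteration and holds only the last score when the loop ends. A single value has itself as its mean and a standard deviation of 0.0, which matches the output above. The Python 3 sketch below creates the list once before the loop. The names `summarize_samples`, `fake_draw_sample`, and `fake_score` are hypothetical stand-ins for the corpus reader and `expected_sentiment_tokens`, which are not shown in the post; they simply replay the three printed values. It uses `statistics.pstdev`, the population standard deviation, which is the same quantity `numpy.std` computes by default.

```python
from statistics import mean, pstdev


def summarize_samples(draw_sample, score, sample_size, number_of_samples):
    """Score several samples and return (scores, mean, population std)."""
    scores = []  # created once, before the loop, so every score is kept
    for _ in range(number_of_samples):
        tokens = draw_sample(sample_size)
        scores.append(score(tokens))
    return scores, mean(scores), pstdev(scores)


# Hypothetical stand-ins: replay the three values printed in the post
# instead of sampling a real corpus.
_replayed = iter([181.166666667, 186.277777778, 185.5])


def fake_draw_sample(sample_size):
    # Ignores sample_size; returns the next replayed value as the "sample".
    return next(_replayed)


def fake_score(tokens):
    # The replayed value already is the score.
    return tokens


scores, avg, sd = summarize_samples(fake_draw_sample, fake_score, 500, 3)
print("Average expected no of sentiment tokens: %s" % avg)
print("Standard deviation of sentiment tokens: %s" % sd)
```

In the original function the equivalent change would be to move `s = []` above the `for` line and leave `s.append(...)` inside the loop.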