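I have a survey application in which I need to cluster the responses to detect signs of coherence or decoherence.

I'm using AI4R, and my code looks like this (the sample is taken from the AI4R examples):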

require 'ai4r'
include Ai4r::Data
include Ai4r::Clusterers

# 5 questions on a post-training survey
questions = [   "The material covered was appropriate for someone with my level of knowledge of the subject.", 
                "The material was presented in a clear and logical fashion", 
                "There was sufficient time in the session to cover the material that was presented", 
                "The instructor was respectful of students", 
                "The instructor provided good examples"]

# Answers to each question go from 1 (bad) to 5 (excellent).
# The answers array has one element per completed survey.
# Each completed survey is in turn an array with the answer to each question.
answers = [ 
            [ 1, 2, 3, 2, 2],   # Answers of person 1
            [ 5, 5, 3, 2, 2],   # Answers of person 2
            # ... one row per completed survey
          ]

data_set = DataSet.new(:data_items => answers, :data_labels => questions)

# Let's group the answers into 4 clusters
clusterer = Diana.new.build(data_set, 4)

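This in turn lets me create a chart like the following (the survey contains questions related to each topic/axis).

[Image: chart of the clustered survey responses]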

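The problem is that you currently have to choose the number of clusters to pass to AI4R. How can I use Ruby to detect the number of clusters? (The question really boils down to statistics...)


Enter the elbow method...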

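On Wikipedia I came across a technique called the elbow method (illustration from Wikipedia):

[Image: elbow method illustration from Wikipedia]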

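It plots the variance explained by the clusters against the number of clusters, and you pick the point where adding another cluster stops giving a meaningful gain. This technique fits my needs well, but I don't know how to implement it in Ruby. (I did ANOVA as an undergraduate, so I understand what the terms mean, but that is as far as it goes. I may also need to cross-post this on a statistics forum.)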

Is there a Ruby library that helps with this that I haven't stumbled upon yet, or how else could this be solved within the Ruby ecosystem?
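
To make the question concrete, here is a minimal plain-Ruby sketch of what I understand the elbow computation to be. It is an assumption-laden illustration, not AI4R's API: the helper names `sum_of_squares`, `explained_variance`, and `elbow_k` are hypothetical, clusters are passed in as plain arrays of points, and the "elbow" is picked with a simple largest-drop-in-gain heuristic, which is only one of several ways to read the curve.

```ruby
# Sum of squared Euclidean distances from each point to the mean of the group.
def sum_of_squares(points)
  return 0.0 if points.empty?
  dims = points.first.size
  mean = (0...dims).map { |d| points.sum { |p| p[d] }.to_f / points.size }
  points.sum { |p| (0...dims).sum { |d| (p[d] - mean[d])**2 } }
end

# Fraction of the total variance explained by a partition.
# clusters: array of clusters, each cluster an array of points (arrays of numbers).
def explained_variance(clusters)
  total = sum_of_squares(clusters.flatten(1))
  return 1.0 if total.zero?
  within = clusters.sum { |c| sum_of_squares(c) }
  1.0 - within / total
end

# Heuristic elbow (an assumption, not a standard definition):
# ratios[i] is the explained variance for k = i + 1 clusters.
# Pick the k after which the marginal gain drops the most.
def elbow_k(ratios)
  return 1 if ratios.size < 3
  best = (1...ratios.size - 1).max_by do |i|
    (ratios[i] - ratios[i - 1]) - (ratios[i + 1] - ratios[i])
  end
  best + 1
end
```

With AI4R, the idea would be to build a clusterer for each candidate k, collect the resulting groups, and feed them to `explained_variance`. I am assuming the clusterer exposes its groups through something like `clusterer.clusters.map(&:data_items)`; I have not verified that against the library.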
