
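I would like to add two extra measures to the output of the inspect() function in the arules package: the Kulczynski measure and the imbalance ratio. Can you tell me where to find the code for inspect() and how to modify it?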

Thanks


2 Answers


The imbalance ratio is simple:

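library(arules)
data("Income")
rules <- apriori(Income)

# supports of the left-hand side (A), the right-hand side (B), and the whole rule (A and B together)
suppA <- support(lhs(rules), transactions = Income)
suppB <- support(rhs(rules), transactions = Income)
suppAB <- quality(rules)$support
# imbalance ratio: |supp(A) - supp(B)| / (supp(A) + supp(B) - supp(AB))
quality(rules)$imbalance <- abs(suppA - suppB)/(suppA + suppB - suppAB)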

inspect(head(rules))
  lhs                                 rhs                               support confidence     lift  imbalance
1 {}                               => {language in home=english}      0.9128854  0.9128854 1.000000 0.08711460
2 {occupation=clerical/service}    => {language in home=english}      0.1127109  0.9292566 1.017933 0.85905934
3 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568  0.8636884 1.224731 0.80032206
4 {dual incomes=no}                => {marital status=married}        0.1400524  0.9441176 2.447871 0.60243632
5 {dual incomes=no}                => {language in home=english}      0.1364165  0.9196078 1.007364 0.82670231
6 {occupation=student}             => {marital status=single}         0.1449971  0.8838652 2.160490 0.57235054
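
If the transactions are not at hand, the two supports can also be recovered from the quality columns alone, because confidence = supp(AB)/supp(A) and lift = confidence/supp(B). A minimal base-R sketch under that assumption; the three rows are copied from the support, confidence and lift columns of the output above, and the name q is purely illustrative:

```r
# Quality columns for rules 1, 4 and 6 from the output above.
# `q` is an illustrative name, not an arules object.
q <- data.frame(
  support    = c(0.9128854, 0.1400524, 0.1449971),
  confidence = c(0.9128854, 0.9441176, 0.8838652),
  lift       = c(1.000000,  2.447871,  2.160490)
)

# confidence = supp(AB) / supp(A)  =>  supp(A) = support / confidence
suppA <- q$support / q$confidence
# lift = confidence / supp(B)      =>  supp(B) = confidence / lift
suppB <- q$confidence / q$lift

# imbalance ratio: |supp(A) - supp(B)| / (supp(A) + supp(B) - supp(AB))
q$imbalance <- abs(suppA - suppB) / (suppA + suppB - q$support)
print(q)
```

With a real rule set, q <- quality(rules) would take the place of the hand-typed data.frame. The results are only as precise as the rounded columns they are derived from.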

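The Kulczynski measure, 1/2 (P(A|B) + P(B|A)), is a bit trickier. P(B|A) is just the confidence of A->B. For P(A|B), however, we need the confidence of B->A. So we create a new set of rules with the left- and right-hand sides swapped and compute its confidence: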

 confAB <- quality(rules)$confidence
 # reversed rules B -> A: swap the left- and right-hand sides
 BArules <- new("rules", lhs = rhs(rules), rhs = lhs(rules))
 confBA <- interestMeasure(BArules, "confidence", transactions = Income)
 quality(rules)$kulczynski <- .5*(confAB + confBA)

 inspect(head(rules))
    lhs                                 rhs                               support confidence     lift  imbalance kulczynski
  1 {}                               => {language in home=english}      0.9128854  0.9128854 1.000000 0.08711460  0.9564427
  2 {occupation=clerical/service}    => {language in home=english}      0.1127109  0.9292566 1.017933 0.85905934  0.5263616
  3 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568  0.8636884 1.224731 0.80032206  0.5095922
  4 {dual incomes=no}                => {marital status=married}        0.1400524  0.9441176 2.447871 0.60243632  0.6536199
  5 {dual incomes=no}                => {language in home=english}      0.1364165  0.9196078 1.007364 0.82670231  0.5345211
  6 {occupation=student}             => {marital status=single}         0.1449971  0.8838652 2.160490 0.57235054  0.6191456
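
Since the confidence of B->A is just supp(AB)/supp(B), the Kulczynski measure can also be computed directly from the supports, without building the reversed rules. A sketch that wraps both measures in one function, assuming the arules package is installed; add_ir_kulc is a made-up helper name, not part of arules:

```r
library(arules)

# Hypothetical helper (not an arules function): adds the imbalance ratio
# and the Kulczynski measure as extra columns of the quality data.frame.
add_ir_kulc <- function(rules, transactions) {
  suppA  <- support(lhs(rules), transactions = transactions)  # supp(A)
  suppB  <- support(rhs(rules), transactions = transactions)  # supp(B)
  suppAB <- quality(rules)$support                            # supp(AB)

  quality(rules)$imbalance  <- abs(suppA - suppB) / (suppA + suppB - suppAB)
  # P(B|A) = supp(AB) / supp(A),  P(A|B) = supp(AB) / supp(B)
  quality(rules)$kulczynski <- 0.5 * (suppAB / suppA + suppAB / suppB)
  rules
}

data("Income")
rules <- add_ir_kulc(apriori(Income), Income)
inspect(head(sort(rules, by = "kulczynski")))
```

Because the new columns live in the quality data.frame, sort() and inspect() can use them like the built-in measures.
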
answered 2015-09-13T14:16:42.400

All you need to do is add extra columns to the quality data.frame. inspect() picks them up automatically. Here is the example from ?interestMeasure:

library(arules)
data("Income")
rules <- apriori(Income)

## calculate a single measure and add it to the quality slot
quality(rules) <- cbind(quality(rules), 
  hyperConfidence = interestMeasure(rules, "hyperConfidence",
     transactions = Income))

inspect(head(sort(rules, by = "hyperConfidence")))

  lhs                                 rhs                                support confidence     lift hyperConfidence
1 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568  0.8636884 1.224731               1
2 {dual incomes=no}                => {marital status=married}        0.1400524  0.9441176 2.447871               1
3 {occupation=student}             => {marital status=single}         0.1449971  0.8838652 2.160490               1
4 {occupation=student}             => {age=14-34}                     0.1592496  0.9707447 1.658345               1
5 {occupation=student}             => {dual incomes=not married}      0.1535777  0.9361702 1.564683               1
6 {occupation=student}             => {income=$0-$40,000}             0.1381617  0.8421986 1.353027               1
answered 2015-09-11T15:01:21.270