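I would like to add two extra measures to the output of the "inspect" function in the arules package: the Kulczynski measure and the imbalance ratio. Could you point me to where the code of the inspect function lives and explain how to modify it?
Thanks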
The imbalance ratio is simple:
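library(arules)
data("Income")
rules <- apriori(Income)
## support of the LHS (A) and RHS (B) itemsets, counted in the transactions
suppA <- support(lhs(rules), transactions = Income)
suppB <- support(rhs(rules), transactions = Income)
## support of the whole rule, i.e. of A and B occurring together
suppAB <- quality(rules)$support
## imbalance ratio: |supp(A) - supp(B)| / (supp(A) + supp(B) - supp(A and B))
quality(rules)$imbalance <- abs(suppA - suppB)/(suppA + suppB - suppAB)
inspect(head(rules))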
  lhs                                 rhs                             support   confidence lift     imbalance
1 {}                               => {language in home=english}      0.9128854 0.9128854  1.000000 0.03082862
2 {occupation=clerical/service}    => {language in home=english}      0.1127109 0.9292566  1.017933 0.69021050
3 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568 0.8636884  1.224731 0.61395923
4 {dual incomes=no}                => {marital status=married}        0.1400524 0.9441176  2.447871 0.35210356
5 {dual incomes=no}                => {language in home=english}      0.1364165 0.9196078  1.007364 0.63837280
6 {occupation=student}             => {marital status=single}         0.1449971 0.8838652  2.160490 0.34123127
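As a sanity check on the formula, the imbalance ratio of the first rule can be recomputed by hand from the numbers printed above. This is only a base-R sketch: `imbalance_ratio` is a hypothetical helper name, not an arules function, and the supports are copied from the listing. For rule 1 the LHS is empty, so supp(A) is 1 and supp(B) equals the rule support.

```r
# Hypothetical helper (not part of arules): imbalance ratio from three supports.
imbalance_ratio <- function(supp_a, supp_b, supp_ab) {
  abs(supp_a - supp_b) / (supp_a + supp_b - supp_ab)
}

# Rule 1: {} => {language in home=english}
supp_a  <- 1          # the empty itemset is contained in every transaction
supp_b  <- 0.9128854  # support of the RHS, equal to the rule support here
supp_ab <- 0.9128854  # rule support as printed

# Formula as written in the code above (minus sign in the denominator)
ir_rule1 <- imbalance_ratio(supp_a, supp_b, supp_ab)
print(ir_rule1)

# Variant with "+ supp_ab" in the denominator, for comparison with the listing
ir_plus <- abs(supp_a - supp_b) / (supp_a + supp_b + supp_ab)
print(ir_plus)
```

With these inputs the formula as written gives roughly 0.0871, while the printed value 0.03082862 is what the `+ supp_ab` variant produces. The imbalance column in the listing may therefore come from an earlier version of the code, so it is worth re-running the snippet rather than relying on the printed numbers.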
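The Kulczynski measure, 1/2 (P(A|B) + P(B|A)), is a little trickier. P(B|A) is just the confidence of A -> B. For P(A|B), however, we need the confidence of B -> A. So we create a new set of rules with the left- and right-hand sides swapped and compute their confidence: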
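## confidence of the original rules A -> B
confAB <- quality(rules)$confidence
## reversed rules B -> A: swap the left- and right-hand sides
BArules <- new("rules", lhs = rhs(rules), rhs = lhs(rules))
## the measure name is passed positionally, which works whether the argument
## is called `method` or `measure` in the installed arules version
confBA <- interestMeasure(BArules, "confidence", transactions = Income)
quality(rules)$kulczynski <- .5*(confAB + confBA)
inspect(head(rules))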
  lhs                                 rhs                             support   confidence lift     imbalance  kulczynski
1 {}                               => {language in home=english}      0.9128854 0.9128854  1.000000 0.03082862 0.9564427
2 {occupation=clerical/service}    => {language in home=english}      0.1127109 0.9292566  1.017933 0.69021050 0.5263616
3 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568 0.8636884  1.224731 0.61395923 0.5095922
4 {dual incomes=no}                => {marital status=married}        0.1400524 0.9441176  2.447871 0.35210356 0.6536199
5 {dual incomes=no}                => {language in home=english}      0.1364165 0.9196078  1.007364 0.63837280 0.5345211
6 {occupation=student}             => {marital status=single}         0.1449971 0.8838652  2.160490 0.34123127 0.6191456
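The Kulczynski column can be cross-checked without building the reversed rules, because the confidence of B -> A is simply supp(A and B) / supp(B). The base-R sketch below does this for rule 2, using values copied from the listing; `kulczynski` is a hypothetical helper name, and supp(B) is taken from rule 1, whose support is the support of {language in home=english} on its own.

```r
# Hypothetical helper (not part of arules): Kulczynski measure from
# the rule support and the supports of the two sides.
kulczynski <- function(supp_a, supp_b, supp_ab) {
  conf_ab <- supp_ab / supp_a  # confidence of A -> B
  conf_ba <- supp_ab / supp_b  # confidence of B -> A
  0.5 * (conf_ab + conf_ba)
}

# Rule 2: {occupation=clerical/service} => {language in home=english}
supp_ab <- 0.1127109            # rule support as printed
conf_ab <- 0.9292566            # rule confidence as printed
supp_a  <- supp_ab / conf_ab    # supp(A) recovered from support and confidence
supp_b  <- 0.9128854            # supp(B), read off rule 1

kulc_rule2 <- kulczynski(supp_a, supp_b, supp_ab)
print(kulc_rule2)
```

The result agrees with the printed value 0.5263616 up to rounding, which suggests that the swapped-rules approach and the direct formula compute the same quantity.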
All you need to do is add extra columns to the quality data.frame; inspect picks them up automatically. Here is the example from ?interestMeasure:
library(arules)
data("Income")
rules <- apriori(Income)
## calculate a single measure and add it to the quality slot
## (measure name passed positionally so the call does not depend on
## whether the argument is called `method` or `measure`)
quality(rules) <- cbind(quality(rules),
  hyperConfidence = interestMeasure(rules, "hyperConfidence",
    transactions = Income))
inspect(head(sort(rules, by = "hyperConfidence")))
  lhs                                 rhs                             support   confidence lift     hyperConfidence
1 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568 0.8636884  1.224731 1
2 {dual incomes=no}                => {marital status=married}        0.1400524 0.9441176  2.447871 1
3 {occupation=student}             => {marital status=single}         0.1449971 0.8838652  2.160490 1
4 {occupation=student}             => {age=14-34}                     0.1592496 0.9707447  1.658345 1
5 {occupation=student}             => {dual incomes=not married}      0.1535777 0.9361702  1.564683 1
6 {occupation=student}             => {income=$0-$40,000}             0.1381617 0.8421986  1.353027 1