
I have two arrays:

import numpy as np
a = np.array(['1','2','3'])
b = np.array(['3','4','1','5'])

I want to calculate the joint entropy. I've found some material suggesting something like this:

import itertools
from functools import reduce
import numpy as np

def entropy(*X):
    # H(X1,...,Xn): sum of -p * log2(p) over every combination of classes,
    # where p is the empirical joint probability of that combination
    return sum(-p * np.log2(p) if p > 0 else 0
               for p in (np.mean(reduce(np.logical_and,
                                        (pred == c for pred, c in zip(X, classes))))
                         for classes in itertools.product(*[set(x) for x in X])))

It seems to work fine when len(a) == len(b), but it fails with an error if len(a) != len(b).

UPD: Arrays a and b were created from the main input, an example of which is:

b:3 p1:1 p2:6 p5:7
b:4 p1:2 p7:2
b:1 p3:4 p5:8
b:5 p1:3 p4:4 

Array a was created from the p1 values. So not every line contains every pK, but every line has the b property. I need to calculate the mutual information I(b, pK) for each pK.
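Since not every line contains every pK, one option (an editor's sketch, not from the original post) is to pair b with a given pK only on the lines where that pK actually appears, and estimate I(b, pK) = H(b) + H(pK) - H(b, pK) from those pairs. The parsing of the sample input into b_vals/p1_vals below is done by hand and is hypothetical:

```python
from collections import Counter
import numpy as np

def mutual_information(x, y):
    """Estimate I(X;Y) = H(X) + H(Y) - H(X,Y) from paired samples."""
    assert len(x) == len(y)
    n = len(x)

    def H(counts):
        # empirical entropy of a Counter of observed values
        p = np.array(list(counts.values())) / n
        return -np.sum(p * np.log2(p))

    return H(Counter(x)) + H(Counter(y)) - H(Counter(zip(x, y)))

# b paired with p1, using only the lines where p1 occurs
# (lines b:3, b:4, b:5 in the sample input above)
b_vals = ['3', '4', '5']
p1_vals = ['1', '2', '3']
print(mutual_information(b_vals, p1_vals))  # log2(3): here b fully determines p1
```

This drops the lines where a given pK is missing instead of padding them, so each I(b, pK) is estimated from a different (and possibly small) subset of lines.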


2 Answers


Assuming you are talking about the joint Shannon entropy, the formula is straightforward:

H(X,Y) = -Σ_{x,y} P(x,y) · log2 P(x,y)

Looking at what you have done so far, the problem is that you are missing P(x,y), the joint probability of the two variables occurring together. It looks like a and b hold the individual probabilities of events a and b, respectively.

The code you posted (mentioned in the comments) has further problems:

  1. Your variables are not of a numeric data type: a = ["1", "2"] is not the same as a = [1, 2]; one holds strings, the other numbers.
  2. The input data must have the same length, i.e. for every x in A there must be a y in B, and you need to know the joint probability P(x,y).
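The missing joint probability P(x,y) can be estimated directly by counting value pairs, assuming the two arrays hold paired observations of equal length (a minimal sketch; the function name is mine):

```python
from collections import Counter
import numpy as np

def joint_entropy(x, y):
    """H(X,Y) = -sum over (x,y) of P(x,y) * log2 P(x,y), from paired samples."""
    assert len(x) == len(y), "need one (x, y) pair per observation"
    counts = Counter(zip(x, y))                            # joint counts
    p = np.array(list(counts.values()), dtype=float) / len(x)  # P(x,y)
    return -np.sum(p * np.log2(p))

a = np.array(['1', '2', '3', '1'])
b = np.array(['3', '4', '1', '3'])
print(joint_entropy(a, b))  # 1.5: pair ('1','3') occurs twice, the others once
```

The string values are fine here, since only the counts of distinct pairs matter, not the values themselves.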
Answered on 2013-09-16T14:13:12.720

Here is an idea:

  • Convert the data to numbers
  • Add padding, for example zeros
import numpy as np
from scipy import stats

a = np.array(['1', '2', '3', '0'])   # padded with '0' to match lengths
b = np.array(['3', '4', '1', '5'])
aa = [int(x) for x in a]
bb = [int(x) for x in b]
# caveat: with two arguments, stats.entropy(pk, qk) normalizes both inputs
# and returns the Kullback-Leibler divergence D(pk || qk), not a joint entropy
je = stats.entropy(aa, bb)
print("joint entropy : ", je)

Output: 0.9083449242695364
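For what it is worth: with two arguments, scipy.stats.entropy normalizes both inputs into probability vectors and returns the Kullback-Leibler divergence D(p || q) between them, not a joint entropy, so the 0.908 above is a divergence between the two normalized value lists. A quick check (editor's sketch):

```python
import numpy as np
from scipy import stats

aa = [1, 2, 3, 0]
bb = [3, 4, 1, 5]

# normalize both lists into probability vectors, as stats.entropy does
p = np.asarray(aa, dtype=float); p /= p.sum()
q = np.asarray(bb, dtype=float); q /= q.sum()

# KL divergence D(p || q); terms with p == 0 contribute 0
mask = p > 0
kl = np.sum(p[mask] * np.log(p[mask] / q[mask]))

print(kl, stats.entropy(aa, bb))  # the two values agree
```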

Answered on 2020-08-09T15:59:45.037