
I would like to discuss which similarity measure meets my requirements. My vectors could look like this:

A = (-4,0,4,null)
B = (-2,null,-4,null)
C = (4,4,4,4)
D = (0,0,0,0)
E = (null,null,null,null)
F = (-4,-4,-4,-4)

The values are activity values in the range -5 to +5. A value of 0 stands for an inactive value, and values near -5 and +5 stand for a highly active value. So I am searching for the right similarity measure.

I want to get the similarity between all combinations of these vectors. I think the similarity between C and F must be 1 and the similarity between C and D must be 0:

C:E = 0
C:F = 1
C:D = 0
A:B = I think something over 0.5

I hope you understand my requirements. My question is now: which similarity measure could meet them?

EDIT:

  • 0 is not the same as null. null really means "not defined"
  • The similarity measure only needs to calculate the similarity between two vectors

1 Answer


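This is fairly complicated. First, to make C and F similar, you would start by taking absolute values. Likewise, it looks like null should be translated to 0.

This leaves the vector elements in the range 0..5 only, which simplifies the problem somewhat.

The question is then how you want to proceed. Component-wise differences are probably a good starting point, and after that you have to decide how to weight them together. A rough guess would be a simple linear combination, or perhaps a quadratic one.

Saying anything really useful about that last step depends heavily on your use case, but I think you gain a lot if you can start by bringing all elements into the 0..5 range.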
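The steps suggested in this answer could be sketched roughly as follows. This is only an illustration under several assumptions: the names `to_magnitudes`, `similarity` and `MAX_ACTIVITY` are made up here, null is represented as Python `None` and mapped to 0 as the answer proposes (even though the question's edit says null is not the same as 0), and dividing the combined difference by 5 to get a score between 0 and 1 is one possible scaling, not something the answer specifies.

```python
from math import sqrt

MAX_ACTIVITY = 5  # assumption: activity values lie in -5..+5, as in the question


def to_magnitudes(vector):
    """Map null (None) to 0 and every other value to its absolute value."""
    return [0 if v is None else abs(v) for v in vector]


def similarity(a, b, quadratic=False):
    """Score in 0..1; 1 means the two vectors have identical magnitudes.

    The component-wise differences are combined either linearly
    (mean absolute difference) or quadratically (root mean square).
    """
    xs, ys = to_magnitudes(a), to_magnitudes(b)
    diffs = [abs(x - y) for x, y in zip(xs, ys)]
    if quadratic:
        distance = sqrt(sum(d * d for d in diffs) / len(diffs))
    else:
        distance = sum(diffs) / len(diffs)
    # Scale by the largest possible difference so the result stays in 0..1.
    return 1 - distance / MAX_ACTIVITY


# The example vectors from the question, with null written as None.
A = (-4, 0, 4, None)
B = (-2, None, -4, None)
C = (4, 4, 4, 4)
D = (0, 0, 0, 0)
E = (None, None, None, None)
F = (-4, -4, -4, -4)

for name, (u, v) in {"C:F": (C, F), "A:B": (A, B), "C:D": (C, D)}.items():
    print(name, round(similarity(u, v), 3), round(similarity(u, v, quadratic=True), 3))
```

With this particular scaling, C:F comes out as 1 and A:B lands above 0.5 (about 0.9 linear, 0.8 quadratic). C:D and C:E come out at about 0.2 rather than the 0 the question asks for, because a value of 4 is still one step away from the maximum of 5; whether that matters, and whether null should really be treated like 0, depends on the use case.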

Answered 2013-04-19T08:16:00.167