I want to express the following formula using LINQ:

Hellinger distance formula
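
For reference, the Hellinger distance between two discrete distributions is usually written as below, and this form agrees with what the code in this thread computes. The symbols H, P and Q are assumed notation; p_i, q_i and k are the quantities the question refers to.

```latex
H(P, Q) = \frac{1}{\sqrt{2}} \sqrt{\sum_{i=1}^{k} \left( \sqrt{p_i} - \sqrt{q_i} \right)^2}
```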

I have the following function:

private double Calc(IEnumerable<Frequency> recording, IEnumerable<Frequency> reading)
{
}

where Frequency is

public class Frequency
{
  public double Probability { get; set; } // the p's and q's in the formula
  public int Strength { get; set; } // the i's in the formula
}

An example call to the function is

public void Caller(){
   IEnumerable<Frequency> recording = new List<Frequency>
                                            {
                                               new Frequency {Strength = 32, Probability = 0.2}, //p32 = 0.2
                                               new Frequency {Strength = 33, Probability = 0.2}, //p33 = 0.2
                                               new Frequency {Strength = 34, Probability = 0.2}, //p34 = 0.2
                                               new Frequency {Strength = 35, Probability = 0.2}, //...
                                               new Frequency {Strength = 41, Probability = 0.2} //...
                                            };

   IEnumerable<Frequency> reading = new List<Frequency>
                                            {
                                               new Frequency {Strength = 34, Probability = 0.2}, //q34 = 0.2
                                               new Frequency {Strength = 35, Probability = 0.2},  //q35 = 0.2
                                               new Frequency {Strength = 36, Probability = 0.2},
                                               new Frequency {Strength = 37, Probability = 0.2},
                                               new Frequency {Strength = 80, Probability = 0.2},
                                            };
   Calc(recording, reading);
}

For example, new Frequency {Strength = 32, Probability = 0.2} means p32 = 0.2 in the Hellinger formula.

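k in the formula will be 100, and if an element is not present in a collection, its value is 0. For example, recording only has values for i = 32, 33, 34, 35, 41, so p_i is zero for every other i from 1 to 100.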
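
As a sanity check for any implementation, the sample data above can be worked by hand, assuming the usual form H = sqrt(½ Σ (√p_i − √q_i)²). Strengths 34 and 35 appear in both lists with equal probability and contribute 0. The six strengths that appear in only one list (32, 33, 41 in recording; 36, 37, 80 in reading) contribute 0.2 each. Every other i contributes 0.

```latex
\sum_{i=1}^{100} \left( \sqrt{p_i} - \sqrt{q_i} \right)^2
  = 6 \times \left( \sqrt{0.2} - 0 \right)^2 = 1.2,
\qquad
H = \sqrt{\frac{1.2}{2}} = \sqrt{0.6} \approx 0.775
```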

My first implementation is

  private double Calc(IEnumerable<Frequency> recording, IEnumerable<Frequency> reading)
  {
     double result = 0;

     foreach (var i in Enumerable.Range(1,100))
     {
        var recStr = recording.FirstOrDefault(a => a.Strength == i);
        var readStr = reading.FirstOrDefault(a => a.Strength == i);
        var recVal = recStr == null ? 0 : recStr.Probability;
        var readVal = readStr == null ? 0 : readStr.Probability;

        result += Math.Pow(Math.Sqrt(recVal) - Math.Sqrt(readVal), 2);
     }

     result = Math.Sqrt(result/2);
     return result;
  }

This is neither efficient nor elegant. I feel the solution could be improved, but I can't think of a better way.

2 Answers

ReSharper turns your function into this:

double result = (from i in Enumerable.Range(1, 100) 
                 let recStr = recording.FirstOrDefault(a => a.Strength == i) 
                 let readStr = reading.FirstOrDefault(a => a.Strength == i) 
                 let recVal = recStr == null ? 0 : recStr.Probability 
                 let readVal = readStr == null ? 0 : readStr.Probability 
                 select Math.Pow(Math.Sqrt(recVal) - Math.Sqrt(readVal), 2)).Sum();


return Math.Sqrt(result / 2);

As Patashu said, you can use a Dictionary<int, Frequency> to get O(1) lookup time:

private double Calc(Dictionary<int, Frequency> recording, Dictionary<int, Frequency> reading)
{
    double result = (from i in Enumerable.Range(1, 100) 
                     // A missing key means probability 0; only index the dictionary when the key exists
                     let recVal = recording.ContainsKey(i) ? recording[i].Probability : 0
                     let readVal = reading.ContainsKey(i) ? reading[i].Probability : 0
                     select Math.Pow(Math.Sqrt(recVal) - Math.Sqrt(readVal), 2)).Sum();

    return Math.Sqrt(result / 2);
}
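
If the caller only has the IEnumerable<Frequency> sequences from the question, the dictionaries can be built once with ToDictionary. What follows is a minimal sketch, assuming each Strength occurs at most once per sequence (ToDictionary throws on duplicate keys). The class name HellingerSketch and the helper Prob are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Frequency
{
    public double Probability { get; set; }
    public int Strength { get; set; }
}

public static class HellingerSketch
{
    // Hypothetical helper: probability stored for strength i, or 0 when absent.
    private static double Prob(Dictionary<int, double> table, int i)
    {
        double p;
        return table.TryGetValue(i, out p) ? p : 0.0;
    }

    public static double Calc(IEnumerable<Frequency> recording, IEnumerable<Frequency> reading)
    {
        // Build each lookup table once; assumes Strength is unique per sequence.
        var rec = recording.ToDictionary(f => f.Strength, f => f.Probability);
        var read = reading.ToDictionary(f => f.Strength, f => f.Probability);

        double sum = Enumerable.Range(1, 100)
            .Select(i => Math.Pow(Math.Sqrt(Prob(rec, i)) - Math.Sqrt(Prob(read, i)), 2))
            .Sum();

        return Math.Sqrt(sum / 2);
    }
}
```

TryGetValue does the existence check and the read in a single lookup, so it avoids the double lookup of ContainsKey followed by the indexer.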
answered 2013-05-06T22:44:41.770

This problem is complicated by the fact that the lists are sparse (we don't have probabilities for all readings). So, first we fix that:

public static IEnumerable<Frequency> FillHoles(this IEnumerable<Frequency> src, int start, int end) {
    IEnumerable<int> range = Enumerable.Range(start, end-start+1);
    var result = from num in range
                 join _freq in src on num equals _freq.Strength into g
                 from freq in g.DefaultIfEmpty(new Frequency { Strength = num, Probability = 0 })
                 select freq;
    return result;
}

This leaves us with a dense series of frequency readings. Now we just need to apply the formula:

// Make the arrays dense
recording = recording.FillHoles(1, 100);
reading = reading.FillHoles(1, 100);
// This is the thing we will be summing
IEnumerable<double> series = from rec in recording
                            join read in reading on rec.Strength equals read.Strength
                            select Math.Pow(Math.Sqrt(rec.Probability)-Math.Sqrt(read.Probability), 2);

double result = 1 / Math.Sqrt(2) * Math.Sqrt(series.Sum());
result.Dump();

I'm not sure whether this will perform any better than what you already have, though.
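
A strength that is missing from both lists contributes (√0 − √0)² = 0 to the sum, so one alternative is to skip the densifying step and visit only the strengths that occur in at least one list. The sketch below assumes every strength lies in the 1-100 range the question uses, and that if a strength is duplicated within a list only its first probability counts. SparseHellinger is an illustrative name, not something from either answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Frequency
{
    public double Probability { get; set; }
    public int Strength { get; set; }
}

public static class SparseHellinger
{
    public static double Calc(IEnumerable<Frequency> recording, IEnumerable<Frequency> reading)
    {
        // ToLookup tolerates missing keys: lookup[key] is an empty sequence.
        var rec = recording.ToLookup(f => f.Strength, f => f.Probability);
        var read = reading.ToLookup(f => f.Strength, f => f.Probability);

        // Every strength present in either list, each visited once.
        var strengths = rec.Select(g => g.Key).Union(read.Select(g => g.Key));

        double sum = strengths
            // FirstOrDefault yields 0.0 when the strength is absent from a list.
            .Select(i => Math.Sqrt(rec[i].FirstOrDefault()) - Math.Sqrt(read[i].FirstOrDefault()))
            .Select(d => d * d)
            .Sum();

        return Math.Sqrt(sum / 2);
    }
}
```

With the question's sample data this visits 8 strengths instead of 100. Whether that matters in practice depends on how often the function is called.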

answered 2013-05-06T23:16:59.373