
I have written a small project in C# that creates and trains neural networks. For more background, see my earlier question: (https://scicomp.stackexchange.com/questions/19481).

The network performs well after enough training, but I realise that the hill-climbing algorithm I wrote myself is probably far from ideal, and I am looking for suggestions for improvement. In particular, can I reach a local optimum with fewer calls to the fitness evaluation function?

There do not seem to be many examples of simple hill-climbing algorithms in C# on the web. There are .NET maths libraries, but I would rather not have to pay for one.

The hill-climbing algorithm works through every weight and every bias in the network to train it, and I run it over the network several times. I have looked into backpropagation, but that seems to apply to a single training example at a time; my training data contains around 7000 examples, and the fitness function evaluates the network's average performance over all of them and returns a continuous (double) score.
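
For context, the fitness function has roughly the following shape. This is only a simplified sketch: TExample and perExampleScore are placeholders for my actual example type and per-example scoring, not the real code.

    //Sketch only: the fitness is the average per-example score over the whole
    //training set (~7000 examples), returned as a continuous double.
    //perExampleScore stands in for however a single example is actually scored.
    public static double AverageFitness<TExample>(Defs.Network network, IReadOnlyList<TExample> examples, Func<Defs.Network, TExample, double> perExampleScore)
    {
        double total = 0;
        foreach (var example in examples)
        {
            total += perExampleScore(network, example);
        }
        return total / examples.Count;
    }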

Here is my current code:

    public static double ImproveProperty(ref double property, double startingFitness, int maxIters, Random r, ref Defs.Network network, Func<Defs.Network, double> fitnessFunction)
    {
        //Record starting values
        var lastFitness = startingFitness;
        var lastValue = property;
        //Randomise magnitude of change to reduce chance 
        //of getting stuck in local optimums
        var magnitude = r.NextDouble();
        var positive = true;
        var iterCount = 0f;
        var magnitudeChange = 5;
        while (iterCount < maxIters)
        {
            iterCount++;
            if (positive)
            {   //Try adding a positive value to the property
                property += magnitude;
                //Evaluate the fitness
                var fitness = fitnessFunction(network);
                if (fitness == lastFitness)
                {   //No change in fitness, increase the magnitude and re-try
                    magnitude *= magnitudeChange;
                    property = lastValue;
                }
                else if (fitness < lastFitness)
                {   //This change decreased the fitness (bad)
                    //Put the property back and try going in the negative direction
                    property = lastValue;
                    positive = false;
                }
                else
                {   //This change increased the fitness (good)
                    //on the next iteration we will try 
                    //to apply the same change again
                    lastFitness = fitness;
                    lastValue = property;
                    //don't increase the iteration count as much
                    //if a good change was made
                    iterCount -= 0.9f;
                }
            }
            else
            {   //Try adding a negative value to the property
                property -= magnitude;
                var fitness = fitnessFunction(network);
                if (fitness == lastFitness)
                {
                    //No change in fitness, increase the magnitude and re-try
                    magnitude *= magnitudeChange;
                    property = lastValue;
                }
                else if (fitness < lastFitness)
                {
                    //This change decreased the fitness (bad)
                    //Now we know that going in the positive direction 
                    //and the negative direction decreases the fitness
                    //so make the magnitude smaller as we are probably close to an optimum
                    property = lastValue;
                    magnitude /= magnitudeChange;
                    positive = true;
                }
                else
                {
                    //This change increased the fitness (good)
                    //Continue in same direction
                    lastFitness = fitness;
                    lastValue = property;
                    iterCount -= 0.9f;
                }
            }
            //Check bounds to prevent math functions overflowing
            if (property > 100)
            {
                property = 100;
                lastFitness = fitnessFunction(network);
                return lastFitness;
            }
            else if (property < -100)
            {
                property = -100;
                lastFitness = fitnessFunction(network);
                return lastFitness;
            }
        }
        return lastFitness;
    }
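
For reference, the outer pass that applies ImproveProperty to every weight and bias looks roughly like this. It is only a sketch: Layers, Weights and Biases are stand-ins for however Defs.Network actually stores its parameters.

    //Sketch of one outer training pass. The Layers/Weights/Biases layout is an
    //assumption for illustration; the real Defs.Network structure differs.
    public static double TrainOnePass(ref Defs.Network network, int maxIters, Random r, Func<Defs.Network, double> fitnessFunction)
    {
        var fitness = fitnessFunction(network);
        foreach (var layer in network.Layers)
        {
            for (int i = 0; i < layer.Weights.Length; i++)
            {   //Climb on each weight in turn, carrying the latest fitness forward
                fitness = ImproveProperty(ref layer.Weights[i], fitness, maxIters, r, ref network, fitnessFunction);
            }
            for (int i = 0; i < layer.Biases.Length; i++)
            {   //Then climb on each bias
                fitness = ImproveProperty(ref layer.Biases[i], fitness, maxIters, r, ref network, fitnessFunction);
            }
        }
        return fitness;
    }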

The fitness function is very expensive, so it should be called as few times as possible. I am looking for any improvement that reaches a local optimum with fewer calls to the fitness function. Getting stuck in a local optimum is not a big problem: I have plotted the fitness function against different weight and bias values in the network, and there usually seem to be only 1-3 local optima. If the network keeps the same fitness over several passes, I could add a parameter to this function to try restarting the hill climb from a random value.
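
For that restart idea, a minimal sketch of what the extra parameter could look like is below. The restart range of [-1, 1) is an arbitrary assumption, and ImprovePropertyWithRestart is just a name for this wrapper, not existing code.

    //Sketch of the restart parameter mentioned above: optionally re-seed the
    //property with a random value, then climb as normal.
    public static double ImprovePropertyWithRestart(ref double property, double startingFitness, int maxIters, Random r, ref Defs.Network network, Func<Defs.Network, double> fitnessFunction, bool restartFromRandom)
    {
        if (restartFromRandom)
        {
            property = r.NextDouble() * 2 - 1;          //arbitrary range [-1, 1)
            startingFitness = fitnessFunction(network); //re-evaluate after the jump
        }
        return ImproveProperty(ref property, startingFitness, maxIters, r, ref network, fitnessFunction);
    }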

1 Answer


This approach will not really scale. You are evaluating a very expensive total fitness function many times just to get a small improvement in a single parameter. That is the whole reason gradient-based methods look at the optimisation problem sample by sample, or more commonly mini-batch by mini-batch: the fitness function decomposes into a sum of per-sample (or per-batch) fitness functions, which lets you compute a small update and take a step in the right direction.
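
As a very rough illustration of that decomposition in your setting (a sketch only, not a gradient method; TExample, perExampleScore and the batch size of 32 are placeholders): because the total fitness is an average of per-example terms, it can be estimated from a small random mini-batch instead of all ~7000 examples.

    //Sketch: a noisy but much cheaper estimate of the full average fitness,
    //computed on a random mini-batch. This illustrates the per-sample
    //decomposition only; it is not backpropagation.
    public static double MiniBatchFitness<TExample>(Defs.Network network, IReadOnlyList<TExample> examples, Func<Defs.Network, TExample, double> perExampleScore, Random r, int batchSize = 32)
    {
        double total = 0;
        for (int i = 0; i < batchSize; i++)
        {
            var example = examples[r.Next(examples.Count)]; //sample with replacement
            total += perExampleScore(network, example);
        }
        return total / batchSize;
    }

A hill-climbing step could accept or reject changes based on such an estimate and only occasionally check against the full fitness, but a proper gradient method will still be far more efficient.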

You should read up a little on the theory. There are plenty of good online resources, for example this online book.

Alternatively, if you are not that interested in the theory behind neural networks and just want to apply them, I would suggest not reinventing the wheel and simply using one of the many open-source neural-network toolkits, e.g. Torch7, Caffe, pylearn2, Lasagne, Keras, theanets ...

answered 2015-04-29T13:19:25.770