3

I'm supposed to implement an artificial neural network (ANN) with 2 input, 2 hidden and 1 output neuron that can solve the XOR problem. The weights of the network should be optimized using an evolutionary algorithm. The activation function for each neuron and the fitness function for each ANN are given. The following picture sums up the problem and introduces the variable names I used:

enter image description here

Now I tried my very best to solve the problem, but even with an evolutionary algorithm using a population size of 1000 ANNs and 2000 generations my best fitness is never better than 0.75. My code includes a ANN class with the neurons, activation and fitness function and a Main class that includes the evolutionary algorithm and that optimizes the weights for the ANNs. Here is the code:

Each ANN is initialized with random weights between -1 and 1 and able to mutate, i.e. return a mutation that differs in one weight wich is chosen randomly.

public class ANN implements Comparable<ANN> {
    private Random rand = new Random();
    public double[] w = new double[6];  //weights: in1->h1, in1->h2, in2->h1, in2->h2, h1->out, h2->out

    public ANN() {
        for (int i=0; i<6; i++) //randomly initialize weights in [-1,1)
            w[i] = rand.nextDouble() * 2 - 1;
    }

    //calculates the output for input a & b
    public double ann(double a, double b) {
        double h1 = activationFunc(a*w[0] + b*w[2]);
        double h2 = activationFunc(a*w[1] + b*w[3]);
        double out = activationFunc(h1*w[4] + h2*w[5]);

        return out;
    }

    private double activationFunc(double x) {
        return 2.0 / (1 + Math.exp(-2*x)) - 1;
    }

    //calculates the fitness (divergence to the right output)
    public double fitness() {
        double sum = 0;
        //test all possible inputs (0,0; 0,1; 1,0; 1,1)
        sum += 1 - Math.abs(0 - ann(0, 0));
        sum += 1 - Math.abs(1 - ann(0, 1));
        sum += 1 - Math.abs(1 - ann(1, 0));
        sum += 1 - Math.abs(0 - ann(1, 1));
        return sum / 4.0;
    }

    //randomly change random weight and return the mutated ANN
    public ANN mutate() {
        //copy weights
        ANN mutation = new ANN();
        for (int i=0; i<6; i++)
            mutation.w[i] = w[i];

        //randomly change one
        int weight = rand.nextInt(6);
        mutation.w[weight] = rand.nextDouble() * 2 - 1;

        return mutation;
    }

    @Override
    public int compareTo(ANN arg) {
        if (this.fitness() < arg.fitness())
            return -1;
        if (this.fitness() == arg.fitness())
            return 0;
        return 1;   //this.fitness > arg.fitness
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == null)
            return false;
        ANN ann = (ANN)obj;
        for (int i=0; i<w.length; i++) {    //not equal if any weight is different
            if (w[i] != ann.w[i])
                return false;
        }
        return true;
    }
}

The Main class has the evolutionary algorithm and uses elitism and rank-based selection to create the next generation of each population, i.e. the 100 best ANNs are copied the remaining 900 are mutations of previously successful ANNs.

//rank-based selection + elitism
public class Main {
    static Random rand = new Random();
    static int size = 1000;                     //population size
    static int elitists = 100;                  //number of elitists

    public static void main(String[] args) {
        int generation = 0;
        ArrayList<ANN> population = initPopulation();
        print(population, generation);

        //stop after good fitness is reached or after 2000 generations
        while(bestFitness(population) < 0.8 && generation < 2000) {
            generation++;
            population = nextGeneration(population);
            print(population, generation);
        }
    }

    public static ArrayList<ANN> initPopulation() {
        ArrayList<ANN> population = new ArrayList<ANN>();
        for (int i=0; i<size; i++) {
            ANN ann = new ANN();
            if (!population.contains(ann))  //no duplicates
                population.add(ann);
        }
        return population;
    }

    public static ArrayList<ANN> nextGeneration(ArrayList<ANN> current) {
        ArrayList<ANN> next = new ArrayList<ANN>();
        Collections.sort(current, Collections.reverseOrder());  //sort according to fitness (0=best, 999=worst)

        //copy elitists
        for (int i=0; i<elitists; i++) {
            next.add(current.get(i));
        }

        //rank-based roulette wheel
        while (next.size() < size) {                        //keep same population size
            double total = 0;
            for (int i=0; i<size; i++)
                total += 1.0 / (i + 1.0);                   //fitness = 1/(rank+1)

            double r = rand.nextDouble() * total;
            double cap = 0;
            for (int i=0; i<size; i++) {
                cap += 1.0 / (i + 1.0);                     //higher rank => higher probability
                if (r < cap) {                              //select for mutation
                    ANN mutation = current.get(i).mutate(); //no duplicates
                    if (!next.contains(mutation))
                        next.add(mutation);     
                    break;
                }
            }
        }       

        return next;
    }

    //returns best ANN in the specified population
    public static ANN best(ArrayList<ANN> population) {
        Collections.sort(population, Collections.reverseOrder());
        return population.get(0);
    }

    //returns the best fitness of the specified population
    public static double bestFitness(ArrayList<ANN> population) {
        return best(population).fitness();
    }

    //returns the average fitness of the specified population
    public static double averageFitness(ArrayList<ANN> population) {
        double totalFitness = 0;
        for (int i=0; i<size; i++)
            totalFitness += population.get(i).fitness();
        double average = totalFitness / size;
        return average;
    }

    //print population best and average fitness
    public static void print(ArrayList<ANN> population, int generation) {       
        System.out.println("Generation: " + generation + "\nBest: " + bestFitness(population) + ", average: " + averageFitness(population));
        System.out.print("Best weights: ");
        ANN best = best(population);
        for (int i=0; i<best.w.length; i++)
            System.out.print(best.w[i] + " ");
        System.out.println();
        System.out.println();
    }
}

Even though, I put quite some thought into this and used techniques that I learned, the result is not satisfying. For some reason the optimal weights seem to drift to -1 for each weight. How does that make sense? Is the range of -1 to 1 for the weights a good choice? Should I also introduce crossovers in addition to the mutations? I know this is a very specific problem, but I would greatly appreciate some help!

4

1 回答 1

2

网络结构不对。如果没有每个节点的偏差或阈值,这个网络就无法解决 XOR 问题。

一个隐藏节点应该对 OR 进行编码,而另一个隐藏节点应该对 AND 进行编码。然后输出节点可以对异或问题进行编码,即 OR 隐藏节点为正,AND 隐藏节点为负。只有当 OR 隐藏节点被激活而 AND 隐藏节点未被激活时,才会产生积极的结果。

我还会增加权重的界限,让 EA 自己找出来。但如果有必要,这取决于网络结构。

如果您想将此网络与隐藏节点和阈值一起使用,请参阅:http ://www.heatonresearch.com/online/introduction-neural-networks-java-edition-2/chapter-1/page4.html

如果您想使用另一个有偏见的网络,请参阅: http: //www.mind.ilstu.edu/curriculum/artificial_neural_net/xor_problem_and_solution.php

于 2015-06-04T16:19:01.523 回答