GA written in Java

2019-01-31 14:51发布

问题:

I am attempting to write a Genetic Algorithm based on techniques I had picked up from the book "AI Techniques for Game Programmers" that uses a binary encoding and fitness proportionate selection (also known as roulette wheel selection) on the genes of the population that are randomly generated within the program in a two-dimensional array.

I recently came across a piece of pseudocode and have tried to implement it, but have come across some problems with the specifics of what I need to be doing. I've checked a number of books and some open-source code and am still struggling to progress. I understand that I have to get the sum of the total fitness of the population, pick a random number between the sum and zero, then if the number is greater than the parents to overwrite it, but I am struggling with the implementation of these ideas.

Any help in the implementation of these ideas would be very much appreciated as my Java is rusty.

回答1:

The following is a complete outline of the GA. I made sure to be very detailed so it can be easily coded to C/Java/Python/..

/* 1. Init population */
POP_SIZE = number of individuals in the population
pop = newPop = []
for i=1 to POP_SIZE {
    pop.add( getRandomIndividual() )
}

/* 2. evaluate current population */
totalFitness = 0
for i=1 to POP_SIZE {
    fitness = pop[i].evaluate()
    totalFitness += fitness
}

while not end_condition (best fitness, #iterations, no improvement...)
{
    // build new population
    // optional: Elitism: copy best K from current pop to newPop
    while newPop.size()<POP_SIZE
    {
        /* 3. roulette wheel selection */
        // select 1st individual
        rnd = getRandomDouble([0,1]) * totalFitness
        for(idx=0; idx<POP_SIZE && rnd>0; idx++) {
            rnd -= pop[idx].fitness
        }
        indiv1 = pop[idx-1]
        // select 2nd individual
        rnd = getRandomDouble([0,1]) * totalFitness
        for(idx=0; idx<POP_SIZE && rnd>0; idx++) {
            rnd -= pop[idx].fitness
        }
        indiv2 = pop[idx-1]

        /* 4. crossover */
        indiv1, indiv2 = crossover(indiv1, indiv2)

        /* 5. mutation */
        indiv1.mutate()
        indiv2.mutate()

        // add to new population
        newPop.add(indiv1)
        newPop.add(indiv2)
    }
    pop = newPop
    newPop = []

    /* re-evaluate current population */
    totalFitness = 0
    for i=1 to POP_SIZE {
        fitness = pop[i].evaluate()
        totalFitness += fitness
    }
}

// return best genome
bestIndividual = pop.bestIndiv()     // max/min fitness indiv

Note that currently you're missing a fitness function (depends on the domain). The crossover would be a simple one point crossover (since you are using a binary representation). Mutation could be a simple flip of a bit at random.


EDIT: I have implemented the above pseudocode in Java taking into consideration your current code structure and notations (keep in mind i am more of a c/c++ guy than java). Note this is in no way the most efficient or complete implementation, I admit I wrote it rather quickly:

Individual.java

import java.util.Random;

public class Individual
{
    public static final int SIZE = 500;
    private int[] genes = new int[SIZE];
    private int fitnessValue;

    public Individual() {}

    public int getFitnessValue() {
        return fitnessValue;
    }

    public void setFitnessValue(int fitnessValue) {
        this.fitnessValue = fitnessValue;
    }

    public int getGene(int index) {
        return genes[index];
    }

    public void setGene(int index, int gene) {
        this.genes[index] = gene;
    }

    public void randGenes() {
        Random rand = new Random();
        for(int i=0; i<SIZE; ++i) {
            this.setGene(i, rand.nextInt(2));
        }
    }

    public void mutate() {
        Random rand = new Random();
        int index = rand.nextInt(SIZE);
        this.setGene(index, 1-this.getGene(index));    // flip
    }

    public int evaluate() {
        int fitness = 0;
        for(int i=0; i<SIZE; ++i) {
            fitness += this.getGene(i);
        }
        this.setFitnessValue(fitness);

        return fitness;
    }
}

Population.java

import java.util.Random;

public class Population
{
    final static int ELITISM_K = 5;
    final static int POP_SIZE = 200 + ELITISM_K;  // population size
    final static int MAX_ITER = 2000;             // max number of iterations
    final static double MUTATION_RATE = 0.05;     // probability of mutation
    final static double CROSSOVER_RATE = 0.7;     // probability of crossover

    private static Random m_rand = new Random();  // random-number generator
    private Individual[] m_population;
    private double totalFitness;

    public Population() {
        m_population = new Individual[POP_SIZE];

        // init population
        for (int i = 0; i < POP_SIZE; i++) {
            m_population[i] = new Individual();
            m_population[i].randGenes();
        }

        // evaluate current population
        this.evaluate();
    }

    public void setPopulation(Individual[] newPop) {
        // this.m_population = newPop;
        System.arraycopy(newPop, 0, this.m_population, 0, POP_SIZE);
    }

    public Individual[] getPopulation() {
        return this.m_population;
    }

    public double evaluate() {
        this.totalFitness = 0.0;
        for (int i = 0; i < POP_SIZE; i++) {
            this.totalFitness += m_population[i].evaluate();
        }
        return this.totalFitness;
    }

    public Individual rouletteWheelSelection() {
        double randNum = m_rand.nextDouble() * this.totalFitness;
        int idx;
        for (idx=0; idx<POP_SIZE && randNum>0; ++idx) {
            randNum -= m_population[idx].getFitnessValue();
        }
        return m_population[idx-1];
    }

    public Individual findBestIndividual() {
        int idxMax = 0, idxMin = 0;
        double currentMax = 0.0;
        double currentMin = 1.0;
        double currentVal;

        for (int idx=0; idx<POP_SIZE; ++idx) {
            currentVal = m_population[idx].getFitnessValue();
            if (currentMax < currentMin) {
                currentMax = currentMin = currentVal;
                idxMax = idxMin = idx;
            }
            if (currentVal > currentMax) {
                currentMax = currentVal;
                idxMax = idx;
            }
            if (currentVal < currentMin) {
                currentMin = currentVal;
                idxMin = idx;
            }
        }

        //return m_population[idxMin];      // minimization
        return m_population[idxMax];        // maximization
    }

    public static Individual[] crossover(Individual indiv1,Individual indiv2) {
        Individual[] newIndiv = new Individual[2];
        newIndiv[0] = new Individual();
        newIndiv[1] = new Individual();

        int randPoint = m_rand.nextInt(Individual.SIZE);
        int i;
        for (i=0; i<randPoint; ++i) {
            newIndiv[0].setGene(i, indiv1.getGene(i));
            newIndiv[1].setGene(i, indiv2.getGene(i));
        }
        for (; i<Individual.SIZE; ++i) {
            newIndiv[0].setGene(i, indiv2.getGene(i));
            newIndiv[1].setGene(i, indiv1.getGene(i));
        }

        return newIndiv;
    }


    public static void main(String[] args) {
        Population pop = new Population();
        Individual[] newPop = new Individual[POP_SIZE];
        Individual[] indiv = new Individual[2];

        // current population
        System.out.print("Total Fitness = " + pop.totalFitness);
        System.out.println(" ; Best Fitness = " + 
            pop.findBestIndividual().getFitnessValue());

        // main loop
        int count;
        for (int iter = 0; iter < MAX_ITER; iter++) {
            count = 0;

            // Elitism
            for (int i=0; i<ELITISM_K; ++i) {
                newPop[count] = pop.findBestIndividual();
                count++;
            }

            // build new Population
            while (count < POP_SIZE) {
                // Selection
                indiv[0] = pop.rouletteWheelSelection();
                indiv[1] = pop.rouletteWheelSelection();

                // Crossover
                if ( m_rand.nextDouble() < CROSSOVER_RATE ) {
                    indiv = crossover(indiv[0], indiv[1]);
                }

                // Mutation
                if ( m_rand.nextDouble() < MUTATION_RATE ) {
                    indiv[0].mutate();
                }
                if ( m_rand.nextDouble() < MUTATION_RATE ) {
                    indiv[1].mutate();
                }

                // add to new population
                newPop[count] = indiv[0];
                newPop[count+1] = indiv[1];
                count += 2;
            }
            pop.setPopulation(newPop);

            // reevaluate current population
            pop.evaluate();
            System.out.print("Total Fitness = " + pop.totalFitness);
            System.out.println(" ; Best Fitness = " +
                pop.findBestIndividual().getFitnessValue()); 
        }

        // best indiv
        Individual bestIndiv = pop.findBestIndividual();
    }
}


回答2:

Why not use an open-source Framework like JGAP: http://jgap.sf.net



回答3:

I have implemented this algorithm by creating a "cumulative fitness array" and binary search, thus reducing the need to iterate through each element in the array during the selection:

  1. For population size N create cumulative fitness array: arr[N].
  2. Set arr[0] := computeFitness(individual[0]).
  3. Then, for each subsequent element: X, arr[X] = arr[X-1] + computeFitness(individual[X]).
  4. Generate a random number between 0 and arr[N] (i.e. the total fitness).
  5. Use a binary search (e.g. Collections.binarySearch) to locate the appropriate index in the cumulative fitness array, and use this index to select the individual.

Note that you only need to create the fitness array at the start of the reproduction phase, and can then re-use it multiple times to perform selections in O(log N) time.

As an aside, note that tournament selection is far easier to implement!



回答4:

The concept you're looking for is called "roulette wheel selection." You don't have an established fitness function yet (you may be implying that the fitness of each individual is the integral value of its chromosome), but when you do a general plan is:

  1. Sum the fitness of the entire population.
  2. Get a random number (call it x) between 0 and the total fitness.
  3. Iterate through the population. For each member:
    1. Subtract the member's fitness from x.
    2. If x is now less or equal to zero, select the current member.

There are other equivalent implementations, but the general idea is to select members with a probability proportional to their fitness.

Edit: A few notes on fitness functions. The representation of a chromosome (in your case as a 32-bit integer) is independent of the fitness function used to evaluate it. For example, binary encodings typically treat the chromosome as a set of bitfields packed into an integral value of appropriate size. Crossover and mutation can then be accomplished by the appropriate bit-masking operations. If you're interested, I can post some simple GA code I have laying around (in C and Python) which uses bitwise operations to implement these functions. I'm not sure how comfortable you are with these languages.



回答5:

I made an extensible implementation in java, in which operators and individual structure is well defined by interfaces that work together. Github repo here https://github.com/juanmf/ga

It has a standard implementation for each operator, and an example problem implementation with a particular Individual/Population structure and a Fitness meter. The example problem Implementation is to find the a good soccer team with players among 20 teams and a budget restriction.

To adapt it to your current problem you need to provide implementations of these interfaces:

juanmf.ga.structure.Gen;
juanmf.ga.structure.Individual;
juanmf.ga.structure.IndividualFactory; 
juanmf.ga.structure.Population;  // Has a basic implementation already
juanmf.ga.structure.PopulationFactory;

In pkg juanmf.grandt you have the example problem implementation classes, and how to publish them, as shown in the code snippet below.

To publish you implementations you just have to return the proper classes from this Spring beans:

/**
 * Make sure @ComponentScan("pkg") in juanmf.ga.App.java includes this class' pkg 
 * so that these beans get registered.
 */
@Configuration
public class Config {

    @Bean(name="individualFactory")
    public IndividualFactory getIndividualFactory() {
        return new Team.TeamFactory();
    }

    @Bean(name="populationFactory")
    public PopulationFactory getPopulationFactory() {
        return new Team.TeamPopulationFactory();
    }

    @Bean(name="fitnessMeter")
    public FitnessMeter getFitnessMeter() {
        return new TeamAptitudeMeter();
    }
} 

Crosser operator has two implementations for the same technique, one sequential and one concurrent which outperforms sequential by far.

Stop condition can be specified. If none is given, it has a default stop condition that stops after 100 generations with no improvements (here you must be careful with elitist, not to loose the best of each generation so as to trigger this stop condition effectively).

So if anyone is willing to give it a try, I'd be glad to help. Anyone is welcome to offer suggestions, and better yet operator implementations :D or any improving pull request.



回答6:

These other questions about roulette wheel selection should help:

  • Roulette wheel selection algorithm
  • Roulette selection in genetic algorithms

In the first one, I've tried to explain how the roulette wheel works. In the second, Jarod Elliott has provided some pseudocode. Combined with Adamski's description of an efficient implementation, these should be sufficient to get something working.



回答7:

Just a point worth mentioning. The Roulette wheel selection (as indicated in the pseudo-code) will not work for minimization problems - it is, however, valid for maximization problems.

Due to the manner in which the most fit individual is selected, the minimization case will not hold. Details are provided in: Computational Intelligence: 2nd edition