Genetic Sudoku Algorithm - Optimizing Mutation
I am in the process of writing a genetic algorithm for solving Sudoku puzzles and was hoping for some input. The algorithm solves puzzles sometimes (about 1 in 10 times on the same puzzle with a maximum iteration of 1,000,000) and I am trying to get a small input in terms of mutation, repopulation and splicing rates. Any input is greatly appreciated as it is completely new to me and I feel like I am not doing things 100% right.
Brief overview of the algorithm
Counts the number of unique values for the numbers 1 through 9 in each column, row, and 3 * 3. Each of these unique values in the subsets is summed and divided by 9, resulting in a float value between 0 and 1. The sum of these values is divisible by 27, providing an overall fitness value ranging from 0 to 1. 1 indicates a puzzle to be solved.
Population size: 100
Roulette. Each node is randomly selected when the nodes containing higher fitness values have a slightly better chance of being selected
Reproduction: Two randomly selected chromosomes / planks replace a randomly selected subset (substrings of rows, columns or 3 * 3). The choice of a subset (which row, column or field) is random. The resulting boards are introduced into the population.
Reproduction rate: 12% of the population per cycle For each iteration, six reproductions, resulting in 12 new chromosomes per cycle of the algorithm.
Mutation: The mutation occurs at a rate of 2 percent of the population after 10 iterations with no improvement in the highest physical condition. Listed below are three mutation methods that have different weights for the probability of selection.
1: swap randomly selected numbers. The method takes two random numbers and swaps them across the board. This method appears to have the greatest impact on growth in the early stages of algorithm growth. 25% chance to choose
2: Make random changes: randomly select two cells and change their values. This method seems to help reduce the algorithm to convergence. % 65 chance to choose
3: Count the number of each value on the board. The allowed board contains the number 9 from every number from 1 to 9. This method takes any number that occurs less than 9 times and randomly changes it with a number that occurs more than 9 times. This seems to have a positive effect on the algorithm, but is only used sparingly. % 10 chance to choose
My main question is at what rate should I apply the mutation method. It seems that by increasing the mutation, I have faster initial results. However, as the result approaches the correct result, I think that the higher rate of change introduces too many bad chromosomes and genes into the population. However, at a lower rate of change, the algorithm seems to converge too early.
The final question is whether there is a better approach to mutation.
source to share
You can reverse the mutation rate over time to get the convergence behavior you described. But I actually think there will probably be more benefits if you change other parts of your algorithm.
The selection of the roulette wheel is applied to a very high degree of selection pressure. This tends to cause a fairly rapid loss of diversity quite early in the process. Choosing a binary tournament is usually the best place to start experimenting. It is a more gradual form of pressure and, just as important, it is much better controlled.
With a less aggressive selection mechanism, you can afford to produce more offspring since you don't have to worry about creating so many close copies of the best one or two people. Instead of 12% of the population producing offspring (perhaps less due to the repetition of the parents in the mating pool), I would go with 100%. You don't have to literally make sure every parent is involved, but just generate the same number of children as you have parents.
Some form of moderate elitism will probably be helpful then, so you don't lose good parents. Maybe keep the best 2-5 of the parent population if they are better than the worst 2-5 children.
With elitism, you can use a higher mutation rate. All three operators look useful. (Note that # 3 is actually a form of local search built into your genetic algorithm, which is often a huge performance win. In fact, you can extend # 3 to a much more complex method that loops until he won't be able to figure out how to make further improvements.)
I don't see an obvious best / worst set of weights for your three mutation operators. I think at this point you are firmly in the realm of experimental parameter tuning. Another idea is to put some knowledge into the process and say, for example, that early in the process, you choose between them at random. Later, as the algorithm gets closer, use the mutation operators that you think are more likely to help you complete the "nearly resolved" boards.
source to share
I once made a pretty competent Sudoku solver using GA. The details are reported (including various views and mutations) here: http://fakeguido.blogspot.com/2010/05/solving-sudoku-with-genetic-algorithms.html
source to share