Patrick Yelkenci – 500787007
Organisms have been evolving since the beginning of times. We always want to be able to simulate things so why not simulate natural selection and the evolution of organisms?
This report will be a guide of the steps that I have taken to create my project, simulating natural selection. I have chosen this subject because Charles Darwin’s theory on evolution interests me a lot. Survival of the fittest, where only the part of the population that has adapted far enough to their environment can survive and breed to produce even better offspring.
To be able to simulate this natural selection, I have done research on what exactly natural selection is, and to actually create this product I had to research and use genetic algorithms for actually making the natural selection. I chose this as a subject to get myself started in artificial intelligence. I have no prior experiences with genetic algorithms at all. I got inspired by a game called “Equilinox” and a youtube channel called “Primer”. In this prototype animals will have a few traits which the next generations can take over and get better at it. The animals most likely to survive will breed and hopefully create an even “better” offspring.
So in this blog i will answer the question, how can I simulate natural selection by using genetic algorithms?
Table of contents
- Natural Selection
- Genetic algorithms
- The project
- The process
Natural selection is a mechanism of evolution, organisms that are more attuned to the environment they live in are more likely to survive. These organisms are likely to pass on the genes that aided them in surviving to their offspring, which causes a species to change and perhaps even diverge over time. Genetic mutations that are beneficial to an individual’s survival are passed on through reproduction. This results in the next generation being even more likely to survive the same environment.
The concept “survival of the fittest” is a core concept in natural selection. This means that an organism that is the most “fit” will be the best-adapted to survive.
Genetic algorithms are algorithms derived from Charles Darwin’s theory of natural selection, or evolution. The algorithms reflect the process of the fittest organisms selected for reproduction. It is based on natural selection. Thus carrying on the fittest organisms will reproduce, and pass on their genes onto the next generation. Since only the fittest organisms reproduce, the best genes will be passed on to the next generation, these will survive easier and keep building the best genes possible. In real life there is always some sort of mutation going on with these genes, this is the reason the passed on genes are always a little bit different.
To simulate genetic algorithms, you first start with an initial population of organisms. These organisms all have different DNA strands filled with different genes.
When two organisms mate, the children will receive a combination of the genes of both parents. This is not an average of the two combined, but they will receive a full gene of parent 1, and the next gene for example of parent 2 and so on. This creates a diversity in the genes of the next generations. Averaging the genes does not happen in real life, and it also gives a wrong trait as the average will probably not be the best available gene.
A simulation of the crossover of genes, can happen in a few different manners. Since there is no real definition of how it happens in real life, these crossovers can be simulated in a 50/50 manner. This is where the chromosome is split in half, and both halves get offered to two different children, so half of parent-1’s genes go to child-1 and the rest of parent-1’s genes go to child-2, and the same with the other parent. This can also be done with a random crossover point, which works basically in the same way.
And then there is a fully random crossover, where for every gene, a “random” parent is chosen just like tossing a coin, and heads is parent one and tails is parent 2. This gives a fully randomized chromosome for the next generation.
So now that we know how the genes can get passed on to the next generations, there is another important factor to genetic algorithms. These are the mutations of the genes. Mutations are random changes to the DNA, These mutations can either be positive or negative.
Even the “bad” mutations are a good thing because they increase the diversity in the newer generations. Of course a bad mutation can make a less fit offspring, but they will eventually just die out. In turn a “good” mutation can enhance the fitness of an organism even more, making it adapt even better to the environment. In a genetic algorithm programme most of the time there will be drawbacks for having certain traits, so that there will always be a certain degree of weighing out traits against eachother. One of these drawbacks could be, the bigger the size, the more energy an organism uses.
There is an environment with a certain amount of food, in this environment an initial population of organisms is placed. Each of these organisms have their own DNA, which holds certain genes, or traits. These traits include the movement speed, the size of the organism and the sense(how far away they can sense food) of the organism. As mentioned before the environment has a certain amount of food which is always the same amount, so that there will always be food for the organisms.
Each of the organisms has a fitness score, this score starts at 0 and will increase per iteration that the organism is alive. Meaning the longer it lives, the higher the fitness and thus meaning that this organism can handle this certain environment better than organisms with a lower fitness score. The fitness score increases based on iterations. Every constant amount of time the iteration increases, and with this the fitness score of each organism alive. Once an organism reaches a certain fitness score they are available to mate with another organism.
Once two organisms that are available to mate are near eachother they will go towards eachother and when close enough they will mate. Each time two organisms mate, two children will come out. These new children will start with a fitness score of 0 again, and they will have carried over genes from both parents, the genes which they get from what parent are totally random. After this a mutation is applied to the genes, this can either be a a positive or negative mutation.
If the organisms do not have any energy, which is drained every second, left they will die. So the organisms that can keep finding food and regain energy and mate will evolve and get even better genes in the new generations.
In this chapter an explanation will be given in how the different parts of the programme written works, or if certain things did not work out. The following chapters will be in order in how the programme was written as well.
At first a simple environment for the organisms to live is generated. This is done by placing cubes, which are further referenced to as tiles. These tiles are placed in a procedurally generated order. These tiles are added to a list, and this list is used in the spawning of the organisms and the food as well. When done like this, it is also easy to increase the size of the environment by having a public variable for the fieldsize, which you can then easily increase or decrease.
When the tiles are all placed it is time to spawn food in the environment. Each tile has a script with a boolean in it. This boolean simply checks if there is already food on a tile, so there would be no duplicates. This was done because this was the easiest and laziest way to check this.
As mentioned before there is always a certain amount of food in the world, meaning that if a piece of food is eaten another piece will be spawned elsewhere in the world. The food is spawned by getting a random number between 0, and the length of the tile list. After that there is a small check on if that tile already has food, if that is not the case food will be spawned on this random picked tile. By doing it this way, all the food can be easily tracked, by using the same boolean if the tile has food. This is a method that is called in update, so the food will always keep spawning if there is less then the amount of food wanted in the world.
The organisms are created with by a populationhandler, which spawns the organisms and keeps track on which organisms are still alive, but more on this later. Each organism needs a few things, these are DNA, a fitness score, energy and energy drainage and last but not least a way for them to move.
Let’s start with the DNA first, DNA is a collective of hereditary information, stored in the form of chromosomes. Chromosomes exist of multiple genes, which in place are the pieces of information about a certain trait.
It goes without saying that the speed trait is how fast an organism moves. This gives organisms the ability to reach things faster than slower organisms, but in turn moving faster also costs more energy per second.
The sense trait is how far an organism can see around itself. Whether it would be food or an organism to mate with. The same goes for speed, if you can see food or an organism to mate earlier than another organism, you can reach it earlier.
The size, well it depicts the size of an organism, a benefit of being large is having more mass and being able to hinder or even push other organisms out of their way. Being bigger costs more energy, but these organisms can hinder another organism to reach the same piece of food.
For the DNA of the organisms, i gave all organisms a DNA script. This script holds the minimum and maximum values an organism can have of every trait. When the initial population is spawned, all organisms get a random value between the minimum and maximum per trait. These random values are then stored in an array in another script which handles the behaviour of the organism. How the newer generations get their genes will be explained later.
Now that the organisms have their traits, we can start on the movement of the organisms. The movement is handled a bit like how the food is placed on the map. While the organisms are wandering and have no “objective”, a random tile will be picked and the organism will wander to that location. When that location is reached it will pick another tile and wander to that tile. This continues until the organism spots either food, or when they are ready to mate, a mate. When this happend they will either move to the food or to the mate.
And as with all living beings, they use energy. The energy usage is based on the formula of kinetig energy which is. 1/2 * mass * velocity^2. So the bigger the mass or the faster the organism moves the more energy it will use. But since the energy cost would be very little it was changed a tiny bit to: 1/2 * mass * velocity^2 + 1, this gave a decent energy usage.
The moment an organism is born, it will start draining energy, and after that it will not stop until it has no energy left. If the organism has no energy left, it will die out of course.
The behaviour of the organisms is regulated by using behaviour states. While the organism is in a certain state, it’s behaviour changes. It switches from state to state regularly, and the usage of this is called a state machine. Wandering is the usual(idle) state, while wandering the behaviour contains looking for mates, looking for food and wandering around with the movement mentioned before. When an organisms spots a mate, it has to move towards it, which it will do in the move to mate state. When it is close enough it will enter the mate state and mate with another organism. When this has happened they will have a short mating cooldown time. From this state the organism can either go back to wandering, or eating depending on if there is any food nearby or not. As mentioned before energy will always drain, and it does not matter in what state the organism finds itself, this state cycle will be broken when there is no energy left. In other words death.
When organisms have reached a certain fitness score, they are ready to mate. As said earlier in this blog, the fitness score increases per iteration that they have lived. When this certain fitness score is reached the populationhandler will insert the organisms ready to mate into an array. The population handler checks if two mates are close enough, this happens by checking if the distance between two organisms is smaller than the biggest sense radius of the two. They then move towards eachother and are ready to mate. The mating is the based on genetic algorithms.
So when the organisms are mating, two arrays are made, one for the genes of the first child and one for the second child. First these arrays will be filled by randomly selecting a parent for each gene, and taking that gene straight from the parent. This is the crossover method used, a fully random gene selection from either parent. So now that both children have their genes, we are ready to start mutating them, because otherwise there will be no diversity.
For the mutation of the genes, there are two variables, the mutation minimum of -1 and the mutation maximum of 1. Then i will take a random number between these values. This random number will then be added to the genes the children already have and thus changing it for better or worse. To make sure that they won’t get a trait that is immensely big, a max value of the biggest value of all traits is also set in the DNA script.
The offspring is then spawned into the world and will start to live just like the other organisms. The parents will then go back to wandering or going after food, and they can’t mate for a while, after a certain amount of time they are ready to mate again.
The initial hypothesis is that a healthy mix of speed and sense while being a smaller size will be the best organism traits. Also the simulation has to be running and seeing changes in the dna traits of the organisms.
The result is that the simulation is indeed working and the organisms are evolving, the size decreases after a few generations and the speed increases. One of the factors that surprised me the most is that the sense went to the highest it could go. Eventually all the organisms are small, fast and have a big sense radius. But the most important part is that there is actually a visible form of evolution.
The question mentioned in the beginning: “how can I simulate natural selection by using genetic algorithms?” has been answered. I started doing research on how genetic algorithms worked, since i already had an understanding of natural selection i did not have to research this. After researching genetic algorithms, and thinking i had understood everything i started working on all the components talked about in this blog. Even though i understood how genetic algorithms were supposed to work, i still underestimated on how much work it would be, and would have loved to have started on the actual programming earlier.
There were a lot of things that could still be done to improve the simulation, for one adding a predator to the simulation would probably change the whole outcome. Adding a predator makes it so that there is another way for the organisms to die, and another thing that they have to adapt to. This may have a big impact on the evolution of the traits in the later iterations of the simulation.
A split environment is also one of the things that could change the end result of the simulation. Of course only if both environments have different settings, which would also make it fun to see if the same traits will evolve most or if there are no differences at all. You could even mix the two together, split the environment and fill one with a predator as well and the other with just the base organism to have a side by side comparison. This way you can see if the predators really have a big impact or not.
The last thing to change could be adding neural networks to each of the organisms. This way they have to learn everything by themselves and this will make the simulation a lot more natural. The organisms that fail to learn in time will die, and the ones that do learn will live. This way the organism makes the choices “by itself” without the programme giving any inputs.
Literature and sources
Mallawaarachchi, V. (2020, 1 maart). Introduction to Genetic Algorithms — Including Example Code. Medium. https://towardsdatascience.com/introduction-to-genetic-algorithms-including-example-code-e396e98d8bf3
Primer. (2018, 15 november). Simulating Natural Selection [Video]. YouTube. https://www.youtube.com/watch?v=0ZGbIKd0XrM&t=265s
Shiffman, D. (z.d.). The Nature of Code. Nature of Code. https://natureofcode.com/book/chapter-9-the-evolution-of-code/