Children (light blue) and parents(dark blue) run in the same generation simulation. The red car is the current best individual, the white ones are the old best individual (evaluated in the same generation).
So I finally implemented differential evolution in my Carvatar project (commit f6aba7e ).
If you followed my other articles about using differential evolution algorithm to fit data to a polynomial you probably are up to par with the theory :) . Of course I implemented it in C++ now and for my real-time simulator so there are a few modifications we will discuss.
First I feel like I ignored the taxonomy part of the algorithm description, the main reason for that is that I consider taxonomy of optimization algorithms to be kind of a waste of time. Nonetheless it has it’s merits for remembering the algorithm and adding a story to our numbers.
So the evolution algorithms story is : We have a population of individuals (each individual has a genome ), they mate and make babies. A baby has a genome that is a combination of it’s parents genomes and it may or may not have a small mutation (modification that is not present in the parents, this happens in nature because the ADN and ARN copy is not perfect). After a generation has passed only the fittest members remain and go on to reproduce. So basically it’s a natural algorithm that maximizes the fitness function.
There are many evolution algorithms variations (based on the way we select the individuals that reproduce, what individuals get to remain in the population, the way they reproduce). The “normal” reproduction is based on selecting a crossover point and mixing the genomes, but this also varies. In our case we have genomes that have real values so to search the space optimally it’s kind of clear we need the reproduction process to be more complex then just crossover points and switching variables, so our reproduction algorithm implies 3 individuals (yeah.. so the story starts to get kind of awkward :)) ) . We make a genome between 2 individuals by a sort of differential average and then that genome gets the normal crossover with a third. Some birds require 3 partners to mate but the algorithm kind of steers from the story. Because it does not stick to the story it actually gets good results for optimizing real values.
Ok so enough theory (for a better understanding, check my previous articles or the internet :) ). Let’s get down to business.
The first AI problem we want to solve using evolution is: “Find the optimal parameters for our basic AI controller that get our best race driver (best time and less wall hits)”. Solving this problem should give us an unbeatable AI (almost perfect) and it’s trajectory will be our best race line (a better approximation that we get by using a smooth filter).
The main steps for implementing this in our real-time system are:
Mate individuals and create Children
Run the race simulation to get the fitness for each individual
Replace the parents with the children with better fitness scores (again this doesn’t fit the natural story :) )
We will achieve this by implementing a simple FSM (state machine).
We will inherit from our RaceManager because the whole thing will actually be a special race for life that is always repeating :) .
Lets’s check our init function:
voidDifferentialEvolution::init(){//Read the values from our TOML fileloadRaceFromTOML("racesetup.TOML");//Set the first state to create childrenm_currentState=DE_CREATECHILDREN;//All our racers (parents and children)m_numRacers=m_populationSize*2;// Creating the controllers, and carsm_controllers=newIController*[m_numRacers];m_childrenControllers=&m_controllers[m_populationSize];for(unsignedinti=0;i<m_numRacers;i++){TopdownCar*car=newTopdownCar(i,b2Color(0.3f,0.5f,0.8f));car->setPosition(TRACK->getSectorPoint(0).center);m_controllers[i]=newBasicAIController();m_controllers[i]->initController(car);// Our first guys are the parentsif(i<=m_populationSize){floatparamsInit[EBASICAI_NUM];//We init the parameters with smart random values// we could do this with actually random values, without knowing the interval// but that implies a bigger population size and a lower convergence rateparamsInit[EBASICAI_ANGLETOTURN]=randomInterval(0,60);paramsInit[EBASICAI_LOOKAHEAD_DISTANCE]=randomInterval(2,20);paramsInit[EBASICAI_ANGLETOTURNSPEEDINFLUENCE]=1.0f/randomInterval(1,10000);paramsInit[EBASICAI_MAXSPEED]=randomInterval(0,110)/100.0f;paramsInit[EBASICAI_DISTANCETOFRONTWALL_STOP]=1.0f/randomInterval(1,1000);//Set our parameters((BasicAIController*)m_controllers[i])->setParams(paramsInit);}else// These are the children{m_controllers[i]->getCar()->setDebugColor(b2Color(0.6f,0.6f,0.6f));//children}}// We don't have a best candidate yetm_bestCandidate=NULL;}
We create all our controllers and cars. We have 2 * population size racers because we want to have the parents race their children so we can see if they are better or not. We hold the parents first and then our children. We also use a pointer to hold the beginning of our children array so it’s all clean and easy to follow.
The loadRaceFromTOML function just loads the parameters from the toml file. I’ll paste it here so you can follow the parameters through the code:
1 2 3 4 5 6 7 8 9101112131415
voidDifferentialEvolution::loadRaceFromTOML(constchar*filename){// Load our parameters from the TOML filestd::ifstreamifs(filename);toml::Parserparser(ifs);toml::ValuedocumentRoot=parser.parse();toml::Value*raceSettings=documentRoot.find("DifferentialEvolution");m_numLaps=raceSettings->find("lap_number")->as<int>();m_maxRaceTime=raceSettings->find("max_race_time")->as<int>();m_populationSize=raceSettings->find("population_size")->as<int>();m_maxIterations=raceSettings->find("max_iterations")->as<int>();m_differentialWeight=raceSettings->find("differential_weight")->as<double>();m_crossoverProbability=raceSettings->find("crossover_probability")->as<double>();}
Our state machine looks like this (Note it is as simple as possible, for bigger FSMs it would be better to separate the state transitions from the state execution code):
voidDifferentialEvolution::updateControllers(){switch(m_currentState){caseDE_CREATECHILDREN:{// Decrease our iteration numberm_maxIterations--;// Create the actual childrenstepDifferential();if(m_maxIterations>0){// We go into the simulation phasem_currentState=DE_RUNSIMULATION;m_isRaceEnded=false;m_currentRaceTicks=0;}else{// We need to stop (we reached our number of generations)m_currentState=DE_END;printf("Best candidate params are: ");prinfvector(m_bestCandidate->getParams(),EBASICAI_NUM);printf("Best candidate params are: ");}break;}caseDE_RUNSIMULATION:{// We run the simulation (race) until it endsRaceManager::updateControllers();// Color the best individual (eyecandy :) )colorBestFit();if(m_isRaceEnded==true){m_currentState=DE_EVOLVE;}break;}caseDE_EVOLVE:{evolve();//Reset racersfor(unsignedinti=0;i<m_numRacers;i++){TopdownCar*car=m_controllers[i]->getCar();car->setPosition(TRACK->getSectorPoint(0).center);car->reset();}m_currentState=DE_CREATECHILDREN;break;}caseDE_END:{// DO NOTHINGbreak;}}}
Ok so the meat of the Differential algorithm is the stepDifferential function. Where we actually create children for each parent by considering their BasicAiController parameters as genomes:
voidDifferentialEvolution::stepDifferential(){for(unsignedinti=0;i<m_populationSize;i++){intpickIdx[3];pickIdx[0]=pickUnique(pickIdx,0);pickIdx[1]=pickUnique(pickIdx,1);pickIdx[2]=pickUnique(pickIdx,2);// pick random RintR=randomInterval(0,EBASICAI_NUM);BasicAIController*x=(BasicAIController*)m_controllers[i];// CrossoverBasicAIController*individualA=(BasicAIController*)m_controllers[pickIdx[0]];BasicAIController*individualB=(BasicAIController*)m_controllers[pickIdx[1]];BasicAIController*individualC=(BasicAIController*)m_controllers[pickIdx[2]];floatnewParams[EBASICAI_NUM];memcpy(newParams,x->getParams(),sizeof(newParams));for(unsignedintj=0;j<EBASICAI_NUM;j++){floatpickRi=randomInterval(0,10000)/10000.0f;if(pickRi<m_crossoverProbability||j==R){newParams[j]=individualA->getParams()[j]+m_differentialWeight*(individualB->getParams()[j]-individualC->getParams()[j]);}}((BasicAIController*)m_childrenControllers[i])->setParams(newParams);((BasicAIController*)m_childrenControllers[i])->getCar()->setDebugColor(b2Color(0.6f,0.6f,0.6f));}}
The pickUnique function is similar to the one we wrote in javascript. Picks a random index that was not picked before (not in the blackList array):
We may change the ComputeFitness function as we wish (by changing the weights parameters). The fitness values we use are actually the number of crashes, the inverse time the car took to make a lap and the distance it managed to travel. I feel that the empiricalWeights values should be read from a TOML but I am not yet convinced of this :) .
So this was it, enjoy my results in the youtube video above.