GP Applications • Two main areas of research • Testing genetic programming in areas other techniques have been applied to. • Applying genetic programming to problems that have not been previously solved. • Examples of applications • Data mining • Image processing • Computer graphics • Natural language processing • Board games
GP Parameters • Standard parameters • Population size (M) • Number of generations • Tournament size • Application rates for each of the genetic operators • Maximum tree depth • Bound • Maximum offspring size • These parameters are varied in an attempt to find a solution.
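The parameter list above could be gathered into a single configuration object so that each run records exactly what was varied. This is a minimal sketch: the class name `GPParams`, the field names, and all default values are illustrative assumptions, not values taken from the slides. "Bound" is listed in the source without a definition, so it is left unset here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GPParams:
    """Hypothetical container for the standard GP parameters."""
    population_size: int = 500      # M (illustrative default)
    generations: int = 50           # illustrative default
    tournament_size: int = 4        # illustrative default
    # Application rate for each genetic operator; rates should sum to 1.
    operator_rates: Dict[str, float] = field(
        default_factory=lambda: {
            "crossover": 0.8,
            "mutation": 0.1,
            "reproduction": 0.1,
        }
    )
    max_tree_depth: int = 6         # illustrative default
    bound: Optional[float] = None   # undefined in the source; left unset
    max_offspring_size: int = 100   # illustrative default

    def validate(self) -> None:
        """Raise ValueError if the settings are inconsistent."""
        if self.population_size <= 0 or self.generations <= 0:
            raise ValueError("population size and generations must be positive")
        if not 1 <= self.tournament_size <= self.population_size:
            raise ValueError("tournament size must be in [1, population size]")
        if abs(sum(self.operator_rates.values()) - 1.0) > 1e-9:
            raise ValueError("operator application rates must sum to 1")


params = GPParams(population_size=200, tournament_size=7)
params.validate()
```

Keeping the validation next to the parameters catches an inconsistent setting (for example a tournament larger than the population) before a run starts.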
Reporting on GP Results • GP reports must specify: • A description of the objective. • The terminal set used. • The function set used. • The fitness cases used. • The raw fitness measure. • Hits criterion. • The population size. • The number of generations. • The success predicate used. • The method used to create the initial population. • The seeds of the random number generator used on successful runs together with the corresponding solution found on each of these runs.
Describing the Performance of a GP System • Hits histogram • Standardized fitness histogram • Structural complexity histogram • Variety histogram • Number of runs that must be performed to find a solution. • Calculating the computational effort needed to find a solution.
Creating a Hits Histogram • A hits histogram is created for each generation. • Count the number of individuals that have n hits. • n usually ranges from 0 to the number of fitness cases. • Plot the number of individuals with n hits against each n value.
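The counting step for one generation can be sketched as follows. The function name `hits_histogram` and the sample hit counts are made up for illustration. The input is assumed to be a list holding the number of hits scored by each individual in the generation.

```python
from collections import Counter


def hits_histogram(hits_per_individual, num_fitness_cases):
    """Return a list whose entry n is the number of individuals with n hits.

    n runs from 0 to num_fitness_cases inclusive.
    """
    counts = Counter(hits_per_individual)
    return [counts.get(n, 0) for n in range(num_fitness_cases + 1)]


# Hypothetical generation of 8 individuals evaluated on 5 fitness cases.
generation_hits = [0, 2, 2, 3, 5, 5, 5, 1]
histogram = hits_histogram(generation_hits, 5)
print(histogram)  # [1, 1, 2, 1, 0, 3]
```

The resulting list can be passed directly to a bar-chart routine, with the index as the n value on the horizontal axis.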
Creating a Standardized Fitness Histogram • Illustrates the standardized fitness over an entire run. • The standardized fitness is averaged for each generation. • Plot the average standardized fitness against each generation.
Creating a Structural Complexity Histogram • Illustrates the structural complexity over an entire run. • Calculate the size, i.e. the number of nodes, of each individual in a generation. • Calculate the average tree size for each generation. • Plot the average tree size for each generation.
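The size calculation above can be sketched as a recursive node count. The tree representation is an assumption made for this example: a function node is a tuple `(name, child, ...)` and anything else is a terminal. The function names are hypothetical.

```python
def tree_size(tree):
    """Count the nodes in a tree.

    Assumed representation: a tuple is a function node whose first
    element is the function name and whose remaining elements are its
    children; any non-tuple value is a terminal (one node).
    """
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(tree_size(child) for child in tree[1:])


def average_tree_size(population):
    """Average number of nodes per individual in one generation."""
    return sum(tree_size(tree) for tree in population) / len(population)


# Hypothetical generation of three individuals.
population = [
    ("+", "x", ("*", "x", 1)),  # 5 nodes
    "x",                        # 1 node
    ("sin", "x"),               # 2 nodes
]
print(average_tree_size(population))
```

Calling `average_tree_size` once per generation gives the series to plot.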
Variety • [Figure: two example populations, one with variety = 60% and one with variety = 100%]
Creating a Variety Histogram • Illustrates the variety over an entire run. • Calculate the variety percentage for each generation. • Plot the variety percentage against each generation.
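The slides do not spell out how the variety percentage is computed. The sketch below assumes it is the number of distinct individuals divided by the population size, expressed as a percentage. Individuals are assumed to be hashable, such as strings or nested tuples. The function name is hypothetical.

```python
def variety_percentage(population):
    """Percentage of distinct individuals in a population.

    Assumption: variety = 100 * (distinct individuals) / (population size).
    """
    if not population:
        raise ValueError("population must not be empty")
    return 100 * len(set(population)) / len(population)


# Hypothetical populations of five individuals each.
with_duplicates = ["a", "a", "b", "b", "c"]   # 3 distinct of 5
all_distinct = ["a", "b", "c", "d", "e"]      # 5 distinct of 5

print(variety_percentage(with_duplicates))  # 60.0
print(variety_percentage(all_distinct))     # 100.0
```

If a different definition of variety is in use, only the body of `variety_percentage` needs to change. The per-generation plot stays the same.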
Calculating the Number of Runs Needed • The probability, x, that a solution will be found in R independent runs of the GP algorithm is: • $x = 1 - (1 - P(M, i))^{R}$, where $P(M, i)$ is the cumulative probability of success by generation i using a population of size M. • P(M, i) is calculated by counting the runs that succeeded on or before generation i and dividing by the total number of runs conducted. • The number of runs needed is given by: • $R(x, M, i) = \left\lceil \frac{\log(1 - x)}{\log(1 - P(M, i))} \right\rceil$
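The two calculations above can be sketched as follows. The function names and the sample run data are hypothetical and are not results from the slides. The sketch estimates P(M, i) from a set of trial runs and then applies the runs-needed formula with the ceiling.

```python
import math


def cumulative_success_probability(success_generations, total_runs, i):
    """Estimate P(M, i): the fraction of runs that succeeded on or before generation i.

    success_generations holds, for each successful run, the generation
    at which it first found a solution.
    """
    successes = sum(1 for g in success_generations if g <= i)
    return successes / total_runs


def runs_needed(x, p):
    """R = ceil(log(1 - x) / log(1 - p)) for target probability x and P(M, i) = p."""
    if not 0 < x < 1:
        raise ValueError("x must be strictly between 0 and 1")
    if p <= 0:
        raise ValueError("no run succeeded by generation i; R is unbounded")
    if p >= 1:
        return 1  # every run succeeds, so one run is enough
    return math.ceil(math.log(1 - x) / math.log(1 - p))


# Hypothetical data: 20 runs, 5 of which succeeded at these generations.
success_generations = [12, 15, 9, 30, 14]
p = cumulative_success_probability(success_generations, total_runs=20, i=15)
print(p)                     # 0.2
print(runs_needed(0.99, p))  # 21
```

With these made-up figures, four of the twenty runs succeed by generation 15, so P(M, 15) = 0.2 and 21 runs are needed for a 99% chance of finding a solution.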
Calculating Computational Effort • The number of individuals that must be examined as part of the search for the generational control model, i.e. the computational effort, is then calculated using: • f(x, M, i) = R(x, M, i)*M*i • The following equation is used to calculate the number of individuals that will be examined in a steady-state system with a fixed population size: • f(x, M, i) = R(x, M, i) * i
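Both effort formulas above can be written as small functions. The function names and input values are hypothetical. R is assumed to have been computed already from the runs-needed formula.

```python
def effort_generational(runs, population_size, generation):
    """f(x, M, i) = R(x, M, i) * M * i for the generational control model."""
    return runs * population_size * generation


def effort_steady_state(runs, generation):
    """f(x, M, i) = R(x, M, i) * i for a steady-state system, as given in the source."""
    return runs * generation


# Hypothetical inputs: R = 21 runs, M = 500 individuals, i = 15 generations.
print(effort_generational(21, 500, 15))  # 157500
print(effort_steady_state(21, 15))       # 315
```

These functions follow the formulas exactly as stated above, multiplying by i. Treatments that number the initial population as generation 0 multiply by i + 1 instead, so the generation-numbering convention should be checked before comparing effort figures across reports.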
Example How many runs are needed to find a solution with a 99% probability by generation 15? What is the computational effort needed?