Pikaia

Download here: http://astro.ufl.edu/~cmancone/pros/pikaia.pro

This is an IDL port of the Pikaia optimization algorithm. This algorithm was developed and is maintained by the guys at the High Altitude Observatory. This port is from the original version of pikaia, 1.0. This IDL version is optimized for speed with built-in IDL array operations, effecting a roughly 20x speed up from a direct transliteration to IDL. Of course, the original fortran is probably still 10-20 times faster than even this port, so if you are really worried about speed, use fortran. I made a couple changes to the original algorithm as I was speeding stuff up.

1). Most importantly, the elitism flag and the reproduction plan are completely ignored. The only reproduction plan I have included in this port is the steady-state replace worst option. This is done no matter what else you specify. Since the elitism flag doesn't apply to this reproduction plan, it is also ignored.

2). There is no clone protection - i.e. when parents are chosen, a parent can be chosen to reproduce with itself. Although I haven't tested to see what effect this has, I imagine it will be very small for larger populations.

3). By default the user supplied function 'func' is called in order to calculate the fitness of a particular set of values. However, the name of this function can be changed with the new 'fname' keyword to pikaia.

4). Through the use of the _extra keyword you can pass additional parameters through pikaia to 'func'. These can be retrieved by using the _extra keyword in your function declaration. I've included an example below, and there are more details in the idl help.

5). The data passed to the user supplied function, 'func' has been changed. Originally, pikaia would call:
fitness = func(n,parameters)
With parameters being an array of size n, specifying one point in the search space for which the fitness should be calculated. Now, the same calling sequence is used but parameters is now an nxm array where m is an arbitrary number, and each row in this array corresponds to an independent point in the search space for which the fitness must be calculated. Essentially, rather than processing one set of parameters at a time pikaia now passes large numbers of parameters to func and expects it to calculate the fitness of all of them in one go. Once func calculates the fitness of all the points it has been passed, func should return the fitness of every single point in a row vector of size mx1.

Some Considerations

One of the peculiar quirks of IDL is that it's for loops are very slow. This is due in part because IDL will stop in between loops to check for input and perform other tasks which are completely unrelated to the loop in question. For this reason (and a few others), speed optimization in IDL involves getting rid of for loops and utilizing array operations and some of the various, highly optimized, built in functions in IDL. This is an important consideration because one of the biggest potential bottlenecks for speed in Pikaia is the fitness calculation function supplied by the user. So, this speed-optimized version of pikaia is only as fast as your func. For anyone unfamiliar with vectorization and optimization in IDL, I recommend reading these hints: http://www.dfanning.com/documents/tips.html#IDLWAY

An example

Here's a simple example of putting pikaia to work on a gaussian centered at (.5,.5). In the first example I change the center of the gaussian by passing its center through the _extra keywords. You can download this program here.

function example_calc,n,params,_extra=extra
     ; the fitness function will be a gaussian centered at x=extra.x and y = extra.y
     ; params will be an array of size 2x?
     fitness = exp( -1.0*( (params[0,*]-extra.x)^2 + (params[1,*]-extra.y)^2 ) )

     ; the result has to be transposed because the above expression
     ; returns a column vector, but pikaia expects a row vector
     return,transpose(fitness)
end

function example_calc2,n,params
     ; here's the highly optimized version
     return, exp( -1.0*total( (params-.5)^2, 1 ) )
end

function example_calc3,n,params
     ; here's the highly non-optimized version
     npts = n_elements(params[0,*])
     fitness = fltarr(npts)

     for i=0,npts-1 do fitness[i] = exp( -1.0*( (params[0,i]-.5)^2 + (params[1,i]-.5)^2 ) )
     return,fitness
end

pro pikaia_ex

; the only thing we need to tell pikaia is the function name to use
; and the number of parameters to fit. It will automatically load
; reasonable defaults for ctrl (population size: 100, generations: 200)
n = 2
t1 = systime(/seconds)
pikaia,n,ctrl,x,f,status,fname='example_calc',x=.4,y=.6
print,x

t2 = systime(/seconds)
pikaia,n,ctrl,x,f,status,fname='example_calc2'
print,x

t3 = systime(/seconds)
pikaia,n,ctrl,x,f,status,fname='example_calc3'
print,x

t4=systime(/seconds)
print,t2-t1
print,t3-t2
print,t4-t3
end


The results are:
     0.400000     0.600000
     0.500000     0.500000
     0.500000     0.500000
      0.25064087
      0.22182512
      0.25915694

In this case, the fitness function is simple enough that the time doesn't change all that much between the different results. For comparison, the same calculation with a literal transliteration of pikaia from fortran to IDL takes 4.05 seconds, for a factor of 20 difference.