Settings

Theme

A Story to Explain Genetic Algorithms

jake.simvla.com

51 points by jakeburtn 13 years ago · 34 comments

Reader

jacques_chester 13 years ago

If you're interested in the world of "computational intelligence" aka "nature-inspired computation", this book is a good high level survey: http://www.cleveralgorithms.com/

(You can also find stuff under the heading of "metaheuristics" -- http://cs.gmu.edu/~sean/book/metaheuristics/).

GAs are one of four (!) different independently developed strands of thought -- Genetic Algorithms, Genetic Programming, Evolutionary Programming and Evolutionary Strategies. Look out for that when you hit the Googletrons.

  • jdherman 13 years ago

    Those are some good links, thanks. Are there clear distinctions between these four areas? It seems like the separation is a historical artifact, when they're all basically doing the same thing.

    • jacques_chester 13 years ago

      I mostly looked at GAs and GPs. Evo strategy has some interesting mathematical stuff.

      Basically GA and GP are the dominant strands. And there's a key difference.

      Genetic Algorithms are historically about finding paramaters into a function. You have some function f(X1 ... Xn), and you are trying to find the best set of parameters X1...Xn. So a common representation is a string of floats or integers or whatever, and then you do the crossovers and mutations based on positions in the string.

      Genetic Programming is about creating new functions. So instead of a string of parameters, the genome is (usually) a tree of instructions to be interpreted. You do mutations by swapping node types, crossovers by chopping subtrees and moving them etc.

      One artefact of the historical "parallel evolution" (teehee) of these fields is that GP practitioners use crossovers much less than GA practitioners.

    • randomsearch 13 years ago

      To a large degree you are correct, the different algorithms are often interchangeable. Sometimes this is not the case: e.g. Genetic Programming is used to search for algorithms that are not easily represented and searched for with other evolutionary algorithms.

stcredzero 13 years ago

> Then our Charles simply had to figure out how Mrs Kipling scored the cakes and he could genetically evolve the best cake!

"And then a miracle happens." Isn't the fitness function the "key ingredient?" (It's in the source code, but not in the text for a reason.)

Still a good explanation of the rest.

helloiamdave 13 years ago

Once upon there was another local search algorithm. It was called GA. The end.

  • randomsearch 13 years ago

    Actually, GAs are mostly described as global search algorithms, because they use populations with recombination and therefore simultaneously consider potentially disparate areas of the search space.

  • fmax30 13 years ago

    Once upon a time there was a local search algorithm It was called GA and then it mutated and became a Global Search Algorithm. The End.

jmduke 13 years ago

Unrelated: this simvla network thing looks like a pale imitation of Svbtle.

  • FaisalAbid 13 years ago

    @OP- Hey Jake, awesome article, really understand what Genetic Algorithms are now.

    Also wanted to let the people that think Simvla is a ripoff of Svbtle know that I recently acquired Simvla from its creator and it's my first job to give Simvla a unique look. I respect Svbtle and I feel it's unfair to Dustin that Simvla copies the look he created.

ipince 13 years ago

I would suggest making the fitness function more complex. I think its a bit confusing to see the optimal result be equal to the fitness function's parameters, and it makes it easier to miss the point.

perone 13 years ago

I've written something [1] related to this if someone is interested.

[1] http://pyevolve.sourceforge.net/wordpress/?p=576

jmcdowell 13 years ago

I don't have much of a knowledge of genetic algorithms so it might be good to explicitly say what part of GA algorithms set them apart from other branches of algorithms.

Nice story otherwise.

gweinberg 13 years ago

It's not at all clear from the story why or if having multiple recipes "mate" works better than just using one recipe and fiddling with it.

  • jules 13 years ago

    It doesn't. At least there is zero evidence that it does and there is a lot of evidence that it does not (i.e. simple hill climbing beats GA in every practical situation). I've mentioned this before on HN and reddit, but from time to time these kind of stories come up again. There is a weird fascination with genetic algorithms. Even though dozens of people have a career based on it, as far as I can tell the whole field of biologically inspired optimization algorithms is basically pseudoscience. Note optimization algorithms, so this does certainly not include e.g. neural networks, which work fantastically well.

  • randomsearch 13 years ago

    That's a very insightful comment. The debate over whether recombination gives you anything "extra" continues in academia to this day. For example, papers on the controversy surrounding crossover in Genetic Programming (Luke and Spector in the 90s is a good starting point).

datz 13 years ago

This is stupid. The whole is not simply a sum of its parts in cake baking and biology.

zapdrive 13 years ago

"After thinking long and hard about the problem at hand our Charles came to the conclusion that he knew two things"

Lol... long and hard.

youngerdryas 13 years ago

Neat little story, despite the grammar and capitalization. Is it common to use _ as an index variable?

Edit: for _ in range(4):

  • JosephHatfield 13 years ago

    Not sure how common that is, but I've seen it used before where the index variable itself is not referenced inside the body of the statement.

  • aghull 13 years ago

    Common in Lua, Ruby, Python. In fact, ruby gives _ a minor amount of special meaning to make it more convenient as a throw-away variable, e.g. it can be specified multiple times in an argument list without an error and generates no warning if left unused.

  • monochromatic 13 years ago

    I've seen it before, but it's a little jarring. Not sure what's wrong with just using i.

    • jdiez17 13 years ago

      It says explicitly that you don't care about that value. If you used i, for example, you wouldn't know whether the loop depended on the iteration number.

    • randomsearch 13 years ago

      To avoid unused variable warnings.

  • yen223 13 years ago

    It's better to use two underscores __ to represent throwaway variables, because _ has a meaning in Python: it returns the last output.

    • GuiA 13 years ago

      True, but only in the REPL.

      For example try to run the following script (not from the REPL):

        _ = 12
        print _
        42 * 42
        print _
  • mertd 13 years ago

    Most of the code is a little verbose for Python. For example make() function could simply be:

        return [random.randint(1,10000) for x in range(4)]
    
    Also remember that a little kitten dies every time you write a for loop as follows:

        for index in range(len(some_list)):
    • ipince 13 years ago

      I'm a python noob. Why does a kitten die? You're creating a new list of the same size of the source list (once) and iterating over it. How can you achieve the same w/out creating a new list?

      • mertd 13 years ago

        Python god takes the kitten as a sacrifice for the unnecessary list of indexes you created :).

        range(len(some_list)) is a convoluted way iterating over the elements of a list. What you are doing is creating a new list with elements [0,... lengh-1] and iterating over the elements of this list, only to immediately throw them away after using them as array index to grab the item you really want from the original list.

        Instead you can just directly iterate over your original list: "for item in some_list". If you really need to know the index value, you can use enumerate: "for index, item in enumerate(some_list)"

        One nice thing is that the "for x in y" lets you iterate over things (iterables) that you may be generating on the fly (e.g., all prime numbers smaller than N).

  • mikeash 13 years ago

    _ is a typical name to use when the language requires a variable name that the program doesn't otherwise need. In this case, the program just needs to run some code four times, and doesn't care about the index. The language doesn't have a plain "run this code X times" construct, but it does have a "iterate over this list" construct, so the code iterates over the list [0, 1, 2, 3] and puts the current element into a dummy variable since it's not being used.

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection