As mentioned in my previous post, I’ve been reading “Collective Intelligence” by Toby Segaram and I’m really enjoying in it. It’s different to a lot of programming books, in that rather than focusing a specific language or API it focus on a particular set of problems and shows techniques that can be used to crack them.
For example I’ve known for a long time that it’s easy to create an abstract syntax tree (AST) in F#:
/// Untyped expression tree
type Expression =
| Multiply of Expression * Expression
| Add of Expression * Expression
| Subtract of Expression * Expression
| GreaterThan of Expression * Expression
| If of Expression * Expression * Expression
| Constant of int
| Parameter of int
Here we present a simple AST for representing numeric expressions, while it’s easy to provide a function to evaluate such a tree:
/// Given a list of parameters evaluate the tree
let evaluateExpression parameters =
let rec innerEval tree =
match tree with
| Multiply (x, y) -> innerEval x * innerEval y
| Add (x, y) -> innerEval x + innerEval y
| Subtract (x, y) -> innerEval x - innerEval y
| GreaterThan (x, y) -> if innerEval x > innerEval y then 0 else 1
| If (pred, tval, fval) -> if innerEval pred = 0 then innerEval fval else innerEval tval
| Constant value -> value
| Parameter pos -> List.nth parameters pos
innerEval
I’ve often had problems coming up with nice simple examples of what you might want to do with such an expression. Well “Collective Intelligence” gave me a great idea of what you can do with such an expression: genetic programming. The example we’re going to look at is presented in chapter 11, pages 251 to 268. The idea is straight forward if you can abstractly represent a program as we can with our numeric expression AST then it’s possible to make random changes to our program. Wants we have made the random changes we can measure which programs are successful and use the successful ones to create a new generation of programs. So what kind of problems can we solve with this approach? Well here’s a nice one, suppose we have a set of points:
X Y Result
36 38 1485
17 13 371
13 2 217
24 31 715
13 0 213
38 3 1569
10 11 157
11 32 223
18 25 433
And we want to find out what function fits best to these points, we can use genetic programming to solve this problem. I’ve given a program below that can solve this problem though genetic programming. I don’t want to go into too many details, you should really buy the book if you want a good explanation, but in a nutshell the algorithm works in the following way:
- We create a bunch of random function
- We test how close these get to the predefined points
- If we find a function that fits we return it
- Otherwise we breed the most successful function and loop to the testing step
Anyway the full implementation is here, and version can be downloaded here [EDIT: link fixed]:
#light
open System
open Microsoft.FSharp.Math
/// Untyped expression tree
type Expression =
| Multiply of Expression * Expression
| Add of Expression * Expression
| Subtract of Expression * Expression
| GreaterThan of Expression * Expression
| If of Expression * Expression * Expression
| Constant of int
| Parameter of int
/// Given a list of parameters evaluate the tree
let evaluateExpression parameters =
let rec innerEval tree =
match tree with
| Multiply (x, y -> innerEval x * innerEval y
| Add (x, y -> innerEval x + innerEval y
| Subtract (x, y -> innerEval x - innerEval y
| GreaterThan (x, y -> if innerEval x > innerEval y then 0 else 1
| If (pred, tval, fval -> if innerEval pred = 0 then innerEval fval else innerEval tval
| Constant value -> value
| Parameter pos -> List.nth parameters pos
innerEval
let simplifyExpression =
let rec innerEval tree =
match tree with
| Multiply (Constant 0, _ -> Constant (0
| Multiply (_, Constant 0 -> Constant (0
| Multiply (Constant x, Constant y -> Constant (x * y
| Add (Constant x, Constant y -> Constant (x + y
| Subtract (Constant x, Constant y -> Constant (x - y
| GreaterThan (Constant x, Constant y -> if x > y then Constant 0 else Constant 1
| If (Constant pred, tval, fval -> if pred = 0 then fval else tval
| x -> x
let rec loop tree =
let tree' = innerEval tree
if tree' = tree then
tree
else
loop tree
loop
/// print the expression to the console
let printExpression =
let rec innerPrint ppf tree =
match tree with
| Multiply (x, y -> Printf.fprintf ppf "(%a * %a" innerPrint x innerPrint y
| Add (x, y -> Printf.fprintf ppf "(%a + %a" innerPrint x innerPrint y
| Subtract (x, y -> Printf.fprintf ppf "(%a - %a" innerPrint x innerPrint y
| GreaterThan (x, y -> Printf.fprintf ppf "(%a > %a" innerPrint x innerPrint y
| If (pred, tval, fval -> Printf.fprintf ppf "(if %a then %a else %a" innerPrint pred innerPrint fval innerPrint tval
| Constant value -> Printf.fprintf ppf "%i" value
| Parameter pos -> Printf.fprintf ppf "p%i" pos
innerPrint System.Console.Out
let rand = new Random(
/// build a random expression with limited depth, a maximum constants value,
/// and a limited number of parameters
let buildRandomExpression maxDepth maxConst noParams =
let rec innerBuild curDepth =
if curDepth < maxDepth then
let nextDepth = curDepth + 1
match rand.Next(7 with
| 0 -> Multiply (innerBuild nextDepth, innerBuild nextDepth
| 1 -> Add (innerBuild nextDepth, innerBuild nextDepth
| 2 -> Subtract (innerBuild nextDepth, innerBuild nextDepth
| 3 -> GreaterThan (innerBuild nextDepth, innerBuild nextDepth
| 4 -> If (innerBuild nextDepth, innerBuild nextDepth, innerBuild nextDepth
| 5 -> Constant (rand.Next(maxConst
| 6 -> Parameter (rand.Next(noParams
| _ -> failwith "assert false"
else