Recalculating Values Only When Dependencies Change

I had an interesting exchange of emails with a reader recently. They wanted to be able to execute a function and cache the result and revaluate the function only when one of the values in depends on changes. It’s a common enough problem I guess, but one that’s surprisingly tricky to get right. This is because when you evaluate a function it’s difficult to know what it depends on and therefore difficult to know when the results of the evaluation become dirty.

But between we came up with two different approaches to this problem. The first approach was to reuse idea of a lazy value, something already available in F# libraries but extend it to provide a reset function, which marks the value as dirty and forces it to be recalculated, we called this class ResetableLazy. We then couple this with UpdatableValue class a value that raises a notification each time that its value is changed. We then use these two types to create a function createDependentCal, which takes a list of UpdatableValue and associates it with a ResetableLazy calculation, that way each time one of the values in the list is changed the calculation is reset.

#light

open Microsoft.FSharp.Control.LazyStatus

 

/// a lazy value that can be reset to force it be recalculated

type ResetableLazy<'a> =

    { mutable status : LazyStatus<'a>; initFunction : unit -> 'a }

 

type ResetableLazy<'a> with

    static member Create(f) : ResetableLazy<'a> = {status=(LazyStatus.Delayed f); initFunction = f}

    member x.IsForced = match x.status with LazyStatus.Value _ -> true | _ -> false

    member x.IsDelayed = match x.status with LazyStatus.Delayed _ -> true | _ -> false

    member x.IsException = match x.status with LazyStatus.Exception _ -> true | _ -> false

    member x.Reset() = x.status <- LazyStatus.Delayed x.initFunction

    member x.Force() =

        match x.status with

        | LazyStatus.Delayed f ->

            x.status <- Exception Undefined;

            try let res = f () in x.status <- LazyStatus.Value res; res

            with e -> x.status <- LazyStatus.Exception(e); rethrow()

        | LazyStatus.Value x -> x

        | LazyStatus.Exception e -> raise e

    member x.Value = x.Force()

 

/// a value that raise a notification when it is changed

type UpdatableValue<'a>(origVal : 'a) = class

    let mutable currentVal = origVal

    let changedTrigger, changedEvent = IEvent.create()   

    member x.Changed

        with get() = changedEvent

    member x.Value

        with get() = currentVal

        and set newVal =

            currentVal <- newVal

            changedTrigger()

end

 

/// a function that assoicated a list of UpdatableValue with a function

/// that that will be recalculated only when members of the list change

let createDependentCal (dependencies : UpdatableValue<_> list)  f =

    let cal = ResetableLazy.Create(f)

    dependencies |> List.iter (fun x -> x.Changed.Add(fun () -> cal.Reset()) )

    cal

 

// the tests ...

 

let v1 = new UpdatableValue<int>(1)

let v2 = new UpdatableValue<int>(2)

 

let add10 =

    createDependentCal

        [v1]

        (fun () ->

            printfn "Executing: v1.Value + 10"

            v1.Value + 10)

let subV1V2 =

    createDependentCal

        [v1; v2]

        (fun () ->

           printfn "Executing: v1.Value - v2.Value"

           v1.Value - v2.Value)

   

let test() =

    for i = 0 to 2 do

      Printf.printfn "iteration %d : result=%d" i (add10.Force())

      Printf.printfn "iteration %d : result=%d" i (subV1V2.Force())

 

test()

Printf.printf "\nChanging v1 ...\n\n"

 

v1.Value <-  10

test()

 

Printf.printf "\nChanging v2 ...\n\n"

v2.Value <-  5

test()

 

read_line()

The obvious limitation to this approach is the list of values must be explicitly associated with the calculation, so it’s possible that programmer could make a mistake and incorrectly use a value that had not been listed as depended.

The second approach removes this limitation by automatically calculating dependencies. It does this because as each item, a value or a calculation, is fetched by a value property. As it is being fetched it pushes itself on to a stack so we can always find which calculation is calling us, allowing us to automatically build up a dependencies list.

#light

 

open System

open System.Collections.Generic

 

/// A node in an evalution tree that can either be a caculation

/// or a value

type Node<'a> =

| Value of 'a

| Calculation of (unit -> 'a)

 

/// A wrapper that allows the values of calculated nodes to be cached

type NodeWrapper<'a> = class

    val stack: Stack<NodeWrapper<'a>>

    val mutable node: Node<'a>

    val mutable calculatedValue: option<'a>

    val depencies: ResizeArray<NodeWrapper<'a>>

    new (stack: Stack<NodeWrapper<'a>>, node: Node<'a>) =

        { stack = stack;

          node = node;

          calculatedValue = None;

          depencies = new ResizeArray<NodeWrapper<'a>>() }

    member x.Value

        with get() =

            // add the caller to the depencies list if one exists

            if x.stack.Count > 0 then

                let caller = x.stack.Peek()

                if not (x.depencies.Exists(fun x -> x = caller)) then

                    x.depencies.Add(caller)

                     

            match x.calculatedValue with

            | None ->

                // push ourselves on to the caller stack so next user knows who we are

                x.stack.Push(x)

                // Evaluate ourself.

                let res = match x.node with Calculation f -> f() | Value x -> x

                // remove ourselves from the call stack

                x.stack.Pop() |> ignore

                // update cache

                x.calculatedValue <- Some(res)

                res

            | Some x -> x

                   

        and set(v) =

            // update the node if its a value otherwise fail

            match v with

            | Value _ ->

                x.node <- v

                x.calculatedValue <- None

                x.depencies |> Seq.iter(fun x -> x.MakeDirty() )

            | Calculation _ -> failwith "can not reset the value of a calculation"

    member x.IsDirty

        with get() =

            match x.calculatedValue with None -> true | _ -> false

    member x.MakeDirty()=

        // set us and our dependcies to dirty

        x.calculatedValue <- None

        x.depencies |> Seq.iter(fun x -> x.MakeDirty() )

 

end

 

/// any node produced by this the same instance of this class will

/// automatically be related

type StateWrapper<'a>() = class

    let stack = new Stack<NodeWrapper<'a>>()

    member x.NewNode(n) =

        new NodeWrapper<'a>(stack, n)

end

 

// the tests ...

 

let stateWrapper = new StateWrapper<int>()

 

let v1 = stateWrapper.NewNode(Value 1)

let v2 = stateWrapper.NewNode(Value 2)

let v3 = stateWrapper.NewNode(Value 3)

let v4 = stateWrapper.NewNode(Value 4)

 

let c1 = stateWrapper.NewNode(Calculation (fun() -> Console.WriteLine "Calculating C1";  v1.Value + v2.Value))

let c2 = stateWrapper.NewNode(Calculation (fun() -> Console.WriteLine "Calculating C2";  v3.Value + v4.Value))

let c3 = stateWrapper.NewNode(Calculation (fun() -> Console.WriteLine "Calculating C3";  c1.Value + c2.Value))

 

Console.WriteLine c1.Value

Console.WriteLine c3.Value

Console.WriteLine c3.Value

Console.WriteLine c3.Value

// Change values and verify.

v1.Value <- Value 10

v2.Value <- Value 20

v3.Value <- Value 30

v4.Value <- Value 40

Console.WriteLine c1.Value

Console.WriteLine c3.Value

Console.WriteLine c3.Value

Console.WriteLine c3.Value

// Set the value of function node, only c3 should recompute.

v3.Value <- Value 666

Console.WriteLine c1.Value

Console.WriteLine c3.Value

Console.WriteLine c3.Value

// Set an input of that function, check that c2, c3 recompute.

v3.Value <- Value 300

Console.WriteLine c3.Value

Console.WriteLine c3.Value

 

We haven’t yet done anyway where near enough testing to root out any nasty corner cases, but I think both techniques show promise.

There are several obvious problems to these approaches but most of these are probably acceptable to some degree. One problem is that neither approach works well with multithreading. If a value is updated by one thread while a calculation is being preformed then a calculation will become dirty and the thread performing the calculation won’t know until it’s too late. However we could probably live with this by group sets of related calculations on a single thread and running other unrelated calculation sets on different threads.

Another problem is that both approaches add quite a bit of overhead to fetching values, so calculations must be sufficiently complicated to justify this over head, but there are probably many situations where this overhead is justifiable, particularly in the financially sector where I work. To ensure this approach is right we really need to design a test that uses a series of increasingly complicated calculations and varying rates of data change then use the two frameworks along with a control group to check there respective performance. Definitely an interesting study but for another blog post ...

 

Bookmark
dotnetkicks+, digg+, reddit+, del.icio.us+, dzone+, facebook+

Print | posted @ Wednesday, August 22, 2007 10:14 AM

Comments on this entry:

Gravatar # re: Recalculating Values Only When Dependencies Change
by Geoffrey Washburn at 10/8/2007 8:08 PM

If you are not already aware of it, you may want to look at Umut Acar's work on selective memoization and self-adjusting computation (http://ttic.uchicago.edu/~umut/papers/index.html).

Your comment:

(Note: all comments are moderated so it may take sometime to appear)

Title:
Name:
Email:
Website:
 
Italic Underline Blockquote Hyperlink
 
 
Please add 4 and 3 and type the answer here: