It’s fairly well know that Sequences or “seq”, the short hand for IEnumerable, are lazy. This has some interesting performance consequence I had not considered until recently. When we execute a line like:

let lotsOfInts = Seq.map (fun x -> x + 1) (seq { 1 .. 1000000 })

The command executes almost instantaneously, despite the fact we’re creating a list of a million integers (Real: 00:00:00.001, CPU: 00:00:00.000 on my PC).  This is because everything is lazy; no actual work is done till the list is enumerated. Executing a command like:

let lotsOfInts' = List.of_seq lotsOfInts

Takes a significant amount of time (Real: 00:00:01.107, CPU: 00:00:01.076 for me), because we turn the lazy collection into a concrete collection. This is often pretty much what we want, not to do any work until we need to enumerate the collection. The thing to beware of is that we do this work every time we enumerate the collection. This may be desirable for some times of collection, for example say something that reads from a file of data base that’s like to change between enumerations. But there’s also a class of problem were this is highly undesirable. Consider the following code:

let rec loop seq iteration =

    let timer = new System.Diagnostics.Stopwatch()

    timer.Start()

    let seq' = Seq.map (fun x -> x + 1) seq

    let list = List.of_seq seq'

    printfn "Interation: %i time: %i" iteration timer.ElapsedMilliseconds

    if iteration < 10 then

        loop seq' (iteration + 1)

 

loop (seq { 1 .. 1000000 }) 0

Each iteration will take longer than the last, as each time “List.of_seq seq” is execute the sequence is recursively reiterated. I get the following results on my computer:

Interation: 0 time: 953

Interation: 1 time: 919

Interation: 2 time: 1151

Interation: 3 time: 1857

Interation: 4 time: 1528

Interation: 5 time: 2041

Interation: 6 time: 1820

Interation: 7 time: 2341

Interation: 8 time: 2300

Interation: 9 time: 2747

Interation: 10 time: 2673

 It’s trivial to fix this problem, simply use the “Seq.cache” function:

let rec loop' seq iteration =

    let timer = new System.Diagnostics.Stopwatch()

    timer.Start()

    let seq' = Seq.cache (Seq.map (fun x -> x + 1) seq)

    let list = List.of_seq seq

    printfn "Interation: %i time: %i" iteration timer.ElapsedMilliseconds

    if iteration < 10 then

        loop' (seq' :> seq<_>) (iteration + 1)

 

loop' (seq { 1 .. 1000000 }) 0

Which gives the following results:

Interation: 0 time: 1014

Interation: 1 time: 2337

Interation: 2 time: 2303

Interation: 3 time: 2294

Interation: 4 time: 2506

Interation: 5 time: 2343

Interation: 6 time: 2449

Interation: 7 time: 2075

Interation: 8 time: 2242

Interation: 9 time: 2370

Interation: 10 time: 2107

(Of course in this case it’s probably easier to password the concrete list we’ve already created, but in most places you’ll want to use “Seq.cache”)

Feedback:

Feedback was imported from my only blog engine, it's no longer possible to post feedback here.

re: Interesting performance Consequences of Seq.map - kot

typo?
"Executing a command like:

<< what? >>

Takes a...."

re: Interesting performance Consequences of Seq.map - Robert Pickering

Yes, rather unfortunate typo (well copy & paste error), should have read:

let lotsOfInts' = List.of_seq lotsOfInts

Fixed now.

re: Interesting performance Consequences of Seq.map - Amanda Laucher

I think another copy paste error. Typing blogs can suck :) (Blog comments also!)

let rec loop' seq iteration =
let timer = new System.Diagnostics.Stopwatch()
timer.Start()
let seq' = Seq.cache (Seq.map (fun x -> x + 1) seq)
let list = List.of_seq seq'
printfn "Interation: %i time: %i" iteration timer.ElapsedMilliseconds
if iteration < 10 then
loop' seq' (iteration + 1)

loop' (seq { 1 .. 1000000 }) 0

re: Interesting performance Consequences of Seq.map - Robert Pickering

Thanks for the correction Amanda, glad some ones paying attention! Add the missing prime in line now.