Interesting performance Consequences of Seq.map

It’s fairly well know that Sequences or “seq”, the short hand for IEnumerable, are lazy. This has some interesting performance consequence I had not considered until recently. When we execute a line like:

let lotsOfInts = Seq.map (fun x -> x + 1) (seq { 1 .. 1000000 })

The command executes almost instantaneously, despite the fact we’re creating a list of a million integers (Real: 00:00:00.001, CPU: 00:00:00.000 on my PC).  This is because everything is lazy; no actual work is done till the list is enumerated. Executing a command like:

let lotsOfInts' = List.of_seq lotsOfInts

Takes a significant amount of time (Real: 00:00:01.107, CPU: 00:00:01.076 for me), because we turn the lazy collection into a concrete collection. This is often pretty much what we want, not to do any work until we need to enumerate the collection. The thing to beware of is that we do this work every time we enumerate the collection. This may be desirable for some times of collection, for example say something that reads from a file of data base that’s like to change between enumerations. But there’s also a class of problem were this is highly undesirable. Consider the following code:

let rec loop seq iteration =

    let timer = new System.Diagnostics.Stopwatch()

    timer.Start()

    let seq' = Seq.map (fun x -> x + 1) seq

    let list = List.of_seq seq'

    printfn "Interation: %i time: %i" iteration timer.ElapsedMilliseconds

    if iteration < 10 then

        loop seq' (iteration + 1)

 

loop (seq { 1 .. 1000000 }) 0

Each iteration will take longer than the last, as each time “List.of_seq seq” is execute the sequence is recursively reiterated. I get the following results on my computer:

Interation: 0 time: 953

Interation: 1 time: 919

Interation: 2 time: 1151

Interation: 3 time: 1857

Interation: 4 time: 1528

Interation: 5 time: 2041

Interation: 6 time: 1820

Interation: 7 time: 2341

Interation: 8 time: 2300

Interation: 9 time: 2747

Interation: 10 time: 2673

 It’s trivial to fix this problem, simply use the “Seq.cache” function:

let rec loop' seq iteration =

    let timer = new System.Diagnostics.Stopwatch()

    timer.Start()

    let seq' = Seq.cache (Seq.map (fun x -> x + 1) seq)

    let list = List.of_seq seq

    printfn "Interation: %i time: %i" iteration timer.ElapsedMilliseconds

    if iteration < 10 then

        loop' (seq' :> seq<_>) (iteration + 1)

 

loop' (seq { 1 .. 1000000 }) 0

Which gives the following results:

Interation: 0 time: 1014

Interation: 1 time: 2337

Interation: 2 time: 2303

Interation: 3 time: 2294

Interation: 4 time: 2506

Interation: 5 time: 2343

Interation: 6 time: 2449

Interation: 7 time: 2075

Interation: 8 time: 2242

Interation: 9 time: 2370

Interation: 10 time: 2107

(Of course in this case it’s probably easier to password the concrete list we’ve already created, but in most places you’ll want to use “Seq.cache”)

Progress on the “DataTools” Project

I note on in a previous blog post that I’ve started creating an open source project for manipulating data. For the moment it’s mainly based around the idea of “Collective Intelligence”, but I hope to draw on influences from other sources as it evolves. I’ve reorganised the source so it’s a bit clearer and added some other concepts, notably a tool for accessing books from “Project Guttenberg” and the work I did on generic algorithms.

For the moment it’s licensed under GPLv2. I’m not a licensing expert but I believe this gives protection to the source while allowing it to remain open. If you need it under different licensing terms, contact me.

 Subscribe in a reader

Links

CVMy CV
stackoverflowMy Stack Overflow CV
Twitter Follow me on Twitter
FaceBook View my Facebook
LinkedIn View my LinkedIn Profile
Viadeo Viadeo Profile (Fran�ais)

Conferences/Workshops

Robert Pickering:Robert Pickering's Beginning F# Workshop,  Robert Pickering's Beginning F# Workshop
2 DAY COURSE. Featuring Robert Pickering
London, Monday, May 10th
Progressive .NET Tutorials, Progressive .NET Tutorials
CONFERENCE (3 DAYS)
London, Wednesday, May 12th BOOK NOW!

Badges


Progressive .NET Tutorials 2009

Disclaimer

The views expressed on this weblog are mine and do not necessarily reflect the views of my employer.

All postings are provided "AS IS" with no warranties, and confer no rights.

www.flickr.com
This is a Flickr badge showing public photos and videos from Robert Pickering. Make your own badge here.