It’s fairly well know that Sequences or “seq”, the short hand for IEnumerable, are lazy. This has some interesting performance consequence I had not considered until recently. When we execute a line like:
let lotsOfInts = Seq.map (fun x -> x + 1) (seq { 1 .. 1000000 })
The command executes almost instantaneously, despite the fact we’re creating a list of a million integers (Real: 00:00:00.001, CPU: 00:00:00.000 on my PC). This is because everything is lazy; no actual work is done till the list is enumerated. Executing a command like:
let lotsOfInts' = List.of_seq lotsOfInts
Takes a significant amount of time (Real: 00:00:01.107, CPU: 00:00:01.076 for me), because we turn the lazy collection into a concrete collection. This is often pretty much what we want, not to do any work until we need to enumerate the collection. The thing to beware of is that we do this work every time we enumerate the collection. This may be desirable for some times of collection, for example say something that reads from a file of data base that’s like to change between enumerations. But there’s also a class of problem were this is highly undesirable. Consider the following code:
let rec loop seq iteration =
let timer = new System.Diagnostics.Stopwatch()
timer.Start()
let seq' = Seq.map (fun x -> x + 1) seq
let list = List.of_seq seq'
printfn "Interation: %i time: %i" iteration timer.ElapsedMilliseconds
if iteration < 10 then
loop seq' (iteration + 1)
loop (seq { 1 .. 1000000 }) 0
Each iteration will take longer than the last, as each time “List.of_seq seq” is execute the sequence is recursively reiterated. I get the following results on my computer:
Interation: 0 time: 953
Interation: 1 time: 919
Interation: 2 time: 1151
Interation: 3 time: 1857
Interation: 4 time: 1528
Interation: 5 time: 2041
Interation: 6 time: 1820
Interation: 7 time: 2341
Interation: 8 time: 2300
Interation: 9 time: 2747
Interation: 10 time: 2673
It’s trivial to fix this problem, simply use the “Seq.cache” function:
let rec loop' seq iteration =
let timer = new System.Diagnostics.Stopwatch()
timer.Start()
let seq' = Seq.cache (Seq.map (fun x -> x + 1) seq)
let list = List.of_seq seq
printfn "Interation: %i time: %i" iteration timer.ElapsedMilliseconds
if iteration < 10 then
loop' (seq' :> seq<_>) (iteration + 1)
loop' (seq { 1 .. 1000000 }) 0
Which gives the following results:
Interation: 0 time: 1014
Interation: 1 time: 2337
Interation: 2 time: 2303
Interation: 3 time: 2294
Interation: 4 time: 2506
Interation: 5 time: 2343
Interation: 6 time: 2449
Interation: 7 time: 2075
Interation: 8 time: 2242
Interation: 9 time: 2370
Interation: 10 time: 2107
(Of course in this case it’s probably easier to password the concrete list we’ve already created, but in most places you’ll want to use “Seq.cache”)
Feedback:
Feedback was imported from my only blog engine, it's no longer possible to post feedback here.
re: Interesting performance Consequences of Seq.map - kot
typo?
"Executing a command like:
<< what? >>
Takes a...."
re: Interesting performance Consequences of Seq.map - Robert Pickering
Yes, rather unfortunate typo (well copy & paste error), should have read:
let lotsOfInts' = List.of_seq lotsOfInts
Fixed now.
re: Interesting performance Consequences of Seq.map - Amanda Laucher
I think another copy paste error. Typing blogs can suck :) (Blog comments also!)
let rec loop' seq iteration =
let timer = new System.Diagnostics.Stopwatch()
timer.Start()
let seq' = Seq.cache (Seq.map (fun x -> x + 1) seq)
let list = List.of_seq seq'
printfn "Interation: %i time: %i" iteration timer.ElapsedMilliseconds
if iteration < 10 then
loop' seq' (iteration + 1)
loop' (seq { 1 .. 1000000 }) 0
re: Interesting performance Consequences of Seq.map - Robert Pickering
Thanks for the correction Amanda, glad some ones paying attention! Add the missing prime in line now.