Rico made some more interesting posts in his parser performance quiz, first producing a parser where the predicates are compiled into an in memory representation and also a parser where he produces IL on the fly. He talks about them here and here respectively.
So I ported both of these parsers to F#; as they demonstrate some nice uses of patterns commonly used in ML, such as take one data structure that is easy to read and transforming it into another than is easy to evaluate. This gives us 6 parsers in total, so wrote little test harness so I could hold a great parser shoot out. I had to alter Rico’s parsers slightly to fit them into my test harness, but the sprit of the code remains very much the same. All this code is can be downloaded here. I want to take a slightly more holistic view of things, so I’m going to take a look at three areas, the size, structure and modularity of the code base, the scalability of the parsers, and their speed.
Size, Structure and Modularity of Code Base
Here are the size related statistics:
Rico’s Parsers
Original 255 lines
Compiling 308 lines
Jitting 273 lines
Rob’s Parsers
Original 56 lines (org.fs 5 lines, lexorg.fs 20 lines, parsorg.fs 31 lines)
Compiling 114 lines (comp.fs 49 lines, lexats.fs 19 lines, parsorg.fs 25 lines, ats.fs 6 lines, facts.fs 15 lines)
Jitting 111 lines (comp.fs 43 lines, lexats.fs 19 lines, parsorg.fs 25 lines, ats.fs 6 lines, facts.fs 15 lines)
In terms of size code base, F# is a clear winner, with a line count saving of between 3-5 times (although admittedly I could do with put a few more comments in my code, but I wouldn’t expect this to have much impact on the overall results).
Structure and modularity is somewhat harder to measure, clearly the F# parsers come in more files, but that’s not necessarily a good thing. What I like about the F# approach is that it allows a programmer to analyse the different stages of parsing separately. The breaking the parser into tokens is taken care of by the lexer, the grammar is defined in the .fy file and abstract syntax tree (ATS) is built from the grammar, leaving the programmer free to think about an abstract representation of the grammar. Also F# discriminating union type is great for making abstract representations of grammar,
Rico’s parser makes no attempt to handle precedence or error recovery, while mine do not do this either, it would be considerable simpler to handle this in a yacc style parser.
It might also be appropriate to note in this section that I did find a bug in Rico’s code. His jitting parser does not correctly handle the not operator (!). But I probably shouldn’t go not about this two much, as a) this is because he used the IL Not op code incorrect, so is a micro bug, not much to do with the programs overall structure, and b) one of the parsers in my original post was pretty buggy. For those of you interested the IL Not op code should only be used for bit wise nots; for operations on Booleans you should load 0 on to the stack then execute a ceq op code (or at least this is the approach the C# compiler takes).
Scalability
In this context I’m defining scalability as the size of the input the parser can handle. Rico’s compiling parser doesn’t do very well here, as it is based on a fix size stack, so the original parser can only input strings with less than 16 facts. I change the stack size to 64 so it could take part in more of my test, but it is very satisfactory having to do this.
Both the C# parsers stack overflow at 20384 facts, which the F# ones can handle okay. I can go on to do further test as at 40768 facts my testing infrastructure stack overflows.
Speed
The first test I had five predicated and each predicate being evaluated 100 000 times. This test should give us an idea of the performance of the evaluation of the predicate, as the compilation of the predicate is a one off cost and spread over 100 000 repetitions should be insignificant.
Number of repetitions: 100000
Number of predicates per repetitions: 5
1 2 4 8 16 32 64
Rico - Original Parser 120 605 1466 3755 7914 15779 32067
Rico - Compiling Parser 129 187 301 559 1064 1936 3768
Rico - Jitting Parser 139 185 328 466 828 1472 2779
Rob - Original Parser 6831 8688 22434 28433 59721 112887 219203
Rob - Compiling Parser 256 353 783 1192 2303 4190 7935
Rob - Jitting Parser 166 204 343 454 909 1493 2821
The jitting parsers are almost neck and neck. This is some what unsurprising as the actual evaluation of predicates contains nearly exactly the same IL code. While Rico’s jitting parser out performs mine, my does sneak a head in a couple of places and the margins are so small it could swing the other way if you ran the test again.
My compiling parser performs a disappointing 2 times slow than Rico’s. I suspect the main causes of this are that I use a growable evaluation stack, use a new instance of the stack for each evaluation, and also each evaluation use a method call to a lambda function. Calling a lambda function in F# is no slower that calling any other CLR method, it is still slow than no method calls at all. I think I could tune away most of these difference, but I won’t for now as these things all offer advantages too, like improved scalability and thread safety.
The second test I ran I used 500 predicates, and evaluated each one twenty times. This is allows us to look at how each parser performs when some more of the parse time is taken into account.
Number of repetitions: 20
Number of predicates per repetitions: 500
1 2 4 8 16 32 64
Rico - Original Parser 8 13 41 75 166 325 727
Rico - Compiling Parser 10 4 9 17 39 60 158
Rico - Jitting Parser 15 142 298 534 1042 2148 4070
Rob - Original Parser 174 197 348 626 1181 2377 4674
Rob - Compiling Parser 27 20 33 59 111 211 434
Rob - Jitting Parser 13 158 323 593 1101 2314 4318
As noted in Rico’s final article, the jitting parsers do not do well here, since the cost of JIT compiler are too high if the method is going to be invoked a small number of times. The F# compiling parser continues to do well reasonably well despite its higher parsing cost, performing between 3 and 4 than the C# compiler. The F# generated parser is slower partly because parsers are not typically used on such small input strings – large stacks are allocated for each of the 1million+ invocations to the parser. It also converts the Unicode strings to ASCII bytes for compatibility with OCaml, which a future distribution of F# will come with both an ASCII and Unicode fslex/fsyacc, so will not have to do this conversion. It will be interesting to rerun these test when this is available and see how much overhead this was adding.
Conclusion
Obviously I’m a bit of an F# enthusiast, but I have tried to present all the pros and cons of each parser here in a fair way. Okay the F# parsers don’t quite cut the mustard in terms of raw speed; they do offer many other advantages such as better scalability and smaller and more understandable code base. I am convinced you could write a parser that would go as fast as the C# one, but I haven’t done as it would mean copying the C# programming style and losing many of the advantages we gain for F#.
I hope you have enjoyed the article and that if you need to write a parser in a .NET language you’ll give F# ago. The code for all the parsers is available here, along with a spread sheet and some charts of the results.