If you take a look under the hood at the new System.Query namespace, you will see that its core is the “Sequence” class, and that this class bears an amazing resemblance to the F# List library located in MLLib.dll. Okay, you have to use your imagination a tiny bit, but if you consider that:
System.Collections.Generic.IEnumerable<T>
Is really not that different from:
Microsoft.FSharp.List<A>
Then you start to see that:
val map: ('a -> 'b) -> 'a list -> 'b list // F# signature
public static List<B> map<A, B>(FastFunc<A, B> f, List<A> x); // C# signature
Looks a lot like:
public static IEnumerable<S> Select<T, S>(IEnumerable<T> source, Func<T, S> selector);
You can also see:
val filter: ('a -> bool) -> 'a list -> 'a list // F# signature
public static List<A> filter<A>(FastFunc<A, bool> f, List<A> x); // C# signature
Looks a lot like:
public static IEnumerable<T> Where<T>(IEnumerable<T> source, Func<T, bool> predicate);
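To make the resemblance concrete, here is a minimal sketch that runs the same map and filter both ways. It assumes the released System.Linq.Enumerable class as a stand-in for the preview “Sequence” class, and the sample list and value names are purely illustrative.

```fsharp
open System
open System.Linq

// Assumption: System.Linq.Enumerable stands in for the preview-era Sequence class.
let numbers = [ 1; 2; 3; 4; 5 ]

// The F# list library versions.
let squaresF = numbers |> List.map (fun x -> x * x)
let evensF = numbers |> List.filter (fun x -> x % 2 = 0)

// The LINQ versions: an F# list is an IEnumerable<T>, so it can be passed directly.
// The lambdas are wrapped in Func delegates, as the C# signatures require.
let squaresL =
    Enumerable.Select(numbers, Func<int, int>(fun x -> x * x)) |> List.ofSeq

let evensL =
    Enumerable.Where(numbers, Func<int, bool>(fun x -> x % 2 = 0)) |> List.ofSeq
```

Both pairs of values hold the same elements; the only real differences are the argument order and the delegate wrapping.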
I could go on. So why should an F# programmer be interested in this library if they already have all this functionality available in MLLib.dll? There are a couple of reasons. Firstly, System.Query adds a few methods that are missing from the F# list library, such as Sum, Min, Max, Average and OfType. While each of these functions would be relatively easy to put together in F#, it’s nice to have them there, tested and ready to go. Perhaps more importantly, there is the sheer number of classes in Framework 2.0 that implement the IEnumerable<T> interface; nearly every collection implements it. This means just about every collection in the .NET Framework is now open to this style of functional programming using System.Query, with no need to convert it to a native F# type.
To investigate this style of functional programming I decided to implement the cute example of querying the methods on the System.String object that Anders Hejlsberg demonstrates in this video. For those that haven’t seen the video, the objective is to display the name of each instance member of String once, along with its number of overloads.
First we define some F# friendly ways to access the LINQ methods. We have a choice here; we can either go for operators or functions.
Operators are an interesting choice because the operands appear on either side of the operator, so we can start with one set of data and easily apply many operations to it in turn. Defining an operator looks like:
let (||) s f = Sequence.Select(s, new Func<_,_>(f))
Functions are perhaps a safer choice, as their names give us a much better description of what’s going on. Defining a function to work with LINQ looks like:
let select s f = Sequence.Select(s, new Func<_,_>(f))
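As a rough, runnable sketch of such wrappers in use, the following assumes the released System.Linq.Enumerable class in place of the preview “Sequence” class. The `where` wrapper, the sample array and the explicit type annotations are illustrative additions, not part of the original demo.

```fsharp
open System
open System.Linq

// Assumption: Enumerable plays the role of the preview Sequence class.
// Both wrappers take the collection first and the lambda second.
let select (s: seq<'a>) (f: 'a -> 'b) = Enumerable.Select(s, Func<'a, 'b>(f))
let where (s: seq<'a>) (f: 'a -> bool) = Enumerable.Where(s, Func<'a, bool>(f))

// Hypothetical sample data.
let words = [| "map"; "filter"; "fold" |]

// Keep the words longer than three characters, then project each to its length.
// The lambdas are annotated explicitly so the sketch does not lean on inference.
let longWordLengths =
    select (where words (fun (w: string) -> w.Length > 3)) (fun (w: string) -> w.Length)
    |> List.ofSeq
```

Any IEnumerable<T>, whether an array, an F# list or a framework collection, can be passed as the first argument.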
The operator version of the query looks like:
let namesByOperator =
(methods
|? (fun m -> not m.IsStatic) // where
> (fun m -> m.Name) // group by
|| (fun m -> m.Key, ! m.Group)) // select and count
< (fun (_, m) -> m) // order by
It is nice and brief, but it’s not too clear what’s going on, especially if the comments were omitted. Also, the placement of the brackets looks a bit random: because the operators all have different precedence, brackets are sometimes necessary to clarify the execution order and sometimes not.
The function version looks like:
let select s f = Sequence.Select(s, new Func<_,_>(f))
let namesByFunction =
(orderBy
(select
(groupBy
(where methods (fun m -> not m.IsStatic))
(fun m -> m.Name))
(fun m -> m.Key, count m.Group))
(fun (_, m) -> m))
This is not so great either. Because of the order in which the functions take their parameters, the lambdas that define each function’s behaviour get further and further from the function they relate to. We can help make things clearer with a bit of code formatting, but this is still not ideal. We could also improve things by defining an intermediate value for each of these operations, but then we’d have to think up three new names, which is also not ideal.
Another option would be to reorder the functions’ parameters. This is shown below:
let namesByFunction2 =
(orderBy2 (fun (_, m) -> m)
(select2 (fun (m : Grouping<_,_>) -> m.Key, count m.Group)
(groupBy2 (fun (m : MethodInfo) -> m.Name)
(where2 (fun (m : MethodInfo) -> not m.IsStatic) methods)
)
)
)
For me the readability of the code is improved by the lambda definitions being close to the functions they belong to. However, if you look at the sample you will notice that most of the lambdas are now explicitly typed. This is because F#’s type inference works left to right, so it can no longer infer the types of the lambdas from the collections they are to be used with.
Now I think we start to see why the VB.NET and C# teams introduced their new query syntax: it looks like an aid to help the programmer stop queries becoming a mass of randomly ordered lambdas.
Although F# is unlikely to go the route of introducing query syntax, the F# team is planning to look at how LINQ integrates with the language, so it will be interesting to see what they come up with.
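One way to keep each lambda next to its function without annotations is to feed the collection in from the left. The sketch below is not taken from the original demo: it assumes the `Seq` module and the `|>` pipeline operator from the current F# core library, and the name `namesByPipeline` is made up for illustration.

```fsharp
open System.Reflection

// All public methods of System.String, static and instance alike.
let methods = typeof<string>.GetMethods()

// The collection flows in from the left, so by the time each lambda is checked
// its argument type is already known and no annotations are needed.
let namesByPipeline =
    methods
    |> Seq.filter (fun m -> not m.IsStatic)                   // where
    |> Seq.groupBy (fun m -> m.Name)                          // group by
    |> Seq.map (fun (name, group) -> name, Seq.length group)  // select and count
    |> Seq.sortBy snd                                         // order by
    |> Seq.toList

// Print each instance method name once, with its number of overloads.
for (name, count) in namesByPipeline do
    printfn "%s: %d" name count
```

Here the left-to-right inference that forced the annotations works in our favour, since the type of `methods` is fixed before the first lambda is reached.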
The full source for this demo is available here.
Thanks to Don Syme and Erik Meijer, who both helped me with details of this post.