Videos to Watch

This blog has been quiet lately, though it was never the most active in the world. The good news is that there are lots of other good F# resources on the web at the moment. In particular, Lang.NET was a great source of F# videos. I wasn’t there myself, but I’ve spoken with a few people who were, and they told me there was a lot of interest in F#.

A couple of videos I really liked were:

Luke Hoban on F# Productization: This was an interesting talk, not because it went into depth on F# the language, but because Luke gave a peek behind the scenes at how you run a major F# project: the productization.

http://www.langnetsymposium.com/2009/talks/25-LukeHoban-FSharpProductization.html

Amanda Laucher on F# Concurrency: This is a very interesting talk from the charismatic Amanda “Pandamonial” Laucher. Amanda presents a case study of a real-world project she’s been working on in the insurance domain. Her mission was to use the concurrency features of F# to speed up the company’s risk calculations, and the results she achieved are quite amazing. She presents what I see as very compelling reasons for using F#:

http://www.langnetsymposium.com/2009/talks/21-AmanderLauter-FSharpConcurrency.html

Of course I can’t not mention the “Language Oriented Programming in F#” talk by Roger Castillo at Lang.net’s sister event “DslDevCon”. I was briefly involved with the organization of this talk, though to be fair it was Chance Coble and Roger who did the hard work. I was hoping to make it to the event to co-present the talk, but alas, Paris is a long way from Seattle:

http://msdn.microsoft.com/en-us/oslo/dd727739.aspx

Another noteworthy video is Luke Hoban on F#, in which he makes some very interesting points about how programming in F# differs from programming in C#. I like this video because it builds on what Brian McNamara said in this blog post.

Phil Trelford did a nice session for the skillsmatter.com guys and gals, and has another planned.

There’s also this series of 4 webcasts from Tomáš Petříček, which to be honest I haven’t had a chance to watch yet, but knowing Tomáš’s work I’m sure they’ll be very high quality.

Past Speaking engagements

Back in February this year I spoke at TechDays Paris 2009. I’ve always enjoyed TechDays Paris, so a big thanks to Eric Mittelette and his team for inviting me to speak. Also, many thanks to Julien Laugel, who helped me review the presentation slides and also shared some of his experiences using F# with the audience. A screencast of this presentation is now available. The slides for this presentation are available on the slideshare.net site and the code from the slides is available here, except for the collective intelligence part, which is available in its github project.

(aside: I’ve not been actively working on the CI project for a while now, but will pick it up sooner or later. I’ve also been talking to Matthew Podwysocki, Joel Pobar, and a few others who have similar interests in F# and CI. If you’re interested in contributing, please contact me)

I was also lucky enough to be invited by Scott Bellware to give a tutorial at the “Progressive .NET Tutorials” in London, in partnership with skillsmatter.com. The podcast of this tutorial is now available online, but it’s 3 hours long so not really for one sitting! The code from the tutorials is available here.

I currently have no concrete future speaking engagements, but have a few things in the offing. Feel free to contact me if you’d like me to present or run a tutorial for you.

F# in Beta1 of Visual Studio 2010

F# is now available as part of the beta1 of Visual Studio 2010. I felt a strange sense of pride when I could select F# as the default language in Visual Studio:

Find a summary of where to get it here: http://research.microsoft.com/en-us/um/cambridge/projects/fsharp/release.aspx

“Beginning F#” is “Foundations of F# 2nd Edition”

A few people have noticed that a new book “Beginning F#” by me is available for pre-order on Amazon. I wanted to make it clear that this is the new title for the second edition of “Foundations of F#”. The publisher wanted the title change to make it clear that this is a complementary title to “Expert F#” rather than one competing with it. It also reflects that during the rewrite I’m focusing a lot on making the book more accessible to people with no functional programming experience.

Interesting performance Consequences of Seq.map

It’s fairly well known that sequences or “seq”, the shorthand for IEnumerable, are lazy. This has some interesting performance consequences I had not considered until recently. When we execute a line like:

let lotsOfInts = Seq.map (fun x -> x + 1) (seq { 1 .. 1000000 })

The command executes almost instantaneously, despite the fact that we appear to be creating a sequence of a million integers (Real: 00:00:00.001, CPU: 00:00:00.000 on my PC). This is because everything is lazy; no actual work is done until the sequence is enumerated. Executing a command like:

let lotsOfInts' = List.of_seq lotsOfInts

Takes a significant amount of time (Real: 00:00:01.107, CPU: 00:00:01.076 for me), because we turn the lazy collection into a concrete collection. This is often exactly what we want: no work is done until we need to enumerate the collection. The thing to beware of is that this work is redone every time we enumerate the collection. This may be desirable for some types of collection, for example one that reads from a file or database that’s likely to change between enumerations. But there’s also a class of problem where this is highly undesirable. Consider the following code:

let rec loop seq iteration =
    let timer = new System.Diagnostics.Stopwatch()
    timer.Start()
    // Wrap the incoming sequence in one more lazy Seq.map layer
    let seq' = Seq.map (fun x -> x + 1) seq
    // Force enumeration; this walks every layer built so far
    let list = List.of_seq seq'
    printfn "Interation: %i time: %i" iteration timer.ElapsedMilliseconds
    if iteration < 10 then
        loop seq' (iteration + 1)

loop (seq { 1 .. 1000000 }) 0

Each iteration takes longer than the last, because each new “seq'” wraps the previous one in another “Seq.map” layer, so every time “List.of_seq seq'” is executed the whole chain of sequences is enumerated again from the start. I get the following results on my computer:

Interation: 0 time: 953
Interation: 1 time: 919
Interation: 2 time: 1151
Interation: 3 time: 1857
Interation: 4 time: 1528
Interation: 5 time: 2041
Interation: 6 time: 1820
Interation: 7 time: 2341
Interation: 8 time: 2300
Interation: 9 time: 2747
Interation: 10 time: 2673
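To see why the cost keeps growing without relying on timings, here is a minimal sketch that counts how often the mapping function runs. It is not the code used for the measurements above; the names “calls”, “addOne”, “wrap” and “callsFor” are made up for illustration, and it uses “List.ofSeq”, the current name of “List.of_seq”.

```fsharp
// Counter for how many times the mapping function has been invoked.
let calls = ref 0

// A mapping function with a visible side effect (it bumps the counter).
let addOne x =
    calls.Value <- calls.Value + 1
    x + 1

// Wrap a source sequence in `depth` nested, lazy Seq.map layers.
let rec wrap depth (source: seq<int>) =
    if depth = 0 then source
    else wrap (depth - 1) (Seq.map addOne source)

// Enumerate a sequence with `depth` layers once and report the calls made.
let callsFor depth =
    calls.Value <- 0
    wrap depth (seq { 1 .. 1000 }) |> List.ofSeq |> ignore
    calls.Value

let oneLayer = callsFor 1
let threeLayers = callsFor 3
printfn "1 layer: %d calls, 3 layers: %d calls" oneLayer threeLayers
```

With 1000 elements, one layer costs 1000 calls and three layers cost 3000: every extra layer of “Seq.map” is paid for again on each enumeration.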

It’s trivial to fix this problem: simply use the “Seq.cache” function, which remembers each element the first time it is computed:

let rec loop' seq iteration =
    let timer = new System.Diagnostics.Stopwatch()
    timer.Start()
    // Cache the mapped sequence so its elements are only computed once
    let seq' = Seq.cache (Seq.map (fun x -> x + 1) seq)
    // Enumerate the incoming sequence (already cached by the previous iteration)
    let list = List.of_seq seq
    printfn "Interation: %i time: %i" iteration timer.ElapsedMilliseconds
    if iteration < 10 then
        loop' (seq' :> seq<_>) (iteration + 1)

loop' (seq { 1 .. 1000000 }) 0

Which gives the following results:

Interation: 0 time: 1014
Interation: 1 time: 2337
Interation: 2 time: 2303
Interation: 3 time: 2294
Interation: 4 time: 2506
Interation: 5 time: 2343
Interation: 6 time: 2449
Interation: 7 time: 2075
Interation: 8 time: 2242
Interation: 9 time: 2370
Interation: 10 time: 2107

(Of course in this case it’s probably easier to pass forward the concrete list we’ve already created, but in most places you’ll want to use “Seq.cache”)
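The effect of “Seq.cache” can also be checked by counting evaluations instead of timing them. This is only a small sketch with made-up names (“evaluations”, “expensive”), again using the current “List.ofSeq” spelling:

```fsharp
// Counts how often the (pretend) expensive function is evaluated.
let evaluations = ref 0

let expensive x =
    evaluations.Value <- evaluations.Value + 1
    x * x

let source = seq { 1 .. 5 }

// Plain lazy sequence: the function runs again on every enumeration.
let uncached = Seq.map expensive source
uncached |> List.ofSeq |> ignore
uncached |> List.ofSeq |> ignore
let uncachedEvaluations = evaluations.Value

// Cached sequence: elements are computed on first use and then reused.
evaluations.Value <- 0
let cached = Seq.cache (Seq.map expensive source)
let first = List.ofSeq cached
let second = List.ofSeq cached
let cachedEvaluations = evaluations.Value

printfn "uncached: %d evaluations, cached: %d evaluations"
    uncachedEvaluations cachedEvaluations
```

Enumerating the uncached sequence twice evaluates the function 10 times, while the cached one needs only 5 evaluations for both enumerations. The trade-off is that the cache holds on to every element it has produced.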

Progress on the “DataTools” Project

I noted in a previous blog post that I’ve started creating an open source project for manipulating data. For the moment it’s mainly based around the idea of “Collective Intelligence”, but I hope to draw on influences from other sources as it evolves. I’ve reorganised the source so it’s a bit clearer and added some other concepts, notably a tool for accessing books from “Project Gutenberg” and the work I did on generic algorithms.

For the moment it’s licensed under GPLv2. I’m not a licensing expert but I believe this gives protection to the source while allowing it to remain open. If you need it under different licensing terms, contact me.

Collective Intelligence and the Guardian Data-Store

I’ve been interested in collective intelligence and machine learning for a while now. These two related fields centre around using statistical tools on large sets of data to make measurements and predictions. So when the UK’s Guardian newspaper announced their “Data-store”, a collection of data sets open to the public, I felt it was time to apply some of what I’ve learned to the data they were offering.

I chose to apply hierarchical clustering to the data on world health. The idea of hierarchical clustering is to measure how similar data sets are, then pair off the most similar data sets to build a binary tree that will reveal groups of similar data. I used the Pearson correlation to compare the data sets, and the resulting data is drawn in a dendrogram, a way of showing the distances between the various clusters that emerge from our clustering algorithm.

The code I’ve used is available on github.com; it’s packaged in an F# project called gdata.fsproj. For a direct link to the project click here. (There’s also a demonstration of hierarchical clustering with word counts from blogs from my TechDays Paris 2009 talk).

Anyway, I’m not going to dig too deeply into the code, at least for this post, so let’s have a look at the results. First I clustered by country using the following statistics to form my vectors:
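For readers who want a feel for the technique without opening the project, here is a rough, self-contained sketch of hierarchical clustering using a Pearson-based distance. It is not the code from gdata.fsproj: the “Cluster” type, the choice to average the vectors of merged clusters, and the four made-up data sets “A” to “D” are all assumptions for illustration.

```fsharp
// Pearson correlation coefficient between two equal-length vectors.
// Note: a constant vector has zero variance, which would give NaN here.
let pearson (xs: float[]) (ys: float[]) =
    let meanX = Array.average xs
    let meanY = Array.average ys
    let cov = Array.fold2 (fun acc x y -> acc + (x - meanX) * (y - meanY)) 0.0 xs ys
    let varX = xs |> Array.sumBy (fun x -> (x - meanX) ** 2.0)
    let varY = ys |> Array.sumBy (fun y -> (y - meanY) ** 2.0)
    cov / sqrt (varX * varY)

// Distance is 0 for perfectly correlated vectors, 2 for anti-correlated ones.
let distance xs ys = 1.0 - pearson xs ys

// A binary tree of clusters; every node carries a representative vector.
type Cluster =
    | Leaf of string * float[]
    | Node of Cluster * Cluster * float[]

let vectorOf = function
    | Leaf (_, v) -> v
    | Node (_, _, v) -> v

// Repeatedly merge the two closest clusters until only one tree remains.
let rec cluster (clusters: Cluster list) =
    match clusters with
    | [] -> failwith "nothing to cluster"
    | [ single ] -> single
    | _ ->
        let indexed = List.indexed clusters
        let (i, a), (j, b) =
            [ for (i, a) in indexed do
                for (j, b) in indexed do
                    if i < j then yield ((i, a), (j, b)) ]
            |> List.minBy (fun ((_, a), (_, b)) -> distance (vectorOf a) (vectorOf b))
        // Represent the merged cluster by the element-wise average (an assumption).
        let merged = Array.map2 (fun x y -> (x + y) / 2.0) (vectorOf a) (vectorOf b)
        let rest =
            indexed
            |> List.filter (fun (k, _) -> k <> i && k <> j)
            |> List.map snd
        cluster (Node (a, b, merged) :: rest)

// Names of all leaves below a cluster.
let rec names = function
    | Leaf (n, _) -> [ n ]
    | Node (l, r, _) -> names l @ names r

// Made-up data: A and B rise together, C and D fall together.
let tree =
    cluster
        [ Leaf ("A", [| 1.0; 2.0; 3.0; 4.0 |])
          Leaf ("B", [| 2.0; 4.0; 6.0; 8.5 |])
          Leaf ("C", [| 4.0; 3.0; 2.0; 1.0 |])
          Leaf ("D", [| 8.0; 6.0; 4.5; 2.0 |]) ]

match tree with
| Node (l, r, _) -> printfn "%A | %A" (names l) (names r)
| Leaf (n, _) -> printfn "%s" n
```

On this toy input the tree groups A with B and C with D, which is the kind of structure a dendrogram then makes visible.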

Hospital beds per 1000
Nursing and Midwifery Personnel per 1000
One-year-olds Immunised with diphtheria tetanus toxoid and pertussis (DTP)
One-year-olds Immunised with hepatitis B
One-year-olds Immunised with Hib (Hib3) vaccine
Adolescent fertility rate (%)
Births attended by skilled health personnel (%)
Infant mortality rate (per 1 000 live births) both sexes
Maternal mortality ratio (per 100 000 live births)
Neonatal mortality rate (per 1 000 live births)
Life expectancy at birth (years) both sexes
Life expectancy at birth (years) female
Life expectancy at birth (years) male
Deaths among children under five years of age due to HIV/AIDS (%)
Per capita recorded alcohol consumption (litres of pure alcohol) among adults
Population with sustainable access to improved drinking water sources (%) total
Population with sustainable access to improved sanitation (%) total.

The statistics were chosen mainly because they were the most complete; it is only possible to compare countries using this technique if all statistics are available. The resulting dendrogram can be seen below:

 

There are no great surprises in the stats. There appear to be two distinct clusters, one of poorer countries towards the bottom of the diagram and one of richer countries towards the top, with the first-world countries located towards the top of this cluster (absolute position doesn’t matter much in the diagram; what matters is who you’re close to). There are perhaps a few surprises: maybe we wouldn’t have expected to find Canada quite so close to Ukraine, or the Czech Republic so close to Germany. It may be worth going back to the underlying statistics to find out why this is.

Perhaps a more interesting analysis is to reverse the matrix so we are now comparing which conditions are related to each other:

Again, the diagram shows some obvious relations. Male and female life expectancies were always going to be statistically similar to overall life expectancy, but it does appear that life expectancy is closely related to infant mortality rates. These in turn are closely correlated with births attended by medical professionals and with access to clean water and sanitation. While this is fairly logical, I think it’s good that we can show, statistically speaking at least, that access to clean water and sanitation is closely linked to lower infant mortality and longer life expectancy.

While these first steps in analysing the Guardian data didn’t perhaps turn up anything we didn’t already know, I feel it’s shown that if you spend a bit of time working with publicly available data you can start to find interesting patterns. I shall definitely be looking at how I can further these experiments.
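“Reversing the matrix” simply means transposing it, so that each vector describes one statistic across all countries rather than one country across all statistics. A minimal sketch, using invented placeholder countries and numbers rather than the Guardian data:

```fsharp
// Hypothetical mini data set: one row per country, one column per statistic.
let countries = [| "X"; "Y"; "Z" |]
let statistics = [| "Life expectancy"; "Infant mortality" |]

let byCountry =
    [| [| 80.0; 4.0 |]
       [| 75.0; 10.0 |]
       [| 60.0; 50.0 |] |]

// Transpose a non-empty rectangular matrix: columns become rows.
let transpose (rows: float[][]) =
    let columnCount = rows.[0].Length
    Array.init columnCount (fun c -> rows |> Array.map (fun row -> row.[c]))

// One vector per statistic, measured across all countries.
let byStatistic = transpose byCountry

// Pair each statistic's name with its vector, ready for clustering.
let labelled = Array.zip statistics byStatistic
for (name, values) in labelled do
    printfn "%s: %A" name values
```

The same clustering routine can then be run on the rows of the transposed matrix, with the statistic names as labels.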

Future Speaking Engagement – F# Tutorial at the Progressive .NET Tutorials, May 11-13th, London

I will be giving a half-day F# tutorial at the “Progressive .NET Tutorials” organised by Skills Matter. This will be an excellent 3-day event with 2 tracks featuring half-day and full-day tutorials by Gojko Adzic, David Laribee, Hammet, Ian Cooper, Mike Hadlow, Scott Bellware and Sebastien Lambla.

I will be giving my half-day tutorial on Wednesday May 13th (the last day of the event). I will be presenting 'F# Tutorial', which aims to give delegates the building blocks for using F# productively and to start having fun with it.

For the full programme and description of my tutorial, and all other Progressive .NET tutorials, check out: http://progressive-dotnet.com

Special Community Discount: Book on or before March 31st and pay just £500!

Skills Matter has given me a promotion code that will entitle you to a substantial discount off the tutorial fees. Simply book on or before March 31st, quote SM1368-622459-33L (in the Promo Code field) and pay just £500 (normal price £1000). The offer is valid until March 31st only, and tickets are going fast, so if you would like to secure a place and claim your discount, you’d better get a wriggle on.

The code to use is: *SM1368-622459-33L* and must be entered in the box provided when booking online at https://skillsmatter.com/register-online/conf/280

Full details of the event can be found at http://progressive-dotnet.com

ALTi

(Sorry I’ve been a bit quiet recently; this is the first of several posts I’ll be making this morning)

I decided a little while ago that I’d like to change direction in my career and go back to consulting, and after interviewing around a bit I decided to join ALTi. It was my first day on Monday, and so far I’m enjoying my first week, although obviously I’m just getting settled in. The thing I like most about the company so far is that they seem quite open to suggestions and seem willing to let you develop your career in the direction you want. I’ll be predominantly working on .NET projects in their .NET practice, so I’m interested in finding any projects with an F# slant out there (although F# won’t be my exclusive focus). I’m also hoping to develop the training and speaking side of my career. So if you have some F# work or are interested in having an F# presentation or tutorial, do not hesitate to drop me a line: Printf.sprintf "%s@%s.%s" "robert" "strangelights" "com"

Foundations of F# - Second Edition

I’m very pleased to announce I’ve started working on a second edition of Foundations of F#. The aim is to document the language in the form it will be in when it is released with Visual Studio 2010. This will be a challenge since it’s not yet known exactly what features will be in the final release.

So, why a second edition? It’s true that the original book is only about 2 years old, but that’s quite a long time in the IT industry these days. Much has changed in the F# landscape since the original was written, and much in the .NET platform has evolved too. When the original was written, F# was still primarily a research project, driven entirely by Microsoft Research, with a relatively small user community and even fewer commercial users. Now F# is being co-developed into a product by Microsoft Corporate in Seattle and Microsoft Research in Cambridge. While the community remains relatively small compared to those of C++, C# and Java, the community site http://cs.hubfs.net has grown enormously since the first edition’s release, and questions regularly pop up on http://stackoverflow.com too. The language has also attracted some major commercial users: Credit Suisse recently announced they intended to develop their quantitative models that are deployed on the .NET platform in F#, and other commercial users include Coherent PDF and Flying Frog Consultancy. My own opinions about programming have also evolved over the past two years; I’m now a professional functional programmer, so I have much more experience of functional programming and want to share that with you.

What are my aims for the second edition? I believe the first edition was generally well received. I have certainly received a lot of mail from readers, and the vast majority of it has been very positive. The reviews on amazon.com are mostly positive too; however, there are some criticisms and I intend to address these in the second edition. Namely, the code will be more accurate, and the book will be more focused on functional style and give better advice about building applications in F#. The errors in the code generally arose from the fact that the code was written in Visual Studio and then manually copied and pasted into Word. This made it difficult to recheck the code’s accuracy as the version of F# changed, and it also led to the temptation to make small edits to the code without checking them. To address the code quality issue, all code is now inserted via a script and all the code samples will be available on codeplex, meaning they’re more easily available and anyone interested can check them and send me comments. The codeplex project is: http://www.codeplex.com/FOFS. To address the functional programming issues, I’ll be doing a major rewrite of the functional programming chapter to put more emphasis on good functional style, and I’ll also be adding a chapter called “Anatomy of an F# Application” to try to give some guidance on how to build an application in F#. I also want to document all the new features of F#: Active Patterns, Workflows and Units of Measure.

Concretely, how will the book change? I’m aiming for evolution rather than revolution: each chapter will be carefully reviewed and updated, and there will be quite a lot of new material too. As I’ve said, the chapter that will see the biggest changes is chapter 3. Most of the original examples in this chapter will probably stay, but it will be better organised to give more emphasis to functional style, and there’ll be some new material too. There’ll be three new chapters: “Concurrency”, “Parsing Text” and “Anatomy of an F# Application”. I didn’t tackle “Concurrency” in the first edition because, to be honest, I didn’t know that much about concurrency, and I also felt F# didn’t offer that many advantages in concurrent programming. In the past two years I’ve learnt a lot more about concurrency (although there’s still more to know), F# has gained new and interesting features for tackling concurrency, and it’s now a more important topic than ever, so it’ll definitely be in this book. The “Parsing Text” chapter is being added because I want to grow the “Language Oriented Programming” chapter. I thought the original chapter was good, but I want to put more emphasis on using the technique to tackle real-world problems, so the chapter will grow and the sections concerned purely with parsing text will be split off. I also want to cover techniques other than fslex/fsyacc for parsing text, such as fsparses and mgrammar. “Anatomy of an F# Application” will address the need for more advice on how to build complete applications in F#.

Well, that’s it for now. Feel free to send comments and suggestions to the usual address and check out the examples at codeplex: http://www.codeplex.com/FOFS


Disclaimer

The views expressed on this weblog are mine and do not necessarily reflect the views of my employer.

All postings are provided "AS IS" with no warranties, and confer no rights.

www.flickr.com
This is a Flickr badge showing public photos and videos from Robert Pickering. Make your own badge here.