A quick and dirty implementation of Excel NORMINV in F#

A couple of weeks ago I posted an example implementation of the Excel NORMINV function in C#. There I mentioned that what I actually needed was an F# version, and that I used C# as a stepping stone when moving away from the C++ version I originally found. Well, here is my attempt at implementing NORMINV in F#:

let probit mu sigma p =
    if p <= 0. || p >= 1. then failwith "The probability p must be greater than 0 and lower than 1"
    elif sigma < 0. then failwith "The standard deviation sigma must not be negative"
    elif sigma = 0. then mu
    else
        let q = p - 0.5
        let value =
            if abs q <= 0.425 then          // 0.075 <= p <= 0.925
                let r = 0.180625 - q * q
                let a0, a1, a2, a3, a4, a5, a6, a7 =
                    3.387132872796366608, 133.14166789178437745, 1971.5909503065514427, 13731.693765509461125,
                    45921.953931549871457, 67265.770927008700853, 33430.575583588128105, 2509.0809287301226727
                let b0, b1, b2, b3, b4, b5, b6, b7 =
                    1., 42.313330701600911252, 687.1870074920579083, 5394.1960214247511077,
                    21213.794301586595867, 39307.89580009271061, 28729.085735721942674, 5226.495278852854561
                q*(((((((a7*r + a6)*r + a5)*r + a4)*r + a3)*r + a2)*r + a1)*r + a0)
                    / (((((((b7*r + b6)*r + b5)*r + b4)*r + b3)*r + b2)*r + b1)*r + b0)
            else                            // closer than 0.075 from {0,1} boundary
                let r = sqrt -(log (if q > 0. then 1. - p else p))
                let val1 =
                    if r <= 5. then         // <==> min(p,1-p) >= exp(-25) ~= 1.3888e-11
                        let a0, a1, a2, a3, a4, a5, a6, a7 =
                            1.42343711074968357734, 4.6303378461565452959, 5.7694972214606914055, 3.64784832476320460504,
                            1.27045825245236838258, 0.24178072517745061177, 0.0227238449892691845833, 7.7454501427834140764e-4
                        let b0, b1, b2, b3, b4, b5, b6, b7 =
                            1., 2.05319162663775882187, 1.6763848301838038494, 0.68976733498510000455,
                            0.14810397642748007459, 0.0151986665636164571966, 5.475938084995344946e-4, 1.05075007164441684324e-9
                        let r1 = r - 1.6
                        (((((((a7*r1 + a6)*r1 + a5)*r1 + a4)*r1 + a3)*r1 + a2)*r1 + a1)*r1 + a0)
                            / (((((((b7*r1 + b6)*r1 + b5)*r1 + b4)*r1 + b3)*r1 + b2)*r1 + b1)*r1 + b0)
                    else                    // very close to  0 or 1
                        let a0, a1, a2, a3, a4, a5, a6, a7 =
                            6.6579046435011037772, 5.4637849111641143699, 1.7848265399172913358, 0.29656057182850489123,
                            0.026532189526576123093, 0.0012426609473880784386, 2.71155556874348757815e-5, 2.01033439929228813265e-7
                        let b0, b1, b2, b3, b4, b5, b6, b7 =
                            1., 0.59983220655588793769, 0.13692988092273580531, 0.0148753612908506148525,
                            7.868691311456132591e-4, 1.8463183175100546818e-5, 1.4215117583164458887e-7, 2.04426310338993978564e-15
                        let r1 = r - 5.
                        (((((((a7*r1 + a6)*r1 + a5)*r1 + a4)*r1 + a3)*r1 + a2)*r1 + a1)*r1 + a0)
                            / (((((((b7*r1 + b6)*r1 + b5)*r1 + b4)*r1 + b3)*r1 + b2)*r1 + b1)*r1 + b0)
                if q < 0. then -val1 else val1
        sigma*value + mu

I won’t get angry if some of you think the code is ugly; I’m afraid such is the way of evaluating several polynomials (each pair carefully tuned for a specific segment) and then dividing them. At any rate, I invite you to compare this version with the C# version (let alone the C++ version); I hope you find the F# implementation terser. You can download a Visual Studio 2010 sample project here:

Please remember I have made only a handful of tests, so I encourage you to do (a lot of) additional testing if you plan to use this function for anything even slightly serious.
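
To make the “evaluate two polynomials, then divide them” idea a bit more concrete, here is a minimal sketch that re-expresses only the central branch of probit (0.075 <= p <= 0.925, standard normal) with a small Horner helper. The names horner, centralNumerator, centralDenominator and centralProbit are mine, introduced just for illustration; the coefficients are the a and b values of the first branch above, listed from the highest power down to the constant term.

```fsharp
// Horner's rule: coefficients go from the highest power down to the constant term.
let horner (coefficients: float list) (x: float) =
    coefficients |> List.fold (fun acc c -> acc * x + c) 0.0

// a7 .. a0 of the central branch (illustrative names, same numbers as in probit)
let centralNumerator =
    [ 2509.0809287301226727; 33430.575583588128105;
      67265.770927008700853; 45921.953931549871457;
      13731.693765509461125; 1971.5909503065514427;
      133.14166789178437745; 3.387132872796366608 ]

// b7 .. b0 of the central branch
let centralDenominator =
    [ 5226.495278852854561; 28729.085735721942674;
      39307.89580009271061; 21213.794301586595867;
      5394.1960214247511077; 687.1870074920579083;
      42.313330701600911252; 1.0 ]

// Standard normal quantile, valid only while |p - 0.5| <= 0.425
let centralProbit p =
    let q = p - 0.5
    if abs q > 0.425 then invalidArg "p" "this sketch only covers 0.075 <= p <= 0.925"
    let r = 0.180625 - q * q
    q * horner centralNumerator r / horner centralDenominator r

printfn "%.4f" (centralProbit 0.75)   // 0.6745, the upper quartile of the standard normal
```

Comparing a few such well-known quantiles (0 at p = 0.5, about 1.2816 at p = 0.9) against a statistics table is a cheap first test; the two tail branches would need their own checks.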

F#, the ACM, and the SEC

It all started with a tweet from @mulambda: “Phil Wadler lists #fsharp as a candidate for SEC regulation spec language: http://tinyurl.com/2edfxka”. I downloaded the answer of no less than the Association for Computing Machinery to the Securities and Exchange Commission proposal (and request for comments) on requiring Python programs to be provided to explain contractual cash flow provisions. I quickly skimmed the ACM document and tweeted “Java, C#, and F# recommended by the #ACM for SEC regulation spec language http://is.gd/e49Vt #fsharp /via @mulambda”. Later, I read the ACM answer with more care and found that I really should clarify my tweet:

  1. First of all, nowhere does the ACM explicitly recommend Java, C# or F# (sorry for my over-enthusiastic extrapolation :-$)
  2. What the document does do is put Java, C#, and F# in a (very?) positive light in several places:
    1. “Safe execution of code written by one party on a machine owned by a different party was not a strong concern in the design of Python. It was a strong concern in the design of other systems, including Java and the .Net framework (which supports multiple languages, including C#, F#, and Iron Python)” Page 4. :-)
    2. “The widely accessible interpreted implementations of Python and Perl execute more slowly than compiled languages such as Java, C#, F#, and Scheme by as much as one or two orders of magnitude” Page 4. :-)
    3. “Python has a specification written in English, but its specification has not received the same careful attention as those of Java or C#.” Page 5
    4. “Java, C#, and F# are designed to be executed efficiently, Python and Perl implementations are significantly less efficient” Page 6 :-)
    5. “Statically typed languages are generally considered to produce more reliable and easier to maintain code, while dynamically typed languages are generally considered to produce more flexible code and to be better suited for prototyping. Java, C#, F# are statically typed; Python and Perl are dynamically typed” Page 6. The bolding is mine; this one is especially controversial, but I for one (more like 1 cent) think it’s a good point.
    6. “Some programming languages have been designed with security in mind, and some of their implementations include “sandboxes” that can securely execute untrusted code. Java, C#, and F# are such languages; Python and Perl are not” Page 6 :-)
    7. “Java and C# have had significant effort put into producing precise specifications; F#, Python, and Perl less so. SML and Scheme possess full formal mathematical specifications; F# is descended from SML”
  3. What the ACM does explicitly recommend (page 1):
    1. “Other programming languages may be more appropriate than Python”
    2. “Security should be enhanced by executing the code in a "virtual sandbox." ”
    3. “The SEC should consider developing a domain-specific programming language or library for writing waterfall programs.”

And it is this last recommendation on DSLs that carries interesting news for F#, because at the end of page 6 it states “Experience seems to show that higher-order programming languages such as F# provide a particularly good basis for domain-specific languages. There are financial domain-specific languages available in F#”. High five! And for an encouraging ending, on page 7, answering the question “Is it appropriate to require issuers to submit the waterfall computer program in a single programming language, such as Python, to give investors the benefit of a standardized process?”, the answer includes “Other choices, such as F#, may be more appropriate than Python” :-).

So, I think I should really replace my original tweet with something like this:

Java, C#, and F# mentioned by the ACM as possibly better alternatives for the SEC regulation spec language. Furthermore, the ACM recommends the creation of a DSL and mentions F# as a good candidate for doing so. :-)

Finally, let me clarify that I have nothing against Python. On the contrary, I like most of (the little I know of) it, and I’m fully aware of the myriad working solutions written in Python, so more power to the Python people! It’s only that I think that, looking forward, particularly in the financial industry, we are better served by functional languages like F#.

A quick and dirty implementation of Excel NORMINV function in C#

We are piloting the implementation of some financial risk models in F#. It so happens that the models are already implemented in Excel, so I was slowly digging the formulas out of the cells and translating them to F#. Everything was going fine until I found out that some formulas used the NORMINV function, which doesn't exist in the .NET libraries. I started to look for F#, and then C#, implementations without luck (as we are just in the let's-see-if-this-has-any-chance-of-flying stage, we can’t afford any of the excellent but paid numerical libraries for .NET). The closest thing I found was a C++ implementation. The code looked really weird to me (my fault, not the coder's), so I decided to do the translation in two steps: first from C++ to C#, then on to F#. The C# translation seems to be working now, and you can download it from SkyDrive:

Please be aware that:

  1. I am not an expert in statistics by any stretch of the imagination
  2. Ditto for numerical methods
  3. I have made only a handful of very basic tests

Having said that, the function *seems* to be working, so I hope it will help somebody :-).

Posted by Edgar Sánchez with 15 comment(s)

Scrum vs. CMMI Level 3

Of late, I have been helping start a Microsoft SDL implementation effort and, as part of it, comes the decision of which flavor of MSF we should use: Agile (Scrum nowadays) or CMMI (roughly Level 3 with the Team Foundation Server template). Now, this is a corporate customer that expects budgets and schedules to be defined before green-lighting any sizeable project, so we naturally lean towards CMMI, but I can’t help remembering all the formal methodology implementation efforts I’ve seen (and sometimes helped :-$) fail (RUP, anyone?). So, after a few years, I am reading about the subject again, and in Chapter 7, “Effective Change Leadership for Process Improvement”, of Michael West’s Real Process Improvement Using the CMMI (ISBN 0849321093), I find these pearls of wisdom on the behavior of top management:

You are not exempt from the history or the statistically documented behavior patterns of your peer executives, you are not an exception, so here is what you’re going to say to the organization when it comes to CMMI or process improvement:

  • We must get to CMMI Level (pick your number, 1 to 5) to remain competitive.

  • I want everyone to support the CMMI effort.

  • Your job (or promotion or performance review or raise, you choose) depends on your contribution to CMMI (or getting the maturity level).

  • The long-term viability of our enterprise relies on our process capability, not on our individual heroics.

  • These are our processes and I expect all of you to follow them.

  • CMMI is really important to all of us and our future.

  • I expect all of you to embrace change and do things differently.

Those are the things you’ll say. Here is what you’re going to do or not do:

  • When push comes to shove, when you are forced to choose between heroics and process to get the product out the door, you will choose heroics.

  • You will reward the heroes, the people who work all night or all weekend to solve a customer problem. You will not reward the silent, humble engineer who puts quality first and prevents problems from ever getting to the customer.

  • You will ask for status of the CMMI effort in terms of CMMI compliance. You will not ask for measures that indicate improvements in productivity, quality, cycle time, employee satisfaction, customer satisfaction, organizational learning, or other measures of operational excellence.

  • You will not ask the process people to give you estimates for effort, cost, and schedule to achieve a CMMI maturity level. You will give them a target date that is tied to your bonus.

  • You will tell the people who report to you that you want their support of the CMMI effort. You will not give them any incentive, positive or negative, for that support.

  • You will not bother to learn the organizational processes yourself; they are for everyone else.

  • You will not personally exhibit the change in behaviors you expect from others.

Alas, as so many who have gone before you, you will fail. Oh, you will get your maturity level/bonus/promotion/raise/praise/plaque on the wall, but make no mistake, you will have failed in leading your organization to change and grow.

So exactly and painfully true! I really don’t have an answer for the customer yet, but I will certainly read as much (and as fast) as I can of Michael West’s book.


Visual Studio 2010 and .NET Framework Beta 2 available on Wednesday

This blog has been abandoned for the longest time :-$ but I’ve got great news to try and re-inaugurate it (again): it’s just been announced that beta 2 for Visual Studio 2010 and .NET Framework 4 will be available the day after tomorrow, i.e. on October 21st; moreover, we now have a firm date for the launch of the final versions of these products: March 22nd 2010. There is a lot of cool stuff in the new versions of Visual Studio and .NET Framework but my personal favorites (at least for the time being :-) are:

  1. The inclusion of dynamic programming elements in C# and other framework languages. Good things from languages like Python, Groovy, or Ruby are now an integral part of C#.
  2. The inclusion of F# as a first-class language of the .NET Framework: welcome, functional programming, to the most popular commercial development platform on the planet!
  3. A new version of Entity Framework, making the official .NET ORM tool more flexible and adequate. Less and less will we need to code SQL by hand and laboriously move data from tables to objects and vice versa.

As I said, there’s a lot of cool stuff in Visual Studio 2010 and .NET Framework 4, but I plan to start using those three things in customer projects as soon as possible. After all, to have success stories by March 2010, we have to start working before this year ends.

Distance between adjacent points in F#

Let’s say you are given a list of data points:

[7;5;12;8;5]

And you are asked to find the distance between every adjacent pair, that is:

[(7-5);(5-12);(12-8);(8-5)] = [2;-7;4;3]

It turns out that there is an elegant solution to this problem:

let rec pairs = function
  | h1 :: (h2 :: _ as tail) -> (h1, h2) :: pairs tail
  | _ -> []


let distances dataPoints =
  dataPoints |> pairs |> List.map (fun (a, b) -> a - b)

The magic happens in the pairs function: it takes [7;5;12;8;5] and turns it into [(7,5);(5,12);(12,8);(8,5)], that is, it creates a tuple with every member of the list and its right neighbor. The trick is that tail gets bound to ( h2 :: _ ), so that the recursive call processes the list starting with h2. This is called a named subpattern, something I discovered in section 1.4.2.2 of F# for Scientists. You learn at least one nice thing every day!
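
For comparison, here is a short sketch of the same computation using List.pairwise, a core library function available in more recent F# releases (F# 4.0 and later) that builds the list of adjacent pairs for us. The name distancesPairwise is only illustrative, chosen so it does not clash with the distances function above.

```fsharp
// List.pairwise turns [7; 5; 12; 8; 5] into [(7, 5); (5, 12); (12, 8); (8, 5)]
let distancesPairwise dataPoints =
    dataPoints |> List.pairwise |> List.map (fun (a, b) -> a - b)

printfn "%A" (distancesPairwise [7; 5; 12; 8; 5])   // [2; -7; 4; 3]
```

The hand-written pairs function is still worth keeping around as an exercise in pattern matching; the library call is simply the shorter route when it is available.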


Entity Framework, LINQ to SQL and Oracle

Amid the debate about which is better and which has more of a future (two things that do not necessarily go together), LINQ to SQL or Entity Framework, one thing they have in common is that Oracle is in “no comment” mode about both of them. It’s as if Oracle expected that the lack of an “official” provider for Entity Framework, let alone LINQ to SQL, would somehow move people to develop in Java instead of the .NET Framework. IMHO, Visual Studio 2008 is so productive that people may first consider moving from Oracle to SQL Server before moving from VS 2008 to JDeveloper.

Luckily, the .NET Framework has a big ecosystem of developers and ISVs: enter Devart, a software house in Russia or Ukraine (I’m not sure). They have been offering an Entity Framework provider for Oracle for a while now, and I have had the chance to use it with Oracle 10g with good success. The good news is that a few days ago they released new versions of all of their providers (changing their names while at it), including dotConnect for Oracle 5.00. Even more intriguing, this new version includes a LINQ to SQL provider for Oracle, something supposedly so complex to build that it would take a long time before one even existed. To be fair, I haven’t used this last provider yet, but the very fact that it’s available is exciting. Now Oracle-friendly Visual Studio 2008 developers (no, that’s not an oxymoron at all) have two good paths to follow. Let the debate begin!

Free F# libraries (well, almost)

In what was one of the very last PDC2008 sessions, Luca Bolognese did an encore presentation of F#. Instead of trying to tell you what it was all about, I invite you to watch the video (Luca is engaging and funny, and the session is so packed with information that one hour will pass in no time). What I want to do is talk about a couple of very interesting libraries, both written in F#, that Luca used in his demos:

F# for Numerics offers a bunch of numerical analysis functions, things like matrix operations, integration and differentiation, statistical methods, maximization and minimization, Fourier transforms, you know, the stuff we all love about maths [:P].

F# for Visualization allows us to visualize functions in 2D as well as 3D, including animations and PNG export. Believe me, the graphs really look good.

If you are hesitant about this, Jon Harrop, the man behind the libraries, is offering free licenses of both libraries, well, with the usual banners and watermarks reminding you that you should really buy the real thing. Not that they are expensive either: you can get both for around US$ 100.

Personally, I find the very fact that these libraries exist a really good sign: a great symptom of a language’s or technology’s readiness for the market is that libraries from third parties start to appear (as, for example, has happened in recent months with WPF, but that’s a matter for another blog entry…)

While we are on the subject, and for those of you who are really intrigued by scientific applications, I strongly suggest you take a look at F# for Scientists, the book where Jon tells us how to use F# in this field.


Point distance, imperative vs. functional style

Let’s consider a silly simple algebra problem: given a specific point and a set of several other points, find the closest point in the set to the given point. One C# solution is:

static class PointMath
{
    static double Distance(Point p1, Point p2)
    {
        double xDist = p1.X - p2.X;
        double yDist = p1.Y - p2.Y;
        return Math.Sqrt(xDist * xDist + yDist * yDist);
    }

    public static Point ClosestPoint(Point p, IList<Point> points)
    {
        double shortestDistance = Double.PositiveInfinity;
        Point closestPoint = null;

        foreach (var point in points)
        {
            double distance = Distance(p, point);
            if (distance < shortestDistance)
            {
                shortestDistance = distance;
                closestPoint = point;
            }
        }

        return closestPoint;
    }
}

The Distance() function finds the distance between two points with good ol’ Pythagoras; the ClosestPoint() function does the classic loop: traverse the points list and calculate every distance, and whenever you find a smaller one, keep it and also keep the current point; at the end, return the last point you kept. Easy, but with declarations, curly braces and whatnot, the solution takes 27 lines, OK, 14 lines if we ignore the blank lines and the curly-braces-only lines. And we didn’t even show the Point class definition… Can we do any better?

What about this F# solution:

let distance (x1, y1) (x2, y2) : double =
  let xDistance = x1 - x2
  let yDistance = y1 - y2
  sqrt (xDistance * xDistance + yDistance * yDistance)

let closestPoint toPoint fromPoints = List.minBy (distance toPoint) fromPoints

The distance function is almost a clone of its C# cousin; the sexy one is the closestPoint function: just one line! Let’s try to dissect it a little bit:

  1. The List.minBy function (spelled List.min_by in early F# releases) expects two parameters: the last one is the list from which the minimum will be picked, the first one is the function that computes the value used to compare the items
  2. As I said, distance expects two parameters, but we are providing just one (toPoint); what are we accomplishing with this? Well, to begin with, the type of the distance function is (my apologies for the relaxed syntax) Point x Point -> double (i.e. distance takes two points and returns one double). If we fix one of those parameters (for example by saying distance (0.5, 0.5) ), what we are actually doing is defining a new function of type Point -> double; this new function only knows how to calculate distances to the point (0.5, 0.5). In our case, in line 6 we have created a function that knows how to calculate the distances to toPoint
  3. So, List.minBy finds the point in fromPoints closest to toPoint, calculating every distance to toPoint and keeping the fromPoints member with the shortest distance
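
The partial application described in point 2 can be tried in isolation. The following is just a small sketch: distanceToCenter and candidates are names and data I made up for the example, and List.minBy is the current spelling of the library function (older F# releases called it List.min_by).

```fsharp
let distance (x1, y1) (x2, y2) : double =
    let xDistance = x1 - x2
    let yDistance = y1 - y2
    sqrt (xDistance * xDistance + yDistance * yDistance)

// Fixing the first argument yields a new function of type (float * float) -> float
// that only knows how to measure distances to (0.5, 0.5).
let distanceToCenter = distance (0.5, 0.5)

// Hypothetical sample data
let candidates = [ (0.0, 0.0); (0.4, 0.6); (1.0, 1.0) ]

// List.minBy applies distanceToCenter to every candidate and keeps the smallest.
let closest = List.minBy distanceToCenter candidates
printfn "%A" closest   // (0.4, 0.6)
```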

I can hear some of you complaining that closestPoint may have taken just one line of code, but it took seven lines of explanation. This is mainly because we are not used to the language; once you are familiar with F#, the meaning of line 6 comes quite naturally. Is there any small imperative problem that you would like to see solved in a functional style?


The first Visual F# CTP is here!

You leave on vacation for one short week and a lot happens... for example, Don Syme & co. have released the first F# CTP, well on the way (hopefully before this year's end) to putting F# on the same level as C#, C++ or VB.NET. As far as I know, this will be a historic event: for the first time a mainstream platform (commercial or otherwise) wholly adopts a functional language. Allow me to seize the occasion to reiterate that there are several reasons for the functional programming paradigm to be considered interesting and important; IMHO the most relevant are:

  1. The use of higher-order functions and lazy evaluation allows us to reach higher levels of abstraction and modularization, which eases the programming of the ever more complex problems that we face.
  2. The extensive use of immutable values and structures paves the way to parallel code execution, which is especially relevant in today's multi-core CPU world.
  3. Functional thinking adapts particularly well to the solution of math problems (be they symbolic or numeric). Well, this last point may not have as broad a reach as the former two, but it is especially fascinating and useful for mathematicians and scientists (who in more than a few cases have had to stick to FORTRAN or use very specialized products like Matlab or Mathematica).
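
As a tiny illustration of the first point (my own example, with made-up names), the sketch below combines higher-order functions with a lazily evaluated, conceptually infinite sequence. Only the elements that are actually requested ever get computed.

```fsharp
// A lazy, conceptually infinite sequence of squares: 0, 1, 4, 9, ...
let squares = Seq.initInfinite (fun i -> i * i)

// Seq.filter and Seq.take are higher-order building blocks; nothing runs until Seq.toList asks for values.
let firstEvenSquares =
    squares
    |> Seq.filter (fun n -> n % 2 = 0)
    |> Seq.take 5
    |> Seq.toList

printfn "%A" firstEvenSquares   // [0; 4; 16; 36; 64]
```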

For reasons like these, functional programming has steadily invaded the programming scene, for example:

  1. C# 3.0 has lambdas and higher-order functions (or a subset, at least, sort of...) and lazy evaluation (LINQ anyone?)
  2. In the Java world, people are starting to consider Scala, a functional language for the JVM
  3. There's renewed interest in specialty languages like Erlang (used at Ericsson to forge extremely scalable and reliable systems)
  4. F# will be a first class citizen in the .NET Framework world

An example of the F# September CTP running on my Visual Studio 2008 SP1:

[Screenshot: F# September CTP running in Visual Studio 2008 SP1]

If anybody is wondering what this code does, fibonacciSequence generates the Fibonacci numbers up to a given maximum, and the second function adds the even terms of the sequence up to a given limit. It's a succinct solution for Problem 2 @ Project Euler (and it would be interesting for you to try and solve it in your favorite language). It's all quechua to you? Well, that's just a matter of getting to know the language :-). IMHO the best way to learn F# is to follow the Expert F# book, co-authored by the language's father himself. Furthermore, Microsoft has put online the official site for the language, the F# Developer Center, where you will find several other resources.
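
The original code only survives as a screenshot, so what follows is merely a sketch of how such a solution might look in current F#, not the code that was shown. fibonacciSequence is the name mentioned above; sumOfEvenTerms is a hypothetical name for the second function, and starting the sequence at 1, 2 follows the Project Euler statement.

```fsharp
// Lazily generates 1, 2, 3, 5, 8, ... and stops once a term would exceed maximum.
let fibonacciSequence maximum =
    (1, 2)
    |> Seq.unfold (fun (current, next) ->
        if current > maximum then None
        else Some (current, (next, current + next)))

// Adds up the even-valued terms that do not exceed limit (hypothetical name).
let sumOfEvenTerms limit =
    fibonacciSequence limit
    |> Seq.filter (fun term -> term % 2 = 0)
    |> Seq.sum

printfn "%d" (sumOfEvenTerms 100)   // 2 + 8 + 34 = 44
```

Calling sumOfEvenTerms 4000000 gives the answer to the actual Project Euler problem.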

By the way, it seems like the official name of the release will be Visual F# 1.0 (which actually corresponds to Version 2, counting from its inception at Microsoft Research). Finally, Visual F# requires only .NET 2.0, and an intriguing consequence of this is that you will be able to run F# code on Linux, thanks to the Mono project; that is, you will have an open source functional programming platform, courtesy of Microsoft (and Novell).
