Friday 30 March 2007

Achieving Closure

I was having a chat over lunch a few days ago about lambda abstractions. The guy I was talking to didn't fully understand what they did and how they worked, but he's a smart guy so he understood it pretty quickly. I explained them using the below example, and I thought I'd have a shot at writing this up for other people to read.

Say you have the following functions in your code somewhere. The code is intentionally simple. If that's a problem, imagine that everywhere you see double, it actually reads encrypt.

int double(int val);

void doubleList(int values[], int valCount)
{
  for (int i = 0; i < valCount; ++i)
    values[i] = double(values[i]);
}

int double(int val)
{
  return val * 2;
}

Now, it's pretty inconvenient to have to declare and define a whole other function for something as short as double. Excuse me while I invent some syntax, but wouldn't it be convenient to be able to do something like:

void doubleList(int values[], int valCount)
{
  for (int i = 0; i < valCount; ++i)
    values[i] = function(int val) { return val * 2; }(values[i]);
}

And that bit with function... is a lambda abstraction or closure. What's happened here is that I've defined and called a function all on the one line. This function doesn't have a name, so it can't be called from anywhere else. But it is otherwise just like a normal function; it appears on the call stack, it can return values and it can even be passed parameters. Which is how values[i] ends up in the val local variable.

In fact the only unusual thing about this function is that it was defined 'inline.' And perhaps the syntax, but I invented that on the spot, so don't hold that against it. There are nicer ways to express this.

This sort of thing is known in some programming languages as a lambda abstraction. That name comes from something called the 'lambda calculus,' which doesn't really matter, but briefly it's a way of writing programs as a series of anonymous functions, each referred to as a lambda.

But why is it also called a closure? That's much more interesting because that name actually says something about what you can do with it.

Technically, it's called a closure because the anonymous function 'closes over' objects within the containing scope.

Which means almost nothing. Looking back at that inline doubling example above it did appear slightly unwieldy. Something like this is more readable:

void doubleList(int values[], int valCount)
{
  for (int i = 0; i < valCount; ++i)
    values[i] = function() { return values[i] * 2; }();
}

Notice now how the anonymous function no longer takes any arguments. How can that still work? The anonymous function is a separate function on the callstack, with its own local variables. How can it access the values array, or the counter i, for that matter?

Imagine the following code:

class DoubleFunction
{
  public:
    DoubleFunction(int values[], int valCount, int i)
    {
      mValues = values;
      mValCount = valCount;
      mI = i;
    }

    int apply()
    {
      return mValCount[mI] * 2;
    }

  private:
    int mValues[];
    int mValCount;
    int mI;
};

void doubleList(int values[], int valCount)
{
  for (int i = 0; i < valCount; ++i)
  {
    DoubleFunction func(values, valCount, i);

    values[i] = func.apply();
  }
}

So, what's happening here? We've defined a class with a constructor and private members that happen to match exactly all the local variables and arguments in scope where we want to construct our closure, this class has only one member function called apply that matches the body of the closure.

This code is precisely equivalent to the version with the closure. And in fact, it goes further than that. This version is how a compiler could implement closures. A closure is an object with exactly one method: apply. So, anytime you see a closure in code just imagine it as an object with a constructor that captures the current scope and you're there.

A few details:
  • Some may be jumping up and down to point out how inefficient the code with the class is. And they're right. But no compiler would actually do that, it's just a way to understand what's happening under the covers. Real closure implementations can be quite efficient.
  • That class implementation takes copies of the values in scope. A real implementation would actually take a reference to the exact same value. This is important because it allows the value to change between when the closure is created and when it's evaluated, and have the new value be used at execution time.
I said above that you can imagine the closure as an object. There is a consequence to that. You can take references to objects, pass those around and call methods on them at some point in the future. The exact same is true of a closure.

What do I mean and what use is this anyway? Well, in the following code I'm using an invented syntax for closures. The part in the parentheses to the left of the -> is the parameter list, and the type to the right of the -> is the return type.

void applyFunctionToEveryElement(int values[],
                                 int valCount,
                                 (int) -> int fn)
{
  for (int i = 0; i < valCount; ++i)
    values[i] = fn(values[i]);
}

void doubleList(int values[], int valCount)
{
  applyFunctionToEveryElement(values,
                              valCount,
                              function(int val)
                                { return val * 2; }
                             );
}

Here instead of evaluating the closure as soon as it is created, it's being passed into another function, which will evaluate it. Once for every element in the list. Which is exactly what you could do with an object. Big deal? Well, with that exact same applyFunctionToEveryElement function, you can also write the following functions. Oh, and applyFunctionToEveryElement is traditionally known as map, so I've used that name below - same function though, just a shorter name.

void printList(int values[], int valCount)
{
  map(values,
      valCount,
      function(int v) { printf("%d\n", v); return v; });
}

void scaleList(int values[], int valCount, int factor)
{
  map(values,
      valCount,
      function(int val) { return val * factor; });
}

Both of these functions do completely different things to a list of integers, but use the exact same function to perform the mechanical walking the list part. Congratulations, you've abstracted out iteration. The first function is pretty straight forward, it prints every value and leaves the list unchanged.

The second function is a little more tricky. The closure created there is using a local variable, which the map function has no knowledge of or access to. Remember that a closure captures all objects in scope at the time it is created, and those objects can be used anytime the object is evaluated. That's exactly what's happening here.

How that works requires a few more tricks, so I'll be leaving it for another post. It's related to why continuations and closures commonly go hand in hand. Look out for my crash course in continuations and how to implement closures coming soon.

For now, just know that this works. Closures are good, if your current language has them, use them. You won't regret it. If you, like me, find your language doesn't have closures, then...

Wednesday 28 March 2007

What Would You Do With It?

This is a question I've been thinking about a bit lately:
If you had a computer that was as smart as a human, what would you do with it?

It almost seems obvious; everyone wants a computer that has human level intelligence, people have been working towards that goal for literally 40 or 50 years, but what would they do with it if it was suddenly dropped on them?

OK, some clarification. It's a currently modern computer that is good at doing all the things computers are usually good at (arithmetic, finding files, storing lots of information) but it also happens to be really good at doing the sorts of things humans are usually good at. You know, making an 'educated' guess, pattern matching, seeing links and generally being partially intuitive.

It would not have any special hardware, it wouldn't be able to speak for example. It would look just like a computer, except for this extra bit.

So, what would you ask it? What would you do with it?

I think I can't actually answer this question as I'm not really a computer consumer anymore. Because of my job a computer is no longer just a tool, it's an end in itself. So for my purposes, I guess I'd want it to be able to write code really well. But seriously, what would you as a computer consumer want it to do?

Throwing this one open. Please comment or email me with your ideas, because I really do wonder...

Sunday 18 March 2007

shb75

Today was the 75th anniversary of the opening of the Sydney Harbour Bridge. To celebrate, the bridge was closed to traffic and people were allowed to walk across. North to South. We were there for the walk, along with a lot of other people. No rabid, right-wing monarchists though...

An icon and engineering marvel like this is really all about the photos though, so here you go.

Street

For a bit over a month, from the 20th of September 2004 until the 29th of October, I took a photo off the balcony of our place in Canberra. It was at pretty much the same time everyday, and pointing in almost exactly the same direction.

It was just an intersection down the road from us, at about peak hour.

Time lapse photography is cool though, and I managed to get some interesting shots.

21/09/2004

27/9/2004

1/10/2004

11/10/2004

19/10/2004

23/10/2004

27/10/2004

29/10/2004

Friday 16 March 2007

Giles' Third Law of Code

Do not OR two NOTs; it is probably not what you intend.
- October, 1997

Thursday 15 March 2007

Done.

Well, 10 years after first having the idea, it's now implemented. It needs work, but it's much easier to tweak and add features to an existing program than write something from scratch.


And if Joel is right and good software really does take 10 years, it must be pretty bloody good right now, eh? That's how it works, isn't it?

Tuesday 13 March 2007

Hyde Park Needs These

I hear it was for the St Patrick's Day concert on in the park last Saturday. I know it wasn't actually St Patrick's Day; it was moved because of the walk across the bridge. We'll be at that. Expect more photos.

tbpf

I was a private IT contractor when I lived in Canberra where the vast majority of the work was for government departments. At the time of writing this, I was working on a team called Revenue Re-engineering at the Australian Quarantine and Inspection Service. A friend asked me what ‘Revenue Re-engineering’ did, and the below was my response.

Before you go getting all upset and start sending me threatening e-mails and letter bombs, remember that this is my private corner of the world where I get to rant about everything I see as wrong. I may disagree with the way government does things, even when it’s the department or team I am working on, but if the direction is set out in legislation (as it is in the below case) then I have no right to interfere in my capacity as an AQIS employee. For better or worse the government is the elected representation of the people and the public service is just here to provide advice and then implement the policies of that government.

If I want things to change, then I am free to vote in elections and lobby members and senators. Until things change however, I will continue to rant and rave here.

AQIS is funded almost entirely by industry. When an inspector checks an importer's food and denies entrance because the food is full of BSE and Foot and Mouth, the importer is charged for the inspection. In fact, everytime some service is provided by AQIS to industry the company is directly charged.

Your average pig farmer has now got it into his tiny little mind that because he's paying AQIS to run it's services, then every cent he spends on AQIS is only going to be spent on himself or other tiny-brained pig farmers, not the accursed evil beef farmers down the road. Damn them and their evil beef ways!

Currently AQIS collects this money in a 'random' way. The tiny-brained pig farmer pays or doesn't pay, his choice, and AQIS notices, or doesn't notice. (So I ask you, who really is tiny-brained?) Revenue re-engineering is about changing the way AQIS collects the money from industry. Our first attempt was some software that'll supposedly make it easier to track. If that doesn't work, we're hiring some large Maori guys, and I'm sure they'll be very efficient.

The whole project shouldn't exist. The inspector who goes out is an expert in beating up flowers to find any bugs, they have science degrees and know a hell of a lot about botany, entomology and the like. Filling in all this paperwork to collect money off the TBPF is way outside his expertise, and he hates having to do it. There's this little government agency in a couple of buildings in Canberra called the Australian Taxation Office. They're really good at getting money off people, you could say that's all they do. Instead of this user pays rubbish, the ATO should just be taxing industry directly, we could even provide data feeds so they knew how many times each importer had imported something and tax off that.

Right now, the TBPF gives us $100 and insists that $50 of that be torn in administration fees to be sure that the other $50 only gets spent on him and other TBPFs, rather than see $25 go to chicken farmers, even if $75 would then go to the TBPFs.

And let's take a step back here. There's a name for this form of government spending: "User Pays." Some people will never be convinced of the moral and ethical reasons why "User Pays" is a Bad Idea, but here we have a nice solid economic rationalist reason. From the inside, it's a waste of money. The extraordinary expense in identifying which particular dollar needs to be spent on which particular program is pretty incredible. Consolidated revenue people, it just makes more sense.

Sunday 11 March 2007

A GNU Ethics?

An e-mail conversation I had recently:

I'm becoming concerned with the ethics of free and open source software.

> Why is that?

I think the entire aim of the GNU project is a bad idea. Here's my current thinking, you know most of this already, I'm still thinking it through so I want to spell it out:
  • Copyright (not patents) is intended as a means of paying people for thinking. In all fields this is regarded as a necessary thing.
  • By taking out a copyright you create a contract between yourself and your buyers to agree for your buyers not to share your 'secret' (your idea).
  • So if someone else wants your idea they come to you and pay you for it.
  • This allows a 'thinker' to 'unstrap themselves from the wheel' and earn a living in the future for original work done now, without having to work constantly to drum up that business.
  • This motivates smart people to go through some poverty now in the hope of earning a lot of money in the future.
  • This is a good thing for society as a whole, as new ideas get explored.

Up to this point we're all good; now, why GNU are wrong:
  • The underlying aim of GNU is to get rid of copyright on code. Code is an expression of an idea, like the contents of a book.
  • If there is no contract on secrets, then how can you be sure you will make money in the future if there's no reason people won't just give your code away.
  • RMS has thought of this: consulting. Programmers all become consultants, selling their services for now into the indefinite future.
  • However, consultancy doesn't create novel or new ideas, except in the very rare cases of a beneficial patron, but that only works in the art related fields: architecture and the like.
  • There is no longer any way to unstrap yourself from the wheel, and no motivation to spend the time thinking of a novel and new idea.
  • For a demonstration of this, have a look at the world of UNIX and particularly Linux. KDE has been going for nine years, and GNOME for eight years. There still isn't a decent desktop environment out of either, or any hope of one magically appearing, yet Apple was able to slap a attractive and easy to use interface on a variety of UNIX in pretty much the same timeframe (slightly less, actually.)

This is the hole we'll end up in if GNU takes over the world. That won't happen, but it doesn't have to, to cripple the software industry:
  • Currently anyone who releases a software product can only hope for it to become successful and then ripped off by some open source developers somewhere.
  • It is very easy to copy an existing product. It is very hard to add anything unique or original (look at Linux); it is even harder to come up with an original idea in the first place.
  • For an example of this behaviour look at BitMover's BitKeeper. This is a distributed, concurrent source code control system. This is a very hard problem to solve. The company only develops this, and there aren't many sales, as there aren't many places that actually need this level of power. The Linux kernel development does, and the product was generously offered to those developers for free. One of those developers then started working on an open source knock off of the client... In this case I would say there has been bad behaviour on both sides (I disagree with the restraint of trade clause BitMover included in the free version license) and Linus has been chirping in with some bizarre statements…

I'm just concerned that open source development is going to cause software development to completely stagnant by preventing small developers from actually making money off an original idea... Once those small companies have been killed there will be very, very few original ideas emerging.

> What about the cost of software and the fact that companies like us, > for example, are unable to afford the massive startup cost of using MS > or Oracle products?

  1. We don't need the power of Oracle.
  2. We could use MS (Web Edition + SQL Server 2005 Express)
  3. What about all the little companies that could develop and offer cheap and innovative low-end solutions for small companies like us, that have been squeezed out of the market by open source alternatives? Fine, if the OS software was there first, you just have to find a very good way to compete when entering a market like that. But, what about when the open source version is a followup knock off to your very good product?
  4. Arguing that open source should remain as it allows small companies to save money doesn’t really deal with any of the ethical issues.

You could argue that companies will just have to deal with the potential for free knock-offs to come along, and youll just have to accept that and find better ways to compete, by continually making yours better. Just deal, it's a new form of competition, essentially.

And that's true, however, I believe it's worth thinking about the damage this approach is doing to our industry. Just how much innovation is being stifled?

Oh, and people who read this and are actually involved in any of the stuff I'm talking about are probably going to find this pretty incendiary. So, I won't be replying to any comments directly. If there are enough (intelligent) comments, I might do a follow-up post. But, then who am I kidding? No one really reads this anyway...

Tuesday 6 March 2007

How and Why Did I Get into PLT?

PLT = Programming Language Theory, btw.
Programming languages have always appealed to me. Right from the beginning on the Apple //e I could tell that there was a substantial difference in their expressive power, though I didn't know that was what I was thinking at the time. I started to get more seriously interested interested when I did the unit in compilers (SIT311) at Uni, and realised that a compiler was a very useful approach to solving a lot of problems. But there were really three things that got me thinking in depth about PLT.

1) When rewriting ERT as Expert 6 we made a conscious and concerted effort to use all the language and library features of C++ that were appropriate. Initially this was about not re-inventing the wheel: if there was a good string class in the standard library then use that instead of writing our own. Overtime we realised that we could go further: using the language and library features as 'Barney' intended meant that we could occasionally prevent incorrect code in the runtime from compiling. This didn't happen anywhere near as often as I would have liked, so I started thinking about ways that a language could have saved us more.

2) Expert itself is actually a runtime to execute compiled programs written in a specialised logic language. Being exposed to and working closely with this sort of special-purpose language was an interesting experience. I could quickly see that using this kind of special-purpose language for rules was far more effective than trying to write the rules in a general-purpose imperative language. I even had arguments with other employees about how the runtime was not just 'a big switch statement.' There was one other developer who wanted to use Expert to write complex business rules other than legislative rulebases. I was very excited by this, but it never happened. While working on the Expert runtime and in particular debugging rulebases, I really noticed how the typing was inconsistent and extremely weak. Functions and in particular user-defined functions were also hopelessly constrained. I started to think about how typing could be done properly.

3) Strangely enough this is the main point that really introduced me to formal programming language theory. I read jwz's critique of Java (java sucks) and I really wanted to understand everything he said there. Mainly, what was a 'long-lived closure' and why did he miss them so much? Or a 'downward funarg?' So I started reading and searching. Eventually I stumbled upon Lambda-the-Ultimate and found a huge world of other terms and concepts. By this time emtap had evolved from a framework into a language and while reading LtU I discovered that a lot of these concepts would immensely help with the design and implementation of emtap.

Monday 5 March 2007

Mardi Gras!

The famous Sydney Gay and Lesbian Mardi Gras was last weekend. And I mean famous in the real sense, not the St Kilda Rd sense. We live literally just down the road from the main parade route, so I wandered up there to have a look.


When we were telling friends from Canberra where our place was someone asked "Will you be going to see the Mardi Gras?"

"No need; it's coming to us." And I'm glad it did. Obligatory window shot - there's a lot of this sort of thing living where we do.


Unfortunately, I was all on my own :-( but there was a fantastic party atmosphere. It'd be great to go with a group; and go early. You want to be able to get close to the barricades. Otherwise, you really won't see much.

Some photos. And again, sorry about the quality. There wasn't a lot of light, and I was holding the camera above my head and pointing it in vaguely the right direction. Oh well. Guess you'll just have to come see for yourself.