Haskell Liftoff

I’ve been reading about Haskell for a year or so, but it’s only lately that I’ve really started to do any coding in the language. It’s really good to finally write some code of my own and see what the compiler thinks about it.

I had to get a few things set up on my computer before I was motivated to jump in; one was the Haskell Platform, another was the Haskell plugin for Eclipse, and the third was the Gtk2Hs packages for hooking Haskell up to OpenGL. I wanted to do something with Haskell that I could see on the screen, so I started working with The Haskell School of Expression: Learning Functional Programming Through Multimedia. Gtk2Hs offers a graphics library soegtk that specifically supports, and maintains API compatibility with, this book.

Anyway I thought I’d relate some of what I’ve been going through. I hope that even if you know Haskell, there may be something of interest here.

I do have a bit of a leg up because I’ve become familiar with certain basic things through reading, but there’s nothing like actually getting down to it. Still, I had to go lie down for a while after reading Exercise 2.4:

Define a function convex :: Shape -> Bool that determines whether or not its argument is a convex shape (although we are mainly interested in the convexity of polygons, you might as well define it for each kind of shape).

Right! What’s a cubit?

OK so at least I should show how far the book had brought me along before hitting me with this. We started with some type definitions:

type Radius = Float
type Side = Float
type Vertex = (Float, Float)

These are just typedefs. They don’t enforce anything. Radius and Side are both actually Float, so if you mix them up, Haskell won’t care. They’re just convenient aliases for existing types that help you express your intent, as we’ll see in the definition of Shape below.

That last type, (Float, Float), is a tuple: a pair of Floats. A tuple has a fixed length, and a fixed type in each position.

data Shape = Rectangle Side Side
           | Ellipse Radius Radius
           | RtTriangle Side Side
           | Polygon [Vertex]

Unlike type, the data keyword does define a new type. This type is named Shape, and the things separated by the | symbols — Rectangle, Ellipse, RtTriangle, and Polygon — are constructors.

In Java, C++, and ActionScript, constructors must have the same name as the type they construct. In these object-oriented languages, types are realized as classes. In ActionScript, in fact, the constructor is so closely identified with the class that a class can have only one constructor. (This is true of haXe as well — but in haXe the constructor must be named new.) In Haskell, one constructor may have the same name as the type it constructs, but that wouldn’t be useful here because we’re not interested in constructing an abstract Shape. (I don’t know enough about Haskell to say that we would never want to do that, though.)

The type names following the constructor constitute its signature; thus Rectangle takes two parameters of type Side, and so on. Haskell doesn’t use parentheses to indicate function application. So clean.

That last type, [Vertex], is a list of Vertices. Unlike a tuple, a list can have only one type of element, but it may have any number of elements, from none (the empty list []) to infinity (for example the list of positive integers [1..]).

The next thing I want to mention, though I didn’t hear it from the SOE book, is this: constructors don’t have to have any parameters.

So what does that look like? How about this:

data Bool = True | False

And there you have it. Haskell has no need of a special Enum type or keyword. It’s just part of the system.

If you’re a Flash programmer who’s been looking at Enums in haXe you may have been bedazzled by the fact that the enums have constructors and the constructors can have parameters. Well, we’ve just seen that little mashup from the FP side of things.

Moving back to geometry, we now define our first function. The exercise is: “define rectangle in terms of Polygon.” (Types and constructors start with an upper-case letter; functions start with a lower-case letter. This is not just a convention.) First we’ll declare the type of the function:

rectangle :: Side -> Side -> Shape

OK this looks pretty weird from the imperative-language perspective. What it says is that rectangle takes two arguments of type Side and returns a Shape. But the -> symbol appears to have two different uses: separating the argument types from each other, and separating the argument types from the return type. (It’s like that in haXe, too!) We’re going to explain that little quirk after a while, but for now let’s just put up a stickie and push on.

rectangle s t = Polygon [(x, y), (-x, y), (-x, -y), (x, -y)]
                where x = s/2
                      y = t/2

This is pretty much plain English or plain Mathish anyway, so I’ll just note that in Haskell it’s considered cool to use one- and two-letter variable names.

You should now be able to define the functions square and circle.

How about another function:

area :: Shape -> Float

There are four different ways a Shape can be constructed, and they all have different formulas for calculating the area, so how do we write this function? You might think we’d have to say something like

area s = if s is a Rectangle then bla bla bla...

But no, what we do in Haskell is provide several equations for the function, with different parameter specs. When it’s time to execute the function, Haskell will go through the equations and use the first one that matches the actual arguments.

area (Rectangle s1 s2) = s1 * s2
area (RtTriangle s1 s2) = s1 * s2 / 2
area (Ellipse r1 r2) = pi * r1 * r2

The pattern matching at work here is just one of the ways that the flow of control is built into the language. In many ways, the programmer takes care of getting the data structures right, and the compiler handles the branching, the looping, … the CONCURRENCY… In this case we’ve seen Haskell take care of some branching for us.

Now we also want the area function to handle the last kind of Shape, the Polygon. Polygon is defined by a list, so before we can write that equation we have to back up a bit and take a digression about lists, how they’re represented, and how we can operate on them. This is going to show us how we can implement loops in a language lacking for or while — well, it’s no secret really, we use recursion.

In the rectangle definition above, we constructed an explicit list of four Vertices. When we pass a list into a function, we’re going to want to have a way to deconstruct it. So let’s introduce the : operator, pronounced “cons.” If xs is a list and x is another element, then x:xs is a new list whose head (first element) is x and whose tail (everything after the first element) is xs.

head (x:xs) == x
tail (x:xs) == xs

x and xs are very common variable names in Haskell. xs is pronounced “exes.” A list of xs.

These next few examples are from Real World Haskell.

-- square every element in the list
squares :: [Float] -> [Float]
squares (x:xs) = x * x : squares xs
squares [] = []

This function is complete, because the second pattern matches the empty list, and the first pattern matches any other list of Floats. Notice that x:xs matches when the list is just [x] — in this case xs is the empty list [].

x:[] == [x]

Optimization nerd alert: recurring on the tail results in what? Tail recursion! Which gives me the opportunity to mention another thing that Haskell (and other functional language implementations) do for you: they optimize tail recursion to run in constant space. The recursion doesn’t grow the stack like it does in your imperative language. So we’re cool there.

upperCase :: String -> String
upperCase (x:xs) = toUpper x : upperCase xs
upperCase [] = []

By the way,

type  String  =  [Char]

, so squares and upperCase have exactly the same structure. What’s different? The type of the list elements, and the function that’s applied to each element. Let’s generalize squares by extracting as a parameter the function that it applies:

-- f every Float in the list
ff_it :: (Float -> Float) -> [Float] -> [Float]

ff_it is a function that takes two arguments:
1. a function that takes a Float and returns a Float, and
2. a list of Floats;
and returns a list of Floats.

ff_it f (x:xs) = f x : ff_it f xs
ff_it _ [] = []

The underscore in the last pattern is a wildcard; if we fall through to this pattern we don’t need to match the f, because we won’t be using it anyway.

SO: if we define a function square

square :: Float -> Float
square x = x * x

then instead of using our special squares function we can say

-- square every Float in the list:
squared_xs =  ff_it square xs

and similarly

negate :: Float -> Float
negate x = -x

-- negate every Float in the list:
negated_xs = ff_it negate xs

Of course it gets tedious to write ff_it this and ff_it that all the time, so let’s bring back our squares function by defining it this way:

squares = ff_it square

Where’s the xs, though, right? We defined ff_it to take two arguments…

…OK so now we have to go back to the arrows in the function signatures and come clean about what they really mean.

Here’s the thing: every function in Haskell takes just one argument.

The -> operator is right-associative. The signature

multiply :: Float -> Float -> Float

is identical to

multiply :: Float -> (Float -> Float)

which we already know means “a function that takes a Float and returns (a function that takes a Float and returns a Float).”

multiply x y = x * y

add x y = x + y

As long as we supply at least one argument, any Haskell function will return something.

double :: Float -> Float
double = multiply 2

increment :: Float -> Float
increment = add 1

This is called partial application.
Can you define a Haskell function that takes no arguments? I don’t think so. Certainly there’s no function that just “does something” without returning anything. There’s no “Void” type for it to not return. (By the way, this is why in AS3 they changed Void to void: it’s not a type, it’s the lack of a type.)

Now let’s generalize upperCase the same way we generalized squares:

-- f every Char in the list
fc_it :: (Char -> Char) -> [Char] -> [Char]
fc_it f (x:xs) = f x : fc_it f xs
fc_it _ [] = []

And we now have two functions that are structurally the same but differ in only one thing: that type that appears as the argument and return types of f and as the element type of the list.

It would be great if we had variables that represent types, right?

-- f every a in the list
f_it :: (a -> a) -> [a] -> [a]
f_it f (x:xs) = f x : f_it f xs
f_it _ [] = []

Using our now-uncloaked knowledge of the -> operator, this is how we read the type declaration:
f_it is a function that takes:
.. a function that takes some type a and returns the same type;
and returns:
.. a function that takes a list of that same type a and returns a list also of that same type.

It’s customary to use letters from the beginning of the alphabet for type variables.

squares = f_it square
upperCase = f_it toUpper

There’s one more generalization of f_it that’s cool to make that wasn’t revealed by squares and upperCase, because they return the same type as their last argument (talking about it in the old pre-enlightenment way). But they could return a different type, say b, and we’d be covered by this more general and equally strongly-typed definition:

map :: (a -> b) -> [a] -> [b]
map f (x:xs) = f x : map f xs
map _ [] = []

(Of course this includes the case where a and b are the same.)
Whew, all that just to get to the map function that’s in the standard library and that you probably know about already in some other language! Too fast?

So clean. Just look at what a hash ActionScript makes of it: AS3 function map(callback:Function, thisObject:Object = null):Vector.<t>
HaXe does a somewhat better job : Functional haXe

So I hope you noticed that
a. Every function has no side effects either in or out; its action can’t change or be changed by anything external. Its return value depends solely on its inputs. They’re called pure functions.
b. no variable ever changes value. They’re immutable.

Ya think that might be thread-safe? Ah, the concurrency!

Now no useful language can have only pure functions; there would be no way to do any IO! So Haskell does have impure functions, but it makes it very clear where the membrane lies. In fact you can tell an impure function from its signature alone: it will always have the word “IO” in it!

I see I haven’t gotten back to the convex function, or even to the area of a Polygon, but I’m afraid our time is up for this week. If you thirst for more of this saga, leave a comment begging for it.

PS If you had any trouble with the square and circle definitions, here they are:

square s = Rectangle s s
circle r = Ellipse r r

  • http:blog.debreuil.com Robin Debreuil

    wata… … wata… beg…

  • http://robertpenner.com Robert Penner

    Love this article, especially since I’ve been reading up on Scala the last few weeks. I appreciated you sharing that you’ve been reading Haskell for a year; I was starting to worry that I was taking too long reading Scala without writing any.

    It’s interesting to see how much Scala is inspired by Haskell. I love how both challenge many assumptions about code structure and legibility. “You don’t actually need _____!” Does terseness make code easier or harder to understand? It depends on the situation, the language and who’s reading it. Haskell has a minimalist chic that will be Morse mush to some and Mondrian to others.

  • alan

    Thanks, fellas.
    @Robert: In the short time that I’ve actually been coding in Haskell I’ve already appreciated the power and relative ease of exploiting polymorphism to generalize functions and get closer to the essence of algorithms. If I get around to the next installment I’ll show how I was able to apply the same kind of abstraction that led to the map function to build up a set of functions for Polygons and then collapse that set into a single parameterized pattern. I have no doubt that my brilliant insights are the common stuff of some standard library; what I’ve done is barely the Hello World of Haskell.

    The patterns of programming stand out more when you reduce the clutter. You know how in ActionScript a loop becomes clearer and cleaner when you can say for each… instead of having to introduce an index that serves only to get you from one element to the next. Needing extraneous information just to make the program work is the definition of a low-level language. Here of course we have that same process of clarification taken somewhat further. And beyond polymorphism there’s something called volcano monitoring I mean polytypism, which I’ve just read a post about: “using induction on the structure of types.” What’s a cubit indeed!

    All this paring down was academically motivated for the purpose of facilitating reasoning about programs and proving program correctness. I’m interested in that as far as I can understand it, but at the same time I REALLY want to find out how Haskell works in the real world. There are people who have jobs programming in Haskell!

    But yeah, who’s reading it…

  • http://www.xxcoder.net Stray

    Fantastic post – you can tell you know a thing or two about teaching :)

    Syntax-sparse languages are something I’m still not sure about. Punctuation is an aid to comprehension (so my Penguin guide to semi-colons tells me) – how much is ‘just enough’? (Asked Goldilocks-programmer)

    Teaching AS3 reminds me of learning Russian (as an English-as-first-lang person) – you have to learn a whole new letter system before you can even begin with the words and phrases. Haskell and co (even Python and Ruby) are more like Spanish – you can pretty much get by with the 26 letters you know.

    This leaves me confident about *reading* the language, but making a total mess when I write it.

    But I love the whole way of thinking. It’s like double-distilled programming. Whatever language you earn your pennies in, it must be useful to have that mode of understanding sitting in your subconscious, poking you.

    All my multi-lingual (in the spoken sense) friends talk about the usefulness of being able to *think* in more than one comprehension-space. They might not use Bengali or Polish in their day-to-day conversations but they think a different kind of thought when their internal narrative switches tongue.

    There’s a Haskell bundle for TextMate… you’ve inspired me to get my toes wet! Please please please please write your next instalment soon :)

  • Patrick W.

    Ah, the memories of functional programming, list crunching and recursion with Miranda… it all comes back thanks to this fascinating read :)

    Thank you Alan, thank you professor Turner at UKC!

    Now I am eagerly looking forward to your next posts: how does visual programming, especially interactive graphics fit into the functional programming world?

    And more importantly: how can we use all this clean, functional power in the dirty realms of the Flash Player… is there an Alchemy solution by any chance?!

  • Pingback: A Year Of Scala « blog.joa-ebert.com – Blog of Joa Ebert

  • Shawn Morel

    great writeup. one point about the void type. It’s not actually the non-existence of a type but rather the Unit type:

    http://en.wikipedia.org/wiki/Unit_type.

    well almost, you’ll notice that ‘void’ in most c-like languages is not quite the same as the unit type.

    2 other articles you might enjoy along the lines of function arguments and types:
    http://blog.ezyang.com/2010/12/hussling-haskell-types-into-hasse-diagrams/

    http://blog.ezyang.com/2010/12/how-i-learned-to-stop-worrying-and-love-the-bottom/

  • Alan

    Thanks, Shawn, those links are giving me some food for thought indeed!

A sample text widget

Etiam pulvinar consectetur dolor sed malesuada. Ut convallis euismod dolor nec pretium. Nunc ut tristique massa.

Nam sodales mi vitae dolor ullamcorper et vulputate enim accumsan. Morbi orci magna, tincidunt vitae molestie nec, molestie at mi. Nulla nulla lorem, suscipit in posuere in, interdum non magna.