Coding for the Big Picture: April 2010

Wednesday, April 28, 2010

Dirty Tricks

Here is a type I'm using to record stats.

data UserzFormationViews

= UserzFormationViews

{ formationsViewed :: [(FormationId, UTCTime)]

, ...

}

Basically, the data is a monoid. Actually, it's even less restrictive than that, because the monoidal append on this piece of data is commutative (i.e. embarrassingly parallel). It's incredibly convenient to be able to just throw a pile of these together and tabulate the stats with a familiar mconcat.

At other times, it's useful to get all the user's views, in a form like this:

class RetrieveViews a where

getViews :: a -> [(UserId,FormationId,UTCTime)]

This obviously messes with the monoidal definition of UserzFormationViews. The initial object (i.e. mempty) would need a dummy value for the UserId. In more primitive languages, you'd just leave that as NULL, which is a dirty hack. And we don't like those kinds of dirty hacks in Haskell. What do we do instead?

instance RetrieveViews (UserId, UserzFormationViews) where

...

vs = getViews (zeUserzId,zeUserzViews)

So wrong and so awesome at the same time. :)

Wednesday, April 21, 2010

A Problem I've Encountered with Types

Everyone is familiar with how we can use types to encode checks into our code. For example:

data User
  = SystemUser SystemProcessInfo
  | AnonUser TimeLoggedIn
  | RegisteredUser UserId

data UserId
  = PaidUser (..)
  | FreeUser (..)

More generally, you can use type witnesses to mix together similar types in the same list, to mimic some aspects of dynamic typing.

The Problem:

Suppose you have a type witness, such that all the witnessed types derive some typeclass (or a similar situation where you are using types to enforce behavior). Is it possible to have the compiler automatically derive that typeclass for the type witness, i.e. something similar to GeneralizedNewtypeDeriving? This kind of boilerplate is yucky:

data A
  = A1 X
  | A2 Y
  | A3 Z

instance Foo A where
  getFoo (A1 x) = getFoo x
  getFoo (A2 x) = getFoo x
  getFoo (A3 x) = getFoo x

One area where this problem seems unavoidable is when you have to deal with collated data of different types. For example, say you have a list of items that were found to be complementary in terms of content, but are of different types. In such cases, type witnesses are unavoidable. If I can get a good solution to this problem, I'll be sure to post it.

Friday, April 9, 2010

Lazy IO with Haskell Monads

Haskell's treatment of imperative programming and side effects is possibly the most derided feature of the language. As someone who's written desktop apps, I'm certainly aware of the inflexibility that Haskell's sandboxing imposes. However, monads happen to mesh very nicely with Haskell's purity and laziness, and what they take away, they give back in other ways.

Suppose for example, you are designing an evil robot that goes out and steals wheels from people's cars. The robot can steal up to 4 wheels from a car, but because it tries to avoid being detected by witnesses (organic or otherwise), it may abort the job prematurely. You need it to fetch 20 wheels a night. No more, no less. Fewer, and you may not be able to pay the bills. More, and the police might start catching on.

For this simple example, you have a function that returns a monadic value:

stealWheelz :: Coordinates -> IO [Wheel]

Then you have some other pre-calculated list of random locations that you want the robot to steal from.

locations :: [Coordinates]

Now here's the funky part. You can define a function takeM, as such:

takeM :: (Monad m) => Int -> [m [a]] -> m [a]

takeM = takeM' []

takeM' :: (Monad m) => [a] -> Int -> [m [a]] -> m [a]

takeM' acc 0 xs = return acc

takeM' acc n [] = return acc

takeM' acc n (x:xs) =

do rs <- x

let rs' = take n rs

n' = n - length rs'

takeM' (acc ++ rs') n' xs

and then your list of tires stolen can simply be:

takeM 20 (map stealWheelz locations)

With this code, the robot will only steal as many tires is as necessary, even if the list of coordinates is of infinite length. Because Haskell is lazy, and IO routines (and all imperative programming that are modelled as monads) are treated as lazy values, you can pass around tasks, with all their arguments bound--even in a strict monad like IO--, and not have to deal with a lot of the usual glue that goes into determining how many results to return.

Btw, that means you can actually pass around the value:

let stealSomeWheelz = map stealWheelz locations

and take results from that list of actions only as you need them. This sort of thing is doable, but *much* less elegant in languages that do not treat actions as if they were values. When you design a language to operate via side effects, then you have to deal with those side effects immediately, as you arise. That means you can't easily reason about it at a higher level, even when you're trying to do simple things like "take the first n results".

As an example, I can apply a filter function to those results, as such:

-- Police commissioner is in town tonight with his Escalade

let stealSomeWheelz = map stealWheelz locations

filter noCaddyWheelzTonight <$> stealSomeWheelz

I discovered this trick when I needed to sanitize the first n queries for a particular area of interest, given a statically-ranked listed of sources. In a more general setting, removing the glue from imperative programming means that I can focus more on what I'm doing, rather than what I'm programming. That's a good thing.

I'd love to see more gems like this. I'm sure someone else has discovered this already, it's in a library somewhere, and I'm just yakking the obvious. I have a feeling that the functional community (and the rest of the world by proxy) is only beginning to discover the value of treating imperative routines as if they were monadic values. Indeed, at the time of this writing, there's a video on HN concerning monads in Clojure.

Coding for the Big Picture