Monday, November 8, 2010

An import trick too useful to pass up

So here is a trick I learned from reading the Snap code over the last week.

Namespace collisions suck. Data.Map, Data.Set and Data.List all have fairly similar functions that we all know and love to use, and they differ subtly, so people often import them qualified, i.e.

import qualified Data.Map as M
import qualified Data.Set as S
import qualified Data.List as L

Now the annoying thing about this is that you then have to qualify the type names in your signatures too, e.g.

foobar :: S.Set a -> a -> S.Set a

This is pissy, so what some genius who worked on Snap did was:

import Data.Map (Map)
import qualified Data.Map as M
import Data.Either (Either(Left, Right))

Now this sounds simple and all, but it actually works Much Better in practice than in theory, partly because type names like Map and data constructors like Left and Right rarely overlap from module to module.
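
Here's a small sketch of what the trick buys you, extended to Data.Set as well. The wordsByLength helper is made up for illustration (it isn't from the Snap code); the point is only that the signature reads cleanly while the functions stay qualified.

```haskell
import Data.Map (Map)
import qualified Data.Map as M
import Data.Set (Set)
import qualified Data.Set as S

-- Hypothetical helper, only here to show the imports in use:
-- bucket words by their length.
-- The signature says Map and Set, not M.Map and S.Set.
wordsByLength :: [String] -> Map Int (Set String)
wordsByLength ws =
    M.fromListWith S.union [ (length w, S.singleton w) | w <- ws ]
```

The functions that do collide (fromListWith, union, singleton) still have to be qualified, which is exactly what you want.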

Posted via email from lambdasquirrel's posterous

How *do* you tell good code from bad anyway?

So here's one straightforward metric for code cleanliness. Is there code that could've been purely functional that can't easily be extracted from the monadic wrappers you placed it in? I guess a similar test would apply in OCaml to the OO and functional code.

This is probably not as easy as it seems though, because there's plenty of stuff that looks like it could've been pure but is actually much better off effectful. Sometimes there's no substitute for experience.
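
To make the metric concrete, here's a toy sketch. All the names are made up for illustration. In the first version the formatting logic can only be reached by running IO; in the second it has been pulled out into a pure function that you can test on its own.

```haskell
import Data.Char (toUpper)

-- Entangled: the only way to exercise the formatting is to run the IO action.
greetEntangled :: IO ()
greetEntangled = do
    name <- getLine
    putStrLn ("HELLO, " ++ map toUpper name ++ "!")

-- Extracted: the same logic as a pure function.
shout :: String -> String
shout name = "HELLO, " ++ map toUpper name ++ "!"

-- The monadic wrapper is now just plumbing around the pure part.
greet :: IO ()
greet = getLine >>= putStrLn . shout
```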


Friday, September 10, 2010

Untitled

A few months ago, there was a useful tutorial on using CouchDB with Haskell. You can find the original here.

One weakness of these DB layers is that you have to verify your data, and the APIs to the input data are usually not typesafe. As it turns out, it's incredibly easy to use type-level programming to make your DB calls typesafe.

Let's take an idealized version of a typical DB get call:

getDBUnsafe :: (JSON a) => DB -> String -> IO a
getDBUnsafe = undefined

There are two weak spots here. The first is the part where we pass in the DB key, and the second is when we use the value. The latter is more insidious than the former, because whatever it is you're using to parse your JSON, it's probably a pure function, so if you're coding in the usual expedient way, you'll have no idea why the parse is failing, when the real cause is that you're fetching from the wrong DB.

So here's the framework for a solution that uses FunctionalDependencies and MultiParamTypeClasses to impose a constraint on the type of the DB key and stored value, based on the type of the DB.

class (JSON v) => DBTy a k v | a -> k, a -> v where
    getDBName :: a -> String
    getKey :: a -> k -> String

getDB :: (DBTy a k v) => a -> k -> IO v
getDB db k =
    getDBUnsafe (getDBName db) (getKey db k)

So how do we use this? Well, you just define a dummy type for a DB like so:

type UserId = Int

data Avatar
    = Avatar ByteString
    deriving (Eq, Show, Ord, Typeable, Data)

instance JSON Avatar where
    showJSON = toJSON
    readJSON = fromJSON

data AvatarsDB
    = AvatarsDB String

instance DBTy AvatarsDB Int Avatar where
    getDBName (AvatarsDB name) = name
    getKey _ = show

After that, you just replace your unsafe DB calls that look like this:

v <- getDBUnsafe "avatars" (show userId)

with this:

v <- getDB (AvatarsDB "avatars") userId

Oh and here's the stuff you need to paste at the top of all this to get it to compile.

{-# LANGUAGE FunctionalDependencies, MultiParamTypeClasses, DeriveDataTypeable #-}
module FundepsExample where

import Data.ByteString.Lazy (ByteString)

import Text.JSON
import Text.JSON.Generic

type DB = String
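
If you want to play with the idea without a CouchDB or a JSON library around, here's a self-contained sketch under some loud assumptions: the store is a made-up in-memory Map, Read stands in for JSON, and getDBUnsafe returns a Maybe instead of failing. AgesDB, Nick and fakeStore are all hypothetical names. The class and getDB have the same shape as above.

```haskell
{-# LANGUAGE FunctionalDependencies, MultiParamTypeClasses #-}

import qualified Data.Map as M

type DB = String

newtype Avatar = Avatar String deriving (Eq, Show, Read)
newtype Nick = Nick String

-- Hypothetical stand-in for a real store: (database name, key) -> raw text.
fakeStore :: M.Map (DB, String) String
fakeStore = M.fromList
    [ (("avatars", "42"), show (Avatar "cat.png"))
    , (("ages", "alice"), show (31 :: Int))
    ]

-- Assumption: Read plays the role of JSON, and a failed lookup or parse is Nothing.
getDBUnsafe :: Read a => DB -> String -> IO (Maybe a)
getDBUnsafe db key = return $ do
    raw <- M.lookup (db, key) fakeStore
    case reads raw of
        [(v, "")] -> Just v
        _         -> Nothing

-- The DB type determines both the key type and the value type.
class Read v => DBTy a k v | a -> k, a -> v where
    getDBName :: a -> String
    getKey :: a -> k -> String

getDB :: DBTy a k v => a -> k -> IO (Maybe v)
getDB db k = getDBUnsafe (getDBName db) (getKey db k)

data AvatarsDB = AvatarsDB
instance DBTy AvatarsDB Int Avatar where
    getDBName _ = "avatars"
    getKey _ = show

data AgesDB = AgesDB
instance DBTy AgesDB Nick Int where
    getDBName _ = "ages"
    getKey _ (Nick n) = n

-- getDB AgesDB (42 :: Int) is rejected at compile time:
-- the functional dependency fixes the key type of AgesDB to Nick.
```

With this in place, getDB AvatarsDB 42 can only ever come back as a Maybe Avatar, so fetching from the wrong DB shows up as a type error rather than a mysterious parse failure.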

 

I know it's a pretty silly example and use case, but I was surprised that there were folks in my local Haskell meetup who hadn't seen it, so I thought I should share it. It's saved me no end of errors ever since I put it to use.

Now in practice, I've found it more useful to create a KeyStringTy class instead of using that getKey bit. As it so happens, the sort of stuff I typically use to index my bit bucket database is also the sort of stuff that I use in RPC calls from the web. It's probably less correct, but it sure was dandier to code. Watch it bite me in the ass someday.


Monday, June 21, 2010

Mike Rowe Celebrates Dirty Jobs

I couldn't agree with this guy more. The cultural dynamic has led to more than a few awkward conversations about how programming doesn't interest you anymore. Then there's some equally silly remark on how one could remedy the problem with -insert-deus-ex-machina-here-. Or an even more awkward defensive remark about how X is still interesting, the other guy just hasn't seen aspect A of X.

It's nearly impossible to talk about things like work ethic and the psychology of motivation with such a dynamic. People tend to fall into one of three camps: that they do work simply because it has to be done (i.e. it's a responsibility, so shut up), that they do their work because of the money it earns them, or that they'll only do those things that pull at their heartstrings. You don't get very far when a conversation usually devolves into some sort of argument over philosophies or personality traits.

Perhaps that's why Self-Determination Theory works so well: it bypasses the problem entirely. In the context of a conversation however, I wonder whether the smart thing to do is to keep quiet because there does not seem to be a right answer. I personally came from the third camp, but doing a startup quickly teaches you that passion alone can't carry you the whole way.

I wonder how many other creative industries suffer from this problem, and in what way, because I'm sure that the problem manifests itself differently depending on the field.


Wednesday, May 19, 2010

Haskell & STM, Why no Applicative?

There was a fellow on #haskell the other day who was apprehensive about learning STM. Should he learn it after learning category theory? We assured him that Haskell's STM was fairly simple, whereas category theory is a dense liberal art that you study to enrich your mind. Someone pointed him to the wiki and he went on his way.

Afterwards, I perused the wiki (again). What struck me was that there was no example there that shows you how to convert a plain old bunch of IO routines into STM routines. When I went to make one myself, I realized how brain-dead easy it is, but that there's something missing...

Anyway, here's a cooked-up example where someone has to perform multiple time-consuming tasks (preferably in parallel), with the interesting routine marked by a comment:


import Control.Applicative
import Control.Monad
import Control.Concurrent
import Control.Concurrent.STM
import Control.Concurrent.STM.TMVar
import Data.DateTime
 
main :: IO ()
main =
    do putStrLn "Without STM"
       t1s <- getCurrentTime
       stuff <- withoutSTM
       t1e <- getCurrentTime
       putStrLn $ show stuff
       putStrLn $ "That took " ++ show (diffSeconds t1e t1s) ++ " seconds"
 

-- this is the routine that actually does stuff
withoutSTM :: IO GroceryStore
withoutSTM =
    do a <- getTomatoesCountFromDB
       b <- haveFreshBerries
       c <- getNameOfCurrentStore
       return $ GroceryStore a b c
 
getTomatoesCountFromDB :: IO Int
getTomatoesCountFromDB =
    do milliSleep 1000  -- simulate slow DB read
       return 5

haveFreshBerries :: IO Bool
haveFreshBerries =
    do milliSleep 1000
       return True

getNameOfCurrentStore :: IO String
getNameOfCurrentStore =
    do milliSleep 1000
       return "Tom's Produce"

data GroceryStore
    = GroceryStore
      { numTomatoes :: Int
      , freshBerriesInStock :: Bool
      , nameOfStore :: String
      }
    deriving (Eq, Show, Ord)
 
-- helpers
milliSleep = threadDelay . (*) 1000
 
 
 
So there you have the plain old imperative code.
 
And here is the same code modified to use STM.
 
 
-- getTomatoesCountFromDB, haveFreshBerries, getNameOfCurrentStore are unchanged from before

-- this is the modified routine
withSTM :: IO GroceryStore
withSTM =
    do a <- stmFork getTomatoesCountFromDB
       b <- stmFork haveFreshBerries
       c <- stmFork getNameOfCurrentStore

       GroceryStore <$> (stmWait a)
                    <*> (stmWait b)
                    <*> (stmWait c)

main :: IO ()
main =
    do putStrLn "With STM"
       t2s <- getCurrentTime
       stuffWithSTM <- withSTM
       t2e <- getCurrentTime
       putStrLn $ show stuffWithSTM
       putStrLn $ "That took " ++ show (diffSeconds t2e t2s) ++ " seconds"
       return ()

-- more helpers
stmWait = atomically

stmFork :: IO a -> IO (STM a)
stmFork m =
    do tmv <- newEmptyTMVarIO
       let m' = m >>= (atomically . putTMVar tmv)
       forkIO m'
       return $ readTMVar tmv
 
 
 
Now the folks who've been playing with this for a while won't find this particularly remarkable, but I wouldn't have used it a year ago when I was first learning Haskell, so I thought someone should make note of it. There are also two important things to take away from this.
 
1. You could only do this if imperative routines are treated as values. The pons asinorum that Eric S. Raymond was referring to is not without its benefits.
 
2. Would an applicative instance for STM be nicer? At the least, we might be able to do code like this instead.
 
       stmWait $ GroceryStore <$> a <*> b <*> c
 

What would be wrong with the naive implementation of this?  e.g.:
 
(<*>) :: (Functor m, Monad m) => m (a -> b) -> m a -> m b
(<*>) mf ma =
    do a <- ma
       f <- mf
       return $ f a
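
For what it's worth, on current GHC (7.10 and later, where every Monad is also an Applicative) STM does have an Applicative instance, so the combined form typechecks as-is. One thing to notice about the naive version above: it runs the argument before the function, whereas the standard ap runs the function side first, so the two would disagree on the order of effects. Below is a trimmed-down sketch of the grocery example; the slowly helper is made up and stands in for the three DB routines.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM

data GroceryStore = GroceryStore Int Bool String
    deriving (Eq, Show)

-- Same helper as in the post: run an action in its own thread and
-- hand back an STM value that blocks until the result is available.
stmFork :: IO a -> IO (STM a)
stmFork m = do
    tmv <- newEmptyTMVarIO
    _ <- forkIO (m >>= atomically . putTMVar tmv)
    return (readTMVar tmv)

-- Hypothetical stand-in for a slow DB read.
slowly :: a -> IO a
slowly x = threadDelay 100000 >> return x

withSTM :: IO GroceryStore
withSTM = do
    a <- stmFork (slowly 5)
    b <- stmFork (slowly True)
    c <- stmFork (slowly "Tom's Produce")
    -- One transaction that retries until all three results are in.
    atomically (GroceryStore <$> a <*> b <*> c)
```

The difference from three separate stmWait calls is that the reads now happen in a single transaction, which retries until every TMVar has been filled.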
 
 
Again, I'm probably way out of my depth here, so I apologize profusely if this is asinine.


Wednesday, May 5, 2010

Academia isn't Broken. We Are.

I saw this piece on Hacker News this morning. It struck a chord, but something wasn't right about it.

http://brucejacob.tumblr.com/post/373498114/academia-and-the-decline-of-wealth-in-america

If academia is contributing to the lack of innovation in this country, then maybe it's because we expect the wrong things from academia? I don't mean to say this as another pompous American, but when I used to chat with friends from abroad back in school, I was struck by how many of them had a uniform educational experience. This wasn't a blanket effect and there were more than enough exceptions to produce many of the most awesome researchers I've met. Back at home though, even the countries that had effective programs to retain their top talent suffered from a lack of innovation.

By contrast, there is no standard curriculum at the top 5 US universities for CS. But most of the kids coming out of there are shills anyway. I went to one of the good schools, and many kids (and their parents) were concerned about whether what they were learning would "prepare them for the real world". That basically meant: did it teach you Java or consulting? You see what's happening? What our education system didn't do to them, their own expectations of college did, and sadly, they seem to have done it to themselves just as badly as the education system of those foreign countries did.

But education isn't about churning out stamped spoons, and that's why crap like that No Child Left Behind Act bothered me so much. Where we went wrong is that we began viewing education as something everyone is entitled to, for all the wrong reasons. Education is not factory farming. Steve Jobs took calligraphy because he thought it was interesting, and no one thought what the Google guys were doing was "practical". I'll wager that the Google guys did it because it seemed like nerdy awesomeness to transform web search into a giant linear algebra problem. Just as Academia is about exploring the boundaries of what we know, Education is about enriching one's thought process, and that's the leading source of innovation in our modern economy: good ideas from the fringe, implemented intelligently and autonomously.

The entitlement to success that seems to follow from attending college is what's broken. The expectation that you will get a cushy 9-5 job in return for that diploma is what's broken. It is, in essence, a laziness of the mind, an unwillingness to chart out one's own path, the very idea of which is quite unacademic.

Wednesday, April 28, 2010

Dirty Tricks

Here is a type I'm using to record stats.

data UserzFormationViews
    = UserzFormationViews
      { formationsViewed :: [(FormationId, UTCTime)]
      , ...
      }

Basically, the data is a monoid. Actually, it's even nicer than that, because the monoidal append on this piece of data is commutative, so the pieces can be combined in any order (i.e. it's embarrassingly parallel). It's incredibly convenient to be able to just throw a pile of these together and tabulate the stats with a familiar mconcat.

At other times, it's useful to get all the user's views, in a form like this:

class RetrieveViews a where
    getViews :: a -> [(UserId,FormationId,UTCTime)]

This obviously messes with the monoidal definition of UserzFormationViews. The identity element (i.e. mempty) would need a dummy value for the UserId. In more primitive languages, you'd just leave that as NULL, which is a dirty hack. And we don't like those kinds of dirty hacks in Haskell. What do we do instead?

instance RetrieveViews (UserId, UserzFormationViews) where
    ...

vs = getViews (zeUserzId,zeUserzViews)

So wrong and so awesome at the same time. :)
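
For anyone who wants to try this at home, here's a cut-down sketch with the elided parts filled in by assumption: the record has only the one field, Timestamp is a plain Int standing in for UTCTime, and the ids are Ints. It assumes a GHC recent enough to have Semigroup in the Prelude (8.4 or later), and it needs FlexibleInstances for the instance on the pair.

```haskell
{-# LANGUAGE FlexibleInstances #-}

type UserId = Int
type FormationId = Int
type Timestamp = Int  -- assumption: stand-in for UTCTime

-- Cut down to the one field shown in the post.
newtype UserzFormationViews = UserzFormationViews
    { formationsViewed :: [(FormationId, Timestamp)] }
    deriving (Eq, Show)

instance Semigroup UserzFormationViews where
    UserzFormationViews a <> UserzFormationViews b = UserzFormationViews (a ++ b)

-- mempty needs no dummy UserId, because the UserId lives outside the stats.
instance Monoid UserzFormationViews where
    mempty = UserzFormationViews []

class RetrieveViews a where
    getViews :: a -> [(UserId, FormationId, Timestamp)]

-- The dirty trick: the instance is on the pair, not on the stats type.
instance RetrieveViews (UserId, UserzFormationViews) where
    getViews (uid, vs) = [ (uid, f, t) | (f, t) <- formationsViewed vs ]
```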

Wednesday, April 21, 2010

A Problem I've Encountered with Types

Everyone is familiar with how we can use types to encode checks into our code. For example:

data User
    = SystemUser SystemProcessInfo
    | AnonUser TimeLoggedIn
    | RegisteredUser UserId

data UserId
    = PaidUser (..)
    | FreeUser (..)


More generally, you can use type witnesses to mix together similar types in the same list, to mimic some aspects of dynamic typing.


The Problem:


Suppose you have a type witness, such that all the witnessed types derive some typeclass (or a similar situation where you are using types to enforce behavior). Is it possible to have the compiler automatically derive that typeclass for the type witness, i.e. something similar to GeneralizedNewtypeDeriving? This kind of boilerplate is yucky:

data A
    = A1 X
    | A2 Y
    | A3 Z

instance Foo A where
    getFoo (A1 x) = getFoo x
    getFoo (A2 x) = getFoo x
    getFoo (A3 x) = getFoo x


One area where this problem seems unavoidable is when you have to deal with collated data of different types. For example, say you have a list of items that were found to be complementary in terms of content, but are of different types. In such cases, type witnesses are unavoidable. If I can get a good solution to this problem, I'll be sure to post it.
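
This isn't the deriving mechanism asked for above, but one alternative worth noting is an existential wrapper, which trades the closed sum type for an open one and needs only a single instance clause. In the sketch below X, Y, Z and the String result of getFoo are placeholders chosen for illustration, and AnyFoo is a made-up name.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

class Foo a where
    getFoo :: a -> String

-- Placeholder types standing in for X, Y and Z.
data X = X
data Y = Y
data Z = Z

instance Foo X where getFoo _ = "x"
instance Foo Y where getFoo _ = "y"
instance Foo Z where getFoo _ = "z"

-- The wrapper remembers only that its contents have a Foo instance.
data AnyFoo = forall a. Foo a => AnyFoo a

-- One clause, no matter how many types get wrapped.
instance Foo AnyFoo where
    getFoo (AnyFoo x) = getFoo x

collated :: [AnyFoo]
collated = [AnyFoo X, AnyFoo Y, AnyFoo Z]
```

The catch is that you can no longer pattern match to find out which type you have, so this only helps when the typeclass interface is all you need.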

Friday, April 9, 2010

Lazy IO with Haskell Monads

Haskell's treatment of imperative programming and side effects is possibly the most derided feature of the language. As someone who's written desktop apps, I'm certainly aware of the inflexibility that Haskell's sandboxing imposes. However, monads happen to mesh very nicely with Haskell's purity and laziness, and what they take away, they give back in other ways.

Suppose for example, you are designing an evil robot that goes out and steals wheels from people's cars. The robot can steal up to 4 wheels from a car, but because it tries to avoid being detected by witnesses (organic or otherwise), it may abort the job prematurely. You need it to fetch 20 wheels a night. No more, no less. Fewer, and you may not be able to pay the bills. More, and the police might start catching on.

For this simple example, you have a function that returns a monadic value:

stealWheelz :: Coordinates -> IO [Wheel]

Then you have some other pre-calculated list of random locations that you want the robot to steal from.

locations :: [Coordinates]

Now here's the funky part. You can define a function takeM, as such:

takeM :: (Monad m) => Int -> [m [a]] -> m [a]
takeM = takeM' []

takeM' :: (Monad m) => [a] -> Int -> [m [a]] -> m [a]
takeM' acc 0 xs = return acc
takeM' acc n [] = return acc
takeM' acc n (x:xs) =
    do rs <- x
       let rs' = take n rs
           n' = n - length rs'
       takeM' (acc ++ rs') n' xs

and then your list of tires stolen can simply be:

takeM 20 (map stealWheelz locations)

With this code, the robot will only steal as many tires as necessary, even if the list of coordinates is of infinite length. Because Haskell is lazy, and IO routines (and all imperative routines that are modelled as monads) are treated as lazy values, you can pass around tasks with all their arguments bound, even in a strict monad like IO, and not have to deal with a lot of the usual glue that goes into determining how many results to return.
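
Here's a runnable sketch of that claim, with stand-ins of my own making: wheels are just Strings, locations are Ints, and the fake stealWheelz bumps a counter so you can see how many locations actually got visited. The takeM is the same idea as above, written with a local helper.

```haskell
import Data.IORef

takeM :: Monad m => Int -> [m [a]] -> m [a]
takeM = go []
  where
    go acc n _ | n <= 0 = return acc
    go acc _ []         = return acc
    go acc n (x:xs) = do
        rs <- x
        let rs' = take n rs
        go (acc ++ rs') (n - length rs') xs

-- Hypothetical stand-in for stealWheelz: counts each visit and
-- returns between 1 and 4 wheels depending on the location.
stealWheelz :: IORef Int -> Int -> IO [String]
stealWheelz visits loc = do
    modifyIORef visits (+ 1)
    return [ "wheel " ++ show loc ++ "/" ++ show i | i <- [1 .. 1 + loc `mod` 4] ]

-- Ask for 20 wheels from an infinite list of locations and
-- report how many locations were actually visited.
demo :: IO ([String], Int)
demo = do
    visits <- newIORef 0
    wheels <- takeM 20 (map (stealWheelz visits) [1 ..])
    n <- readIORef visits
    return (wheels, n)
```

The list of locations is infinite, but only the actions that takeM actually runs ever touch the counter.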

Btw, that means you can actually pass around the value:

let stealSomeWheelz = map stealWheelz locations

and take results from that list of actions only as you need them. This sort of thing is doable, but *much* less elegant in languages that do not treat actions as if they were values. When you design a language to operate via side effects, then you have to deal with those side effects immediately, as they arise. That means you can't easily reason about it at a higher level, even when you're trying to do simple things like "take the first n results".

As an example, I can apply a filter function to those results, as such:

-- Police commissioner is in town tonight with his Escalade
let stealSomeWheelz = map stealWheelz locations
-- fmap twice: once over the list of actions, once into each action's result
fmap (filter noCaddyWheelzTonight) <$> stealSomeWheelz

I discovered this trick when I needed to sanitize the first n queries for a particular area of interest, given a statically-ranked list of sources. In a more general setting, removing the glue from imperative programming means that I can focus more on what I'm doing, rather than what I'm programming. That's a good thing.

I'd love to see more gems like this. I'm sure someone else has discovered this already, it's in a library somewhere, and I'm just yakking the obvious. I have a feeling that the functional community (and the rest of the world by proxy) is only beginning to discover the value of treating imperative routines as if they were monadic values. Indeed, at the time of this writing, there's a video on HN concerning monads in Clojure.

Wednesday, March 31, 2010

Reviving Post-Modern Journalism

Or, why newspapers aren't making money anymore.

1997 was a different time. I remember seeing all the adults on the subway with their newspapers in hand. Back then, in New York, the newspaper you read said something about you. The regular Joes read the Daily News or the Post, depending on their political slant. These were entertaining tabloids that slightly enhanced the truth, and you could tell just by reading the headlines. The folks who considered themselves somewhat more educated would have in hand a copy of the New York Times or the Wall Street Journal. It was harder to detect bias in those papers, but you could certainly discern it if you knew the raw facts and the spectrum of viewpoints.

Just as importantly, you could tell immediately which newspaper was which, just by looking at them. In the case of the plebeian pubs, the headlines were typed with screaming 2-inch high font. The patrician pubs used starkly different and recognizable typefaces for their titles.

Today, most newspapers in America are going out of business, and one thing that's perhaps been overlooked was the status conferred by reading a newspaper. One of the oldest adages in marketing states that the magnitude of status or identity conferred by a product determines how many brands you'll have of it. There are two major brands of toothpaste, as far as I can tell. People don't take too much pride in their toothpaste. You can't tell if someone brushed with Crest or Colgate, only whether they did at all. On the other hand, there are dozens of different car brands. Cars are expensive. They signal immediately one's level of wealth and some brands alert others to the owner's hobbies, or the size of the owner's beer gut or phallic insecurities.


No one really read the news just for the sake of knowing.

So how does this relate to newspapers?… It's quite simple: What if they don't confer status or identity anymore?

There's more than one take on this. My middle school math team teacher, a fairly well educated and openly intellectual person, once lamented how there were more newspapers and viewpoints in his day. He was born just before America suburbanized in the 50s, 60s and 70s. Sprawl was the first thing that happened. People may only see the newspaper on your lawn for a second in the morning, if at all. Next came computers, phones and tablets.

The problem with the modern world is that it's just not possible for others to see what you are reading anymore. What are they going to do? Squint at your laptop to see what you're reading? Forget about trying to surreptitiously read off an iPhone. You could argue that it can be done, but what's important is that people don't expect those around them to do so. Most damningly, most news in a paper is from the AP or Reuters anyway. It's just a piece of text and it really doesn't matter then where you get it from.

The other thing that happened is that the newspaper has stopped being a marker for appearing intelligent and well-informed. Old folks read newspapers. The well-informed folks get their news off the wire, on their smartphone, 30 seconds after Reuters or the AP pushed it out. What's post-modern journalism to do in the face of this?


Journalism needs to be "interesting" again, and other silly proposals.

I think that whatever journalism survives this new age will have to branch out again. It needs to once again appeal to people's inclinations and identity. Perhaps that's why blogs are so popular these days. They provide "analysis" alongside content. There are some news sources that seem to understand this. Stratfor, for example, provides a well-reasoned realpolitik attitude towards world news, and combines it with a solid understanding of local history and politics. It should be emphasized that unless you're aiming to be the 800lb gorilla like Fox News, your core readers should be willing to pay for your news. Everyone's talked about how the Internet has allowed companies to exploit niche markets. Perhaps it actually requires it, in our new world of commoditized information. Specialize. That's only part of the puzzle though.

The next part of the puzzle may be difficult for old media to stomach. They need to let subscribers share their material. If buying a newspaper was a way to confer status, then they can revive the dynamic in the modern world by letting subscribers mail a limited number of pieces to an unlimited number of their friends. Chances are, these friends aren't going to read the piece at all. What matters though is that subscribers will pay for a news service that tells others about how they think the world should be seen.

Sharing would also be a way for a subscription service to spread virally. People should also be able to post links to these articles on some social network, where they would be able to give props for them. People get a buzz out of seeing that others agree with their viewpoint, or found an interest in something that they read off the wire.

I actually think there's room on the web for a news service tailored exactly for this purpose (or maybe even more), especially if such a service were to aggregate based on people's interests and viewpoints. Implementing such a thing would of course be no trivial matter, but seeing how the incumbents seem to have dropped the ball, the time seems right. It could be monetized as well. The service could form a symbiosis with the news media by helping them sell subscriptions.