Featured in Haskell Weekly issue 283
Sometimes, Haskell’s type system seems a bit restrictive compared to dynamic languages like Python. The most obvious example is the heterogenous list:
>>> # Python
>>> mylist = ["hello", "world", 117, None]
>>>
>>> for item in mylist:
print(item)
...
hello
world117
None
but in Haskell, list items must be of the same type:
-- Haskell
= ["hello", "world", 117, ()] -- Invalid: type cannot be inferred! mylist
This is a contrived example, of course. But consider this use-case: I just want to print the content of the list. It’s unfortunate I can’t write:
mylist :: Show a => [a]
= ["hello", "world", 117, ()] -- All these types have Show instances, but this won't compile mylist
For this specific application, the type system is overly restrictive – as long as all I want to do is print the content of my list! In this post, I’ll show you how to do something like this using the ExistentialQuantification
language extension.
A more complex example
Let’s say I want to list American football players. There are two broad classes of players (offensive and defensive) and we want to keep track of the players in a list – the player registry. Our final objective is to print the list of players to standard output.
Let’s try to do the same in Haskell. Our first reflex might be to use a sum type:
data Player = OffensivePlayer String String -- name and position
| DefensivePlayer String String -- name and position
playerRegistry :: [Player]
= ... playerRegistry
However, not all sports stats apply to OffensivePlayer
and DefensivePlayer
constructors. For example:
passingAccuracy :: Player -> IO Double
OffensivePlayer name pos) = lookupFromDatabase "passingAccuracy" name
passingAccuracy (DefensivePlayer name pos) = return 0 -- Defensive players don't pass
passingAccuracy (
tacklesPerGame :: Player -> IO Double
OffensivePlayer name pos) = return 0 -- Offensive players don't tackle
tacklesPerGame (DefensivePlayer name pos) = lookupFromDatabase "tacklesPerGame" name tacklesPerGame (
The Player
type is too general; we’re not using the type system to its full potential. It’s much more representative of our situation to use two separate types:
data OffensivePlayer = OffensivePlayer String String
data DefensivePlayer = DefensivePlayer String String
passingAccuracy :: OffensivePlayer -> IO Double
= ...
passingAccuracy
tacklesPerGame :: DefensivePlayer -> IO Double
= ... tacklesPerGame
This is much safer and appropriate. Now let’s give ourselves the ability to print players:
instance Show OffensivePlayer where
show (OffensivePlayer name pos) = mconcat ["< ", name, " : ", pos, " >"]
instance Show DefensivePlayer where
show (DefensivePlayer name pos) = mconcat ["< ", name, " : ", pos, " >"]
Awesome. One last problem:
-- This won't typecheck
= [ OffensivePlayer "Tom Brady" "Quarterback"
playerRegistry DefensivePlayer "Michael Strahan" "Defensive end"
,
]
printPlayerList :: IO ()
= forM_ playerRegistry print -- `forM_` from Control.Monad printPlayerList
Rather annoying. We could wrap the two player types in a sum type:
data Player = OP OffensivePlayer
| DP DefensivePlayer
instance Show Player where
show (OP p) = show p
show (DP p) = show p
playerRegistry :: [Player]
= [ OP (OffensivePlayer "Tom Brady" "Quarterback")
playerRegistry DP (DefensivePlayer "Michael Strahan" "Defensive end")
,
]
printPlayerList :: IO ()
= forM_ playerRegistry print printPlayerList
but this is quite clunky. It also doesn’t scale well to cases where we have a lot more types!
Enter existential quantification
The latest version of the Haskell language (Haskell 2010) is somewhat dated at this point. However, the Glasgow Haskell Compiler supports language extensions at the cost of portability. Turns out that the ExistentialQuantification
language extension can help us with this problem.
We turn on the extension at the top of our module:
{-# LANGUAGE ExistentialQuantification #-}
and create an existential datatype:
data ShowPlayer = forall a. Show a
=> ShowPlayer a
The datatype ShowPlayer
is a real datatype that bundles any data a
which can be shown. Note that everything else about the internal type is forgotten, since the ShowPlayer
type wraps any type that can be shown (that’s what forall a. Show a
means).
We can facilitate the construction of a Player
with the following helper function:
mkPlayer :: Show a => a -> ShowPlayer
= ShowPlayer a show mkPlayer a
Now since the data bundled in a ShowPlayer
can be shown, the only operation supported by ShowPlayer
is Show
:
instance Show ShowPlayer where
show (ShowPlayer a) = show a
Finally, our heterogenous list:
playerRegistry :: [ShowPlayer]
= [ -- ✓ OffensivePlayer has a Show instance ✓
playerRegistry ShowPlayer (OffensivePlayer "Tom Brady" "Quarterback"))
-- ✓ DefensivePlayer has a Show instance ✓
ShowPlayer (DefensivePlayer "Michael Strahan" "Defensive end"))
,
]
printPlayerList :: IO ()
= forM_ playerRegistry print printPlayerList
So we can have an heterogenous list – as long as the only thing we can do with it is show it!
The advantage here compared to the sum-type approach is when we extend our code to many more types:
data Quarterback = Quarterback String deriving Show
data Lineman = Lineman String deriving Show
data Runningback = Runningback String deriving Show
data WideReceiver = WideReceiver String deriving Show
data DefensiveEnd = DefensiveEnd String deriving Show
data Linebacker = Linebacker String deriving Show
data Safety = Safety String deriving Show
data Corner = Corner String deriving Show
-- Example: some functions are specific to certain positions
passingAccuracy :: Quarterback -> IO Double
= ...
assingAccuracy
playerRegistry :: [ShowPlayer]
= [ mkPlayer (Quarterback "Tom Brady"))
playerRegistry DefensiveEnd "Michael Strahan"))
, mkPlayer (Safety "Richard Sherman"))
, mkPlayer (...
,
]
printPlayerList :: IO ()
= forM_ playerRegistry print printPlayerList
This way, we can keep the benefits of the type system when we want it, but also give ourselves some flexibility when we need it. This is actually similar to object-oriented programming, where classes bundle data and operations on them into an object!
A bit more functionality
Let’s pack in more operations on our heterogenous list. We might want to not only show players, but also access their salaries. We describe the functionality common to all players in a typeclass called BasePlayer
:
class Show p => BasePlayer p where
-- Operate in IO because of database access, for example
getYearlySalary :: p -> IO Double
instance BasePlayer Quarterback where
...
instance BasePlayer Lineman where
...
We can update our player registry to support the same operations as BasePlayer
through the Player
existential type:
data Player = forall a. BasePlayer a
=> Player a
instance Show Player where
show (Player a) = show a
instance BasePlayer Player where
Player a) = getYearlySalary a getYearlySalary (
and our new heterogenous list now supports:
playerRegistry :: [Player]
= [ Player (Quarterback "Tom Brady")
playerRegistry Player (DefensiveEnd "Michael Strahan")
, Player (Safety "Richard Sherman")
, ...
,
]
printPlayerList :: IO ()
= forM_ playerRegistry print -- unchanged
printPlayerList
average_salary :: IO Double
= do
average_salary <- for playerRegistry getYearlySalary -- (`for` from Data.Traversable)
salaries return $ (sum salaries) / (length salaries)
So we can have a heterogenous list – but we can only perform operations which are supported by the Player
type. In this sense, the Player
type encodes our intent.
Conclusion
In this post, we’ve seen how to create heterogenous lists in Haskell. However, contrary to dynamic languages, we can only do so provided we are explicit about our intent. That means we get the safety of strong, static types with some added flexibility if we so choose.
If you’re interested in type-level programming, including but not limited to the content of this present post, I strongly recommend Rebecca Skinner’s An Introduction to Type Level Programming
Thanks to Brandon Chinn for some explanation on how to simplify existential types.