A Month of Haskell, Day 5 - Applicative, Alternative, and Functor
And now I’m two days behind. If I can think of something sufficiently quick, I’ll have a bonus day where I do two posts. For now, you’re just going to have to enjoy this one single post about three complementary type classes: Applicative, Alternative, and Functor. The first two are in the Control.Applicative module, while the last is in the Data.Functor module. I’m going to ignore the jargon-heavy definitions of these type classes and skip right to showing how they can be used.
Both Control.Applicative and Data.Functor are part of the base system, so there’s nothing you need to install.
Prior to version 4.8.0 of the base system, a lot of the functions in Applicative were not re-exported as part of the prelude. To use them, you had to import them. If you care about supporting both older and newer versions of GHC, you can use the CPP language extension. I’m going to cover that in greater depth in a future post, but here’s the gist of it:
#if !MIN_VERSION_base(4,8,0)
import Control.Applicative((<$>))
#endif
As you can see, it looks a lot like CPP in any other language.
I think it makes more sense to talk about this module in terms of what you can do with it, rather than just running through definitions. The useful functions will come up as we go along.
Functors
We’re going to use the <$> function quite a bit, which is part of the functor module but
is re-exported by Applicative. These two type classes are all tangled up with each other,
so it’s worth covering functors briefly first.
Type classes:
class Functor f where
    fmap :: (a -> b) -> f a -> f b
    (<$) :: a -> f b -> f a
class Functor f => Applicative f where
    pure :: a -> f a
    (<*>) :: f (a -> b) -> f a -> f b
    (*>) :: f a -> f b -> f b
    (<*) :: f a -> f b -> f a
As with a lot of things in Haskell, “functor” is a fancy word for a really simple concept.
It’s just a type class that provides the fmap function. It includes some other things,
but they’re less important. A functor is anything that can be mapped over, and the
fmap function is how you do that.
Type signatures:
fmap :: (a -> b) -> f a -> f b
A list is the most basic and most obvious example of a functor:
ghci> fmap (+1) [1, 2, 3]
[2,3,4]
Plenty of other types can be mapped over, but in a more abstract way. I prefer to think of a functor as a type you can reach inside of and apply a function to. So, Maybe is a functor:
ghci> fmap (* 2) (Just 4)
Just 8
ghci> fmap (* 2) Nothing
Nothing
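To make "a type you can reach inside of" a bit more concrete, here is a minimal sketch of a Functor instance for a made-up type. Box, bumped, and lengths are hypothetical names invented for this illustration; none of them come from a library.

```haskell
-- Box is a hypothetical container invented for this sketch.
data Box a = Empty | Full a
    deriving (Eq, Show)

-- fmap applies the function to the contents, if there are any.
instance Functor Box where
    fmap _ Empty    = Empty
    fmap f (Full x) = Full (f x)

-- The same pure function style works on a Box ...
bumped :: Box Int
bumped = (+ 1) <$> Full 41

-- ... and on a list, with no changes to the function being applied.
lengths :: [Int]
lengths = length <$> ["functor", "map"]
```

The instance is the only part that knows about the shape of Box. The functions being mapped stay completely ordinary.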
IO is a functor, too:
ghci> :m +System.Directory Data.Char
System.Directory Data.Char> fmap (map toUpper) getHomeDirectory
"/HOME/CLUMENS"
An awful lot of things are functors. And anywhere you use fmap, you can use <$>
instead. I suggest only doing so when an operator would make code look more natural.
That last example could be written like this instead:
ghci> :m +System.Directory Data.Char
System.Directory Data.Char> map toUpper <$> getHomeDirectory
"/HOME/CLUMENS"
Eliminating intermediate variables
Here’s one way you could get the time and add a colon between the hour and minutes:
import Data.Time.Clock(UTCTime(..), getCurrentTime)
import Data.Time.Format(defaultTimeLocale, formatTime)
format :: String -> UTCTime -> String
format fmt time = formatTime defaultTimeLocale fmt time
theTime :: IO String
theTime = do
    time <- getCurrentTime
    return $ format "%R" time
addColon :: String -> String
addColon [h1, h2, m1, m2] = [h1, h2, ':', m1, m2]
addColon s = s
main :: IO ()
main = do
    time <- theTime
    let time' = addColon time
    putStrLn time'
Aside from the fact that time-related code is awful everywhere, there’s nothing really
tricky here. addColon is pretty fragile - what if you change the format string and
return a time with seconds? In fact, %R already expands to %H:%M, so the string arrives
with its colon in place and falls through to the second clause of addColon untouched; a
format string of "%H%M" would exercise the first clause. Either way, running it
gives you an answer you might expect:
$ runhaskell time.hs
20:39
There’s something I really hate about this code, though, and it happens in two spots.
The theTime function has this intermediate time variable that we only need because
of the IO monad. If we were in an imperative language we wouldn’t need it - we could
just pass the result of getCurrentTime right into the formatting function. If we
try that in Haskell, however, we get:
time.hs:9:26: error:
• Couldn't match expected type ‘UTCTime’
with actual type ‘IO UTCTime’
• In the second argument of ‘format’, namely ‘getCurrentTime’
In the second argument of ‘($)’, namely
‘format "%R" getCurrentTime’
In the expression: return $ format "%R" getCurrentTime
The same thing happens in the main function, too. To me, these intermediate steps obscure what is actually happening in the code. They make things needlessly wordy and make it seem like there’s some deficiency with functional programming.
How could we get rid of those? We use the <$> function. Instead of thinking about
all that functor nonsense, I prefer to think about this function as being very
similar to $, but for different types. Squint hard enough and those brackets go
away and that’s what it is.
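One way to see the resemblance is to put the two operators side by side. This is only a sketch: shout is a hypothetical helper, and Just "hello" stands in for any value wrapped in a functor.

```haskell
import Data.Char (toUpper)

-- shout is a hypothetical pure function used only for this comparison.
shout :: String -> String
shout = map toUpper

-- The two types line up once the f wrapper is ignored:
-- ($)   ::              (a -> b) ->   a ->   b
-- (<$>) :: Functor f => (a -> b) -> f a -> f b

-- Plain application to a plain value.
plain :: String
plain = shout $ "hello"

-- The same function applied to a value inside a functor.
wrapped :: Maybe String
wrapped = shout <$> Just "hello"
```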
We can eliminate the intermediate variable in theTime in two different ways, depending
on whether you like functions or operators more:
theTime2 :: IO String
theTime2 = fmap (format "%R") getCurrentTime
theTime3 :: IO String
theTime3 = format "%R" <$> getCurrentTime
Both result in the exact same time string, and both eliminate the intermediate
variable. Of the two, I think I prefer the operator version this time. However,
there are plenty of places for fmap. Then there’s the main function. That can
be shortened up like so:
main :: IO ()
main = do
    time <- addColon <$> theTime
    putStrLn time
Here, addColon is a function that doesn’t have anything to do with the IO monad
or any other monad at all. Its type is String -> String, but it somehow just
works in this case. That’s due to the type of <$> and the fact that IO is a
functor. We are reaching inside of the result of theTime and running addColon
on what’s inside. Maybe using fmap would be more obvious here, but I like the
pipeline style.
In general, anywhere you do something of this form:
v <- someAction
return $ someFunction v
You should think about using fmap and <$> to shorten things up. hlint will
remind you, if you forget. At the least, this can save you from having to think
up a lot of crazy temporary variable names just to throw them away on the next
line.
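As a small sketch of that rewrite, here are two versions of the same action. lineCountLong and lineCountShort are hypothetical names, and the action is passed in as a parameter so the example does not depend on any particular file.

```haskell
-- Long form: bind the result to a name, then apply a pure function to it.
lineCountLong :: IO String -> IO Int
lineCountLong readContents = do
    contents <- readContents
    return $ length (lines contents)

-- Short form: map the pure function over the action instead.
lineCountShort :: IO String -> IO Int
lineCountShort readContents = length . lines <$> readContents
```

Calling either one with something like readFile "notes.txt" (a made-up path) would give the same count. The short form simply never names the intermediate string.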
Here’s a variation on that theme. Consider this code:
dlg <- new Dialog []
box <- dialogGetContentArea dlg
set box [ #spacing := 12 ]
let s = T.concat ["<b>Duplicate QSO detected</b>\n\n",
                  "A QSO made with ", asText qCall, " at ", T.pack . colonifyTime $ qTime, " ",
                  T.pack . dashifyDate $ qDate, " on ", showt qFreq,
                  " is a potential duplicate."]
lbl <- new Label [ #label := s, #useMarkup := True ]
containerAdd box lbl
void $ dialogAddButton dlg "Cancel" 1
void $ dialogAddButton dlg "Log it" 0
widgetShowAll dlg
ret <- dialogRun dlg
widgetHide dlg
return ret
This code uses a lot of advanced GTK stuff with haskell-gi, which I plan on going into in another post. However, it should be fairly easy to follow because it looks a lot like GTK code in C (or any other language, really). I’m leaving off the giant block of imports.
For this post, the most important thing is the last four lines. dialogRun blocks the
screen while the dialog is displayed and returns the value associated with whatever button
is pressed. This is pretty obnoxious code, though. We are using another temporary variable
here, but it doesn’t fit the previous pattern because of having to run widgetHide in
the middle.
Luckily, digging into the Applicative docs shows a function that takes care of this problem.
The <* function (whose type signature was shown in the Applicative type class definition
above) runs what’s on its left side, runs what’s on its right side, and returns the value
from the left side. Any value from the right side is discarded. Rewritten, the last four
lines look like this:
widgetShowAll dlg
dialogRun dlg <* widgetHide dlg
There’s also a *> function that is similar to <* but drops the value of the left side
and returns the value of the right side. I’ve not had much reason to use it, but it’s
there if you need it.
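Since the GTK code is hard to run in isolation, here is a sketch of the same two operators using Maybe and a small IO action. The IORef-based log is an assumption made purely so the order of effects can be observed; step and runBoth are hypothetical names.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- With Maybe, both sides must be Just; only the choice of result differs.
keepLeft :: Maybe Int
keepLeft = Just 1 <* Just 2

keepRight :: Maybe Int
keepRight = Just 1 *> Just 2

-- step appends its name to the log and returns the given value.
step :: IORef [String] -> String -> Int -> IO Int
step logRef name val = modifyIORef logRef (++ [name]) >> return val

-- Both actions run, left to right, but only the left-hand result is kept.
runBoth :: IO (Int, [String])
runBoth = do
    logRef <- newIORef []
    result <- step logRef "run" 0 <* step logRef "hide" 99
    order  <- readIORef logRef
    return (result, order)
```

Here "run" and "hide" play the roles of dialogRun and widgetHide: the second action still happens, but its value is thrown away.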
Alternatives
Sometimes, you want to try a couple different actions and take whichever one succeeds. That’s
where the Alternative type class comes in. It’s an Applicative that adds two more basic
functions, empty and the operator-like <|>. As you can see from the instances,
not everything that is an Applicative is an Alternative, but at least lists, Maybe, and IO
are. That covers a lot.
Type classes:
class Applicative f => Alternative f where
    empty :: f a
    (<|>) :: f a -> f a -> f a
    some :: f a -> f [a]
    many :: f a -> f [a]
To see why Alternative is useful, look at this contrived example:
let x = Nothing
let y = Just 2
if isJust x then x
else
    if isJust y then y
    else Just 3
Running this mess will give you Just 2. But imagine if there were a third or fourth or a whole
list of possibilities to check. That would be a lot of stairsteps, and any time you see stairsteps
you should start feeling like there’s a better way to do it.
That is exactly the point of <|>. The following would also return Just 2:
let x = Nothing
let y = Just 2
x <|> y <|> Just 3
Because of Maybe’s Alternative definition, the first thing in the chain that returns a Just value will be the final result. Nothing later in the chain is evaluated. This can be very handy in real world applications:
import Control.Applicative(empty, optional, (<|>))
import System.Directory(doesFileExist)
findConfigFile :: FilePath -> IO FilePath
findConfigFile fp = do
    ret <- doesFileExist fp
    -- empty throws an IOError, which is what lets <|> move on to the next candidate
    if ret then return fp else empty
main :: IO ()
main = do
    cfg <- optional $ findConfigFile "/home/clumens/.foorc" <|>
                      findConfigFile "/usr/local/etc/foorc" <|>
                      findConfigFile "/etc/foorc"
    print cfg
If none of these files exist, you’ll get Nothing. Otherwise, the first one that exists
will be the value of cfg, wrapped in Just. Note that the <|> here belongs to IO rather than
Maybe: it only moves on to the next action when the previous one throws an IOError. That’s
why my wrapper around doesFileExist fails with empty when the file is missing instead of
returning Nothing. A returned Nothing would count as success, and the chain would stop at
the first file. The optional function then turns a failure of the whole chain into Nothing.
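When the list of candidates grows, asum from Data.Foldable folds a whole list together with <|>. The following is a sketch using Maybe and lookup rather than the filesystem, so it can be run anywhere. The settings tables and the lookupSetting names are made up for illustration.

```haskell
import Control.Applicative ((<|>))
import Data.Foldable (asum)

type Settings = [(String, String)]

-- Hypothetical settings sources, from most specific to least specific.
userSettings, siteSettings, defaults :: Settings
userSettings = [("editor", "vim")]
siteSettings = [("editor", "nano"), ("pager", "less")]
defaults     = [("pager", "more"), ("shell", "sh")]

-- Chained by hand with <|>: the first Just wins.
lookupSetting :: String -> Maybe String
lookupSetting key =
    lookup key userSettings <|> lookup key siteSettings <|> lookup key defaults

-- The same search written with asum over a list of sources.
lookupSetting' :: String -> Maybe String
lookupSetting' key =
    asum [lookup key src | src <- [userSettings, siteSettings, defaults]]
```

Adding a fourth source to the asum version means adding one element to the list, with no new stairsteps.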
Applicative Records
There are plenty more things you can do with applicatives, but here’s just one more example for now. Let’s say we want to make a record that contains several things involving the IO monad:
import System.Directory(XdgDirectory(..), getHomeDirectory, getXdgDirectory)
import System.Posix.User(GroupEntry, getAllGroupEntries, getLoginName)
data User = User { homeDir :: FilePath,
                   xdgDir :: FilePath,
                   loginName :: String,
                   userGroups :: [GroupEntry],
                   ident :: Int }
main :: IO ()
main = do
    homeDir <- getHomeDirectory
    xdgDir <- getXdgDirectory XdgData ""
    loginName <- getLoginName
    userGroups <- getAllGroupEntries
    let u = User { homeDir=homeDir,
                   xdgDir=xdgDir,
                   loginName=loginName,
                   userGroups=userGroups,
                   ident=0 }
    return ()
We’re back to temporary variables here, and it’s still annoying. Because the IO
monad is involved again, we have to make a temporary variable for each element of the
record to run the action, just to put it into the record and throw the variable name
away. You might be thinking there’s a better way to do it, and you’re right. With
the <*> function from Applicative, we can condense it like this:
import System.Directory(XdgDirectory(..), getHomeDirectory, getXdgDirectory)
import System.Posix.User(GroupEntry, getAllGroupEntries, getLoginName)
data User = User { homeDir :: FilePath,
                   xdgDir :: FilePath,
                   loginName :: String,
                   userGroups :: [GroupEntry],
                   ident :: Int }
main :: IO ()
main = do
    u <- User <$> getHomeDirectory
              <*> getXdgDirectory XdgData ""
              <*> getLoginName
              <*> getAllGroupEntries
              <*> return 0
    return ()
Use <$> for the first element and <*> for all subsequent elements. If you are
interested in the details, the documentation is somewhat enlightening. Also, because
<*> expects every argument to be wrapped in the same Applicative, and the first four
fields come from IO actions, the last one needs to be wrapped as well. That’s why you
have to use return (or, equivalently, pure) to get the integer value into IO.
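The same shape works for any Applicative, not only IO. As a sketch, here is a record built from Maybe lookups. The Server type, its fields, and the config table are hypothetical. If any lookup fails, the whole result is Nothing.

```haskell
import Text.Read (readMaybe)

-- Server and its fields are made up for this sketch.
data Server = Server { host    :: String
                     , port    :: Int
                     , retries :: Int }
    deriving (Eq, Show)

type Config = [(String, String)]

-- Each field comes from a Maybe; pure lifts the constant into Maybe,
-- just as return lifted 0 into IO above.
mkServer :: Config -> Maybe Server
mkServer cfg = Server <$> lookup "host" cfg
                      <*> (lookup "port" cfg >>= readMaybe)
                      <*> pure 3
```

With a config of [("host", "example.org"), ("port", "8080")] this yields a Just value. Drop the port entry, or make it non-numeric, and the result is Nothing without any explicit error handling.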
So that’s about it for now. Functors come up all over the place. Applicatives come up in surprising places too. In the future, I hope to cover writing parsers in an applicative style.