# An Introduction to Monads for C++ Programmers

*2015/09/11*

This website would not be complete without a monad tutorial, so I will use this occasion to write one. This tutorial will be using C++ for a spin, since others have already written tutorials for other programming languages. I find that reasoning about monads in C++ is relatively hard, so this may not be the best way to learn about the concept for the first time; it is more of an exercise to see what monads in C++ could look like. Where appropriate, I will use Haskell-style pseudo-code for clarification.

As a little disclaimer, please bare in mind that what follows may not be completely accurate or complete. For example, at some point I will provide the definitions of return and bind for the Maybe monad and say that the monad definition is complete, without a single mention of functors or applicatives. The goal here is to keep the discussion as simple and focused as possible. For further insights into these concepts, you can find plenty of other resources online.

## Why monads?

Explaining monads with no prior motivation as to why we need them would be an exercise in futility. In a purely functional programming language like Haskell, functions are said to be pure. A pure function is a function in the mathematical sense: it maps an input to an output, *and does nothing else*. Pure functions are *referentially transparent*: replacing a function with its expression leaves the program's semantics unaltered. Given the same input, a pure function always yields the same output, no matter how many times the function is applied to that input. In addition, pure functions have no side effects: they do not write to the console, talk over the network, or read/write any kind of global, mutable state. Consequently, a compiler has the freedom to evaluate independent computations in whatever order it feels like.

In many situations, however, we need order: read X, then evaluate f(X), and only then write the result to the console. Bringing order to chaos and performing side effects in a specific context is exactly what monads allow us to do.

## What is a Monad?

Intuitively, a monad can be defined as a computation mechanism that allows us to sequence actions in a context. The key words here are *sequence* and *context*.

### Sequence

Monads help us solve the problem of running computations in a specific order. That is exactly what we mean by sequence: do x, then y, then z. As a C++ programmer, you should feel at home with this concept:

```
int x = 2+3;
int y = f(x);
std::cout << "y is " << y;
```

Here we have a sequence of statements. We tell the computer to compute `2+3`

and assign the result to `x`

, then assign `f(x)`

to `y`

, and then show the result on the console, all in that particular order. The concept of sequence in monads is exactly the same.

But why do we need anything special to run statements in a sequence?. After all, a C++ program *is*, essentially, a sequence of statements. Remember where we are coming from: a world in which functions are so well behaved (functions are pure) that the underlying machine has much freedom as to the order in which to evaluate them in. Monads allow us to determine that order.

When describing monads, we often use the word *monadic action* to refer to an action, computation, step or statement in a sequence of statements. We can use all of these terms interchangeably to refer to each of the statements in a sequence like the above code snippet. A monadic action is to monads what a function is to regular computations: the basic building block with which we build more complex computations.

### Context

The context is what allows side effects to take place, solving our second problem. Let's see what this means exactly by moving on to the definition of our first monad.

## The Maybe Monad

The `Maybe`

monad allows us to sequence actions in the context of failure. That is, an action in the `Maybe`

monad can either yield a value or result in failure. This allows us to express computations that may go wrong and fail, such as taking the head of an empty list.

In C++, we could start by defining the following `Maybe`

data type:

```
template <typename a>
struct Maybe
{
std::shared_ptr<a> value;
// Evaluate the monadic action and yield its value
const a& operator() () const
{
return *value;
}
};
```

The `Maybe`

struct is a little container that may or may not contain a value, depending on the state of its internal pointer. However, to fully understand monads, we should not think of `Maybe`

just as a container, but rather as a special kind of computational entity, as hinted by the addition of a function operator. The function operator returns the value being pointed to, assuming that it exists. This gives `Maybe`

objects a new spin: they can be treated as something we can call or evaluate to yield a value. This is what we mean by "computational entity", or "function object" in plain C++ terms. In addition, `Maybe`

objects will have special denotations. A `Maybe<a>`

object containing a null pointer will denote a failed computation, whereas a `Maybe<a>`

object with an initialised pointer will denote a successful computation that yields a value of type `a`

. These are denotations that *we* give to `Maybe`

; they don't mean anything to a compiler. Think of `Maybe`

, then, as a special kind of function object or computational entity that may potentially yield a value depending on the state of its internal pointer.

At this point, we can introduce some new vocabulary. As discussed, a `Maybe`

value can be seen as a special kind of computational entity, and we just happen have a word for it. We say that a `Maybe`

object is a *monadic value* or *monadic action*. It is a value because it is really just another object in memory that we can store and pass around, like an `int`

or an `std::vector`

. It is an action because the `Maybe`

object is also a function object, so we can look at it as something we can evaluate by calling its function operator to yield a value.

To construct `Maybe`

values, we define two functions, `Nothing`

and `Just`

, which construct failed and successful computations, respectively:

```
// Construct a Nothing value, denoting failure
template <typename a>
Maybe<a> Nothing ()
{
return Maybe<a>(); // null pointer, empty value / failed computation
}
// Construct a Just value, denoting a successful computation
template <typename a>
Maybe<a> Just (const a& x)
{
Maybe<a> maybe;
maybe.value.reset(new a(x));
return maybe;
}
```

The function names `Nothing`

and `Just`

help us with readability. The `Nothing`

function constructs *empty* `Maybe`

values, which stand for failed computations, hence the word `Nothing`

. The `Just`

function, on the other hand, constructs `Maybe`

values that contain an internal value. These non-empty `Maybe`

values represent successful computations that *just* yield the value we pass to the `Just`

function, hence the name `Just`

. The functions `Nothing`

and `Just`

therefore construct failed and successful `Maybe`

computations, respectively.

I defined the `Nothing`

and `Just`

functions instead of adding two constructors to the `Maybe`

struct because we cannot give constructors a name, and I really wanted these constructor functions to have a proper name. The reason is that we can now perform an exercise that is very common in functional programming: we will refer to values constructed with the `Nothing`

function as * Nothing values*, and to values constructed with the

`Just`

function as *. Thus, the distinction between the functions and the values they yield becomes very blurry; this is functional programming.*

`Just`

valuesFrom this point on, when we talk about `Nothing`

or `Just`

, it will not matter whether we are referring to the functions or to values constructed with those functions. The expression `Just(3)`

can be regarded as a function call to the `Just`

function with value `3`

, or as a `Maybe`

value that yields a `3`

(the `Just`

function returns a `Maybe`

value). Similarly, the expression `Nothing()`

can be viewed as a function call to the `Nothing`

function, or as a `Maybe`

value denoting a failed computation. Function and value are equal.

Let us wrap up by exercising our new vocabulary. A `Nothing`

value, named after the `Nothing`

function, represents a failed computation. A `Just`

value, named after the `Just`

function, represents a successful computation. We say that a `Maybe<a>`

value represents a computation that can either yield `Nothing`

or `Just<a>`

value, hence the words `Maybe`

, `Just`

and `Nothing`

in the definitions above. A `Maybe`

value can also be called a monadic value or monadic action, depending on how we want to think about it.

Are you still thinking of `Maybe`

as a simple container at this point, or is it starting to look a bit more magical? In a few moments we will see how `Nothing`

and `Just`

work as failed and successful computations.

Having defined the `Maybe`

struct together with the `Nothing`

and `Just`

functions, we can now create some example monadic values or actions:

```
Maybe<int> x = Just(3); // A successful computation that yields value 3
Maybe<int> y = Nothing(); // A failed computation
```

So far we have defined what the `Maybe`

monad looks like, but we have not defined how it behaves. This behaviour is what will shed light on the real meaning of `Nothing`

and `Just`

. For a monad definition to be complete, we must define two basic operations.

### return

The `return`

function takes a value of type `a`

and yields a monadic value `Maybe<a>`

:

```
template <typename a>
Maybe<a> ret (const a& x) // I am calling the function `ret` because
{ // `return` is a special keyword in C++.
return Just<a>(x);
}
```

If we look at `Maybe`

as a container, we can say that `return`

takes a simple value and puts it in a `Maybe`

. This is like taking a value and using `push_back`

to add it to an `std::vector`

. But we should not think of `Maybe`

as just another container.

If we look at `Maybe`

as a special kind of computational entity, then the `return`

function takes a simple value and wraps it in the `Maybe`

monad. The `return`

function answers the question - *"What is the simplest computation that this monad can express?"* - In the particular case of the `Maybe`

monad, the simplest computation is one that takes a value `a`

and yields `Just<a>`

value.

Another way of seeing the `return`

function is as a function that takes a simple value and puts it in the context of the `Maybe`

monad. Remember when we said that monads allow us to sequence actions in a context? The `return`

function allows us to take a simple value and lift it into the context of the `Maybe`

monad. It transforms a value of type `a`

into a monadic value or monadic action `Maybe<a>`

that yields that value when evaluated (its function operator is called on).

You might have noticed that `return`

is just `Just`

with another name. The definition of `return`

for the `Maybe`

monad *just* happens to be that simple :)

An example of `return`

in action:

```
Maybe<int> x = ret(3); // Just<int>(3)
```

### bind

Context and sequence, those are the two key words in the definition of a monad. The `return`

function allows us to put values into context. Another function, `bind`

, allows us to sequence actions in that context.

The `bind`

function takes a `Maybe<a>`

and a function from `a`

to `Maybe<b>`

, and then applies the function to the `a`

value contained in the `Maybe<a>`

to yield a `Maybe<b>`

:

```
template <typename a, typename b>
Maybe<b> bind (const Maybe<a>& x,
const std::function<Maybe<b>(const a&)>& f)
{
if (is_nothing(x))
return Nothing<b>();
else
return f(x());
}
```

where

```
// Return true if the Maybe value is Nothing, false if it is Just a value.
template <typename a>
bool is_nothing (const Maybe<a>& x)
{
return x.value.get() == nullptr;
}
```

Note that if the input `Maybe<a>`

is `Nothing`

, then there is nothing to apply the function to (the internal pointer is null) and `bind`

simply yields `Nothing`

. If, on the other hand, the input `Maybe<a>`

value is `Just<a>`

value (the internal pointer actually points to something), then `bind`

applies the function to the `a`

value inside of it, yielding a `Maybe<b>`

value. In Haskell:

```
bind :: Maybe a -> (a -> Maybe b) -> Maybe b
bind Nothing f = Nothing
bind (Just x) f = f x
```

This completes the definition of the `Maybe`

monad.

## Visualising the Maybe Monad

At this point, you might be asking yourself what have we actually done. You understand the `Maybe`

struct and the `Nothing`

and `Just`

functions, as well as `return`

and `bind`

. But you still do not see how a `Maybe`

value is supposed to represent any kind of computation beyond a function object we can call, or what role the `return`

and `bind`

functions play at all. That is fine. What I wanted to illustrate is how relatively simple the definition of the `Maybe`

monad is. The monad's true power manifests itself when we put it to use.

Recall the definition of the `bind`

function:

```
template <typename a, typename b>
Maybe<b> bind (const Maybe<a>& x,
const std::function<Maybe<b>(const a&)>& f)
{
if (is_nothing(x))
return Nothing<b>();
else
return f(x());
}
```

Binding `Nothing`

to a function yields `Nothing`

. Binding `Just<a>`

value to a function yields the result of applying that function to the `a`

contained in the `Just<a>`

, which is itself another monadic value, now of type `Just<b>`

. In other words, if the input `Maybe<a>`

is a failed computation, or `Nothing`

, `bind`

propagates the failure and yields `Nothing`

. In contrast, if the input `Maybe<a>`

is a successful computation, or *" Just<a> value"*, then

`bind`

runs the next step in the sequence by applying the function to the `a`

in the `Just<a>`

to yield a `Just<b>`

.And now we can repeat the process. The `Just<b>`

value can itself be passed to `bind`

again together with another function to yield a `Just<c>`

value. And the `Just<c>`

can then again be transformed with `bind`

into `Just<d>`

. Or maybe this last computation fails, yielding `Nothing`

, in which case binding this failed `Maybe<d>`

value to the next step in the computation will also yield `Nothing`

. And the `Nothing`

will propagate down the chain to make the entire sequence of monadic actions yield `Nothing`

, denoting that the overall computation has failed. If one action in the sequence fails, the entire sequence fails. If all of the actions are successful, the sequence yields the last value in the chain. Welcome aboard, we have just defined computations under the context of failure. Let's illustrate the `Maybe`

monad in action with some examples.

### Defining functions in the Maybe monad

Consider a function to divide an integer `x`

by an integer `y`

:

```
int div (int x, int y)
{
if (y == 0)
// what here?
else
return x/y;
}
```

Division is defined for all pairs of integers except when the denominator is 0. What is the `div`

function supposed to yield in that case? As a C++ programmer, you might opt for throwing an exception. That is perfectly fine. In fact, exceptions are just another monad, and the entire C++ language itself can be seen as one big monad. Since we want to illustrate how the `Maybe`

monad works, however, let us represent failure using `Maybe`

:

```
Maybe<int> div (int x, int y)
{
if (y == 0)
return Nothing<int>(); // Nothing denotes failure
else
return Just(x/y); // Just denotes success
}
```

Notice how the `div`

function now yields a value of type `Maybe<int>`

instead of just a plain `int`

. When we do this, we say we that have *lifted* the function into the `Maybe`

monad. If the denominator is 0, the `div`

function yields `Nothing`

, otherwise it yields `Just`

the result of the division `x/y`

. Even though division is not defined for every pair of integers, the `Maybe`

monad does allow us to write `div`

as a total function. The `Maybe`

monad allows us to handle the failure case when `y=0`

.

Finally, let us define a function that increments a number by one, just for illustration purposes:

```
Maybe<int> inc (int x)
{
return Just(x+1);
}
```

The `inc`

function always returns a `Just`

value, since there is no failure case here. This is perfectly valid; we never said that actions in the `Maybe`

monad *have to* fail.

Having these two functions in hand, let us see how we can thread or sequence computations in the `Maybe`

monad together with `bind`

.

### Sequencing computations with bind

Remember that `bind`

takes a `Maybe<a>`

value, a function from `a`

to `Maybe<b>`

, and returns a `Maybe<b>`

:

```
template <typename a, typename b>
Maybe<b> bind (const Maybe<a>& x,
const std::function<Maybe<b>(const a&)>& f)
{
if (is_nothing(x))
return Nothing<b>();
else
return f(x());
}
```

A value of type `Maybe<a>`

represents a computation that may or may not yield a value of type `a`

. The `bind`

function allows us to thread computations by taking the `a`

wrapped in these `Maybe<a>`

values and applying a function to it to yield a new monadic value of type `Maybe<b>`

. This new `Maybe`

value can in turn be fed as an input to the next computation, and the result of that computation to the next one, and so on. But remember that monads sequence actions *in a context*. The context in the case of the `Maybe`

monad is failure. If `bind`

comes across a `Nothing`

- a failed computation - then the entire sequence of actions results in `Nothing`

, or failure. If everything goes well, `bind`

yields the expected result wrapped in a `Just`

.

The following examples use bind to feed `Maybe int`

values to functions taking plain `int`

values and yielding `Maybe int`

values:

```
Maybe<int> a = bind(ret(2), inc); // Just(3)
Maybe<int> b = bind(bind(ret(2), inc), inc); // Just(4)
Maybe<int> c = bind(ret(4), std::bind(div, _1, 2)); // Just(2)
```

In the first example, we take value `2`

and lift it into the monad with `ret`

. This results in a `Just(2)`

value, a computation yielding a `2`

. This `Just(2)`

value is fed to `bind`

together with `inc`

. The result is that the `2`

in the `Just(2)`

is incremented, resulting in a `Just(3)`

. To see why this happens, use the definitions of `ret`

, `bind`

and `inc`

and substitute them in:

```
bind (ret(2), inc) = bind (Just(2), inc) = inc(2) = Just(3)
```

In the second example, we inject `2`

into the monad to obtain `Just(2)`

. The `Just(2)`

value is fed to `bind`

together with `inc`

. This yields `Just(3)`

. The value `3`

contained in the `Just(3)`

is once again fed to `bind`

with `inc`

, yielding a final result of `Just(4)`

. Again, apply substitution to see how this works:

```
bind ( bind (ret(2), inc), inc )
= bind ( bind (Just(2), inc), inc )
= bind (inc(2), inc)
= bind (Just(3), inc)
= inc(3)
= Just(4)
```

The third example feeds the value `4`

wrapped in a `Just`

to a function that divides its argument by `2`

, yielding `Just(4/2`

) = `Just(2)`

.

### Sequencing failure

So far so good, but none of the above examples have a failure case. What makes the `Maybe`

monad interesting is that it allows us to deal with failure, so let us see how `bind`

sequences failed computations.

Consider a function that computes `(1/x) + 1`

:

```
int compute (int x)
{
return (1/x) + 1;
}
```

The `compute`

function is undefined for `x = 0`

, so it is unsafe in this form. We can lift the `compute`

function into the `Maybe`

monad to make it safe:

```
Maybe<int> compute (int x)
{
std::function<Maybe<int>(const int&)> div_by_x = std::bind(div, _1, x);
return bind(bind(ret(1), div_by_x), inc);
}
```

Using the monadic version of `compute`

, we can now apply this function to integer values with no fear of dividing by 0:

```
Maybe<int> d = compute(1); // Just(2)
Maybe<int> e = compute(0); // Nothing
```

The first example computes `(1/1) + 1 = Just(2)`

. What is more interesting is the second example, which attempts to compute `(1/0) + 1`

. The `Maybe`

monad detects the erroneous division by 0 and yields `Nothing`

in return, denoting that the computation has failed. The value 1 is never divided by 0, so the program does not crash. `Maybe`

handles the failure gracefully and yields `Nothing`

.

To understand how each of these computations work, apply substitutions as usual:

```
compute 1
= bind (bind (ret(1), div_by_1), inc)
= bind (bind (Just(1), div_by_1), inc)
= bind (div_by_1(1), inc)
= bind (Just(1), inc)
= inc(1)
= Just(1)
```

and

```
compute 0
= bind (bind (ret(1), div_by_0), inc)
= bind (bind (Just(1), div_by_0), inc)
= bind (div_by_0(1), inc)
= bind (Nothing, inc)
= Nothing
```

The first example is just like the ones in the previous section, in the sense that all computations produce `Just`

values and the final result is the expected value of `(1/x) + 1`

wrapped in a `Just`

.

The second example is where the `Maybe`

monad manifests its power. When we evaluate `compute(0)`

, everything goes well until we attempt to divide 1 by 0:

```
bind (div_by_0(1), inc)
```

The `div`

function yields `Nothing`

in this case:

```
Maybe<int> div (int x, int y) // x = 1, y = 0
{
if (y == 0) // true
return Nothing<int>(); // failure
else
return Just(x/y);
}
```

This leaves us with

```
bind (Nothing, inc)
```

Now recall the definition of `bind`

:

```
template <typename a, typename b>
Maybe<b> bind (const Maybe<a>& x,
const std::function<Maybe<b>(const a&)>& f) // x = Nothing
{
if (is_nothing(x)) // true
return Nothing<b>(); // propagate failure
else
return f(x());
}
```

`bind`

finds `Nothing`

in its input and yields `Nothing`

, propagating the failure down the monadic chain:

```
bind (Nothing, inc)
= Nothing
```

The final result is therefore `Nothing`

. One simply cannot divide by 0.

### The Side Effects of Maybe

We have said that monads help us solve two problems: determining order and allowing side effects. We have seen how `bind`

solves the order problem by threading or sequencing monadic actions, running one after the other. But at what point were our `Maybe`

actions allowed to have side effects?

A `Maybe`

action is allowed to do two things: yield `Just<a>`

value or `Nothing`

. When a `Maybe`

action results in `Nothing`

, the entire sequence of actions is short-circuited and the overall result is `Nothing`

. This is the side effect that a `Maybe`

action can have. A `Maybe`

action can yield a value just like regular functions, or it can fail the entire computation and make it result in failure. This second nature of monads is what gives them their side effects. In the case of the `Maybe`

monad, the side effect is failure. The `Maybe`

monad therefore allows us to deal with failure.

## Aftermath

A monad allows us to sequence actions in a context. The context is what makes one monad different from another. The `Maybe`

monad puts us in the context of failure: a `Maybe`

action can either yield a value or result in failure. This second nature of monads, being able to do something other than computing values, is what allows monadic actions to have side effects. Monads also answer the question of what order to run computations in.

A C++ programmer would typically resolve to exceptions or some other way of signaling failure for the examples presented above. The `Maybe`

monad is less powerful than an exception, because while the `Maybe`

monad will allow you to represent computations that may fail, it will not tell you *why* the failure occurred. Exceptions, on the other hand, often contain a message describing the problem encountered. It turns out, however, that exceptions in Haskell are just another Monad (called `Either`

, because they can yield *either* a meaningful value or a descriptive error). The `Maybe`

monad is just one of the many monads that exist, simple enough to be able to write a short tutorial about it. But there are many others, such as the `Either`

monad for exceptions, or the `List`

monad, which describes non-deterministic computations.

The C++ language can be seen as one specific monad. You get to describe your programs in a given way, and that is it. In Haskell, on the other hand, there are many monads, and these can be composed (stacked) to combine their effects. For example, if we wish to describe non-deterministic computations than can fail and talk over the network, we can do just that. Thus, in Haskell we often find ourselves first deciding how we want to represent the solutions to a given problem domain, come up with the right monad, and then actually solve the problem. More generally, we come up with the right abstractions for each problem domain, which may or may not involve monads.

Monads are also often used to limit the side effects a computation may have. We may want write code that can send or receive messages over the network, for example, and be able to do *only* that. By defining a monad to deal with the specifics of the network, we make sure our network code is not having other side effects, such as writing changes to a database. This makes reasoning about our code much easier, since the monad ensures that the code is doing only the kinds of things it was intended to do.

## Moving Forward

If you would like to learn further about these concepts, consider learning you a haskell or reading the Haskell book.

## Source Code

The C++ code below contains everything we have seen above. It is a little different from the examples, but it should not deviate too much. More specifically, `bind`

was ambiguous with `std::bind`

, so I renamed it to `mbind`

, and the compiler was also having trouble matching certain types and deducing others.

### C++

### Haskell

```
data Maybe a = Just a | Nothing
return :: a -> Maybe a
return = Just
bind :: Maybe a -> (a -> Maybe b) -> Maybe b
bind Nothing _ = Nothing
bind (Just x) f = f x
mydiv :: Int -> Int -> Maybe Int
mydiv _ 0 = Nothing
mydiv x y = Just (x `div` y)
inc :: Int -> Maybe Int
inc = return . (+1)
compute :: Int -> Maybe Int
compute x = Just x >>= \y -> (mydiv 1 y >>= inc)
```