What are Discriminated Unions?

Consider that all types in programming languages represent sets of possible values. For example, bool is a set that has two possible values: true and false; uint represents the set of natural numbers from 0 to UInt32.MaxValue; when you use a generic parameter T in a type, then T represents every possible value in all types (unless its refined by a constraint clause - in fact, constraints could be seen as a 'filter' operation on the set of all values).

And so, if types are sets, then what happens if we create a union of sets?

Let's say we have the following enum:

    public enum Colour
    {
        Red,
        Green,
        Blue
    }

And then create a union of Colour and bool. Then the set of possible values are:

    true
    false
    Red
    Green
    Blue

But what happens if we create a union between bool and bool? Well, we just get:

   true
   false

Because duplicate items become one in a union operation. However, if we tag each set (each type) with an additional unique value then we get what's called a Discriminated Union (also known as: tagged union, variant, variant record, choice-type, disjoint union, sum-type or co-product). The easiest way to show that would be to use a tuple, with the first element being the tag and the second element being the value of the type. So, if we try to create a union of Colour and bool then we get:

    (1, true)
    (1, false)
    (2, Red)
    (2, Green)
    (2, Blue)

The integers being the tag in this case.

If we try to union bool and bool, we get:

    (1, true)
    (1, false)
    (2, true)
    (2, false)

They are discriminated by their 'tag'.

In most functional-first programming languages discriminated-unions are a first-class concept. This is an example from Haskell:

data Maybe a = Just a 
             | Nothing

Here Just and Nothing are type constructing functions that construct a Maybe a (like Maybe<A> in C#), but we can think of them as the tag for each set of values the various cases represent.

data Example = First Bool 
             | Second Bool

This simple example above would look like this in the tuple form shown earlier in this article:

    (First, True)
    (First, False)
    (Second, True)
    (Second, False)

The total number of values that are represented by a discriminated union is the sum of all the possible values in the individual cases. Which is why they are often referred to as sum types.

In C# we don't have sum-types yet. And so, the easiest way to represent them is to create a base-class or interface, and then a finite set of derived types:

    public interface Maybe<A> 
    {
        public static Maybe<A> Just(A value) => new Just<A>(value);
        public static Maybe<A> Nothing() => new Nothing<A>();
    }
    public class Just<A> : Maybe<A>
    {
        public readonly A Value;

        public Just(A value) => 
            Value = value;
    }
    public class Nothing<A> : Maybe<A>
    {
    }

This is obviously less convenient than the terse declarations in Haskell, and so we can achieve something similar by using the LanguageExt.CodeGen:

    [Union]
    public interface Maybe<A>
    {
        Maybe<A> Just(A value);
        Maybe<A> Nothing();
    }

NOTE: The big caveat here is that C# can't do completeness checking when using union-types with C# pattern-matching. However, the code-gen generates a Match function that does allow completeness checking.

Another way, which is slightly less powerful than the code-gen solutionm is to use C# records (available from C# 9):

    public abstract record Maybe<A>();
    public record Just<A>(A value) : Maybe<A>();
    public record Nothing<A>() : Maybe<A>();

This is also nice and terse, but also suffers from the lack of completeness checking, but you could add your own Match function:

    public abstract record Maybe<A>()
    {
        public B Match<B>(Func<Just<A>, B> Just, Func<Nothing<A>, B> Nothing) =>
            this switch
            {
                Just<A> j    => Just(j),
                Nothing<A> j => Nothing(j),
                _            => throw new NotSupportedException()
            };
    }

So, what are they useful for? Well, the key idea is that they represent a finite set of states, and the mapping of that finite set to another (via pattern matching) is more powerful and easier to reason about than the abstract/extend everything approach of the OO-world. Most programming is actually working on more concrete concepts than we're lead to believe by the OOP purists, and so creating a data model that more closely represents the number of states your application can be in is a good thing.

The above is true most of the time, but the reader should familiarise her or himself with The Expression Problem

Wiki Home

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

What are Discriminated Unions?

Clone this wiki locally