I've been studying functional programming for a few years and in doing so I've learned a lot about types that I would not have otherwise. In particular, studying Haskell and related technology begets some interesting concepts. The one I want to talk about this time is "algebraic types".
From wikipedia:
In computer programming, especially functional programming and type theory, an algebraic data type is a kind of composite type, i.e., a type formed by combining other types.
If you've used TypeScript for a bit you've probably been exposed to some of these already.
Sum Types
Let's say we have a short list of commands that an API will support, "list" and "pop". We can use string literal types to represent the valid values:
type ListCommand = 'list'
type PopCommand = 'pop'
Each of these types only have one value--the specific string that is valid. If we want a function that can take either command we can compose these types together using the |
operator.
type ValidCommand = ListCommand | PopCommand
const performCommand = (command: ValidCommand) =>
The |
type operator creates a sum type of the types to the left and right of it. What you should notice is that the sum adds the potential values together, i.e. ValidCommand accepts ListCommand plus PopCommand. This may be more apparent with bigger types:
type CommandOrFlags = ValidCommand | Flags
You'll remember that Flags
had four valid values and ValidCommand
has two. Thus CommandOrFlags
has six (4 + 2).
So we can add types, can we multiply them too? Yes, indeed!
Product Types
As you may have noticed, you can think of |
operator for sum types as "or". Product types are the other half of that, the "and". So then we have to find data structures that represent a type having multiple values simultaneously. Tuple is a great candidate for that since it allows us to have different types at its indices.
type Flags = [boolean, boolean]
const flags: Flags = ?
How many potential values can be assigned to flags
?
With this small set we could list them out but we can use what we know about the individual types to figure it out with math.
The type [boolean, boolean]
must have exactly two values and those values must be booleans. Booleans themselves have two potential values. If you know that tuples form a product type with the types they contain you just have to multiply 2 x 2
to know that the total inhabitants of the composed type is four.
Another way of encoding product types is to use an object:
type Point = {x: number; y: number}
So in a way you're already using them!
Sum types are a monoid
Monoid is an interesting mathematical structure that is pretty simple.
-
Composition: Monoid is type with a combine operation that can join to values of the same monoid.
a o b = c
wherea
b
andc
are same type, ando
is the operation. -
Associativity: The operation must be associative
(a o b) o c = a o (b o c)
-
Unit: operation must have a "unit" value that doesn't change the value it's composed with.
a o unit = unit o a = a
As an example Arrays form a monoid where concat
is the operation and []
is the unit:
// Composition
[1,2].concat([3,4])
// is same as
[1,2,3,4]
// Associativity
[1,2].concat([3,4]).concat([5,6)
// is the same as
[1,2].concat([3,4].concat([5,6]))
// Unit
[1,2].concat([])
// same as
[].concat([1, 2])
// same as
[1, 2]
The thing I actually wanted to share is that types themselves form a monoid with |
as the operation and never
as the unit!
// Composition
type Foo = 1 | 2
// Associativity
type Bar = 1 | (2 | 3)
type Bar = (1 | 2) | 3
// Unit
type Baz = 1 | never
type Baz = never | 1
type Baz = 1
Mathematically there is also a monoid for product types but I haven't found a way to make it work in TypeScript.
Getting more mileage out of sum types in TypeScript
In exploring usage of sum types you may run into some challenge in writing functions that use them:
type Error = string
type Message = string
type Response = Error | Message
const handleResponse = (resp: Response) => {
// wait. How do I actually have different behavior
// since both cases are strings at run-time???
}
A trick to make this work is to provide some extra data at runtime so that your code can discriminate between the members of the sum. This is called a discriminated union.
type Error = {kind: 'Error'; message: string}
type Message = {kind: 'Message'; message: string}
type Response = Error | Message
const handleResponse = (resp: Response): string => {
switch(resp.kind){
case 'Error': {
console.error(resp.message)
return 'Fail'
}
case 'Message':
return resp.message
}
}
Switch is used here as poor man's pattern matching. TypeScript has good support for this technique and will make sure you have your cases covered if you leave off the default case.
You can even have empty types to union with states that have no other data:
type Error = {kind: 'Error'; message: string}
type Message = {kind: 'Message'; message: string}
type Pending = {kind: 'Pending'}
type Response = Error | Message | Pending
const handleResponse = (resp: Response): string => {
switch (resp.kind) {
case 'Error': {
console.error(resp.message)
return 'Fail'
}
case 'Message':
return resp.message
}
// TS ERROR: Function lacks ending return statement and return type does not include 'undefined'.ts(2366)
}
Then to fix it you can just handle the case:
const handleResponse = (resp: Response): string => {
switch (resp.kind) {
case 'Error': {
console.error(resp.message)
return 'Fail'
}
case 'Message':
return resp.message
case 'Pending':
return 'Pending'
}
}
That's all for now. I hope this gives you new things to think about and you get a chance to use Sum types to better model your applications.
Edit note: I'd like to give a special thanks to @Phantz for the extra knowledge they dropped in the comments!
Nice interesting article! Something unique around here is very refreshing. But I had a few friendly nitpicks :P
Arrays are not product types
Ok so, let's talk about arrays. An array is not a product type. Is it a sum type? Well, it's complicated. The truth is that traditional arrays (C arrays - pointers to contiguous memory - same as JS) don't encode well in type algebra. In haskell, the traditional array is a GHC primitive. It's really neither a sum type nor a product type. It's a primitive that you build algebraic data types with. Similar to
Int
(except there is a way to represent numbers in type algebra, using a union type of literally all numbers in the set, but it is completely impractical and only conceptual)Now, let's stretch the definition a bit. An array, at the high level, is a "variable length collection of homogenous elements". The type that pertains to this definition and encodes in type algebra very nicely is, of course, the cons list-
I'm not going to even try to encode that in typescript since the bread and butter of algebraic data types is missing - pattern matching (you need this to make a sensible cons list implementations without relying on arrays).
I think you did a fine job on explaining why they are called "sum types" and "product types". But I'd suggest also emphasizing that the notion of the algebra pertains to the domain the types represent. Specifically, how many elements the domain has.
The answer to the question "How many values can this sum type represent?" is "Just add the individuals up!", and similarly for product types. Someone may be confused as to what this phrase means-
We're not multiplying/summing types, we're multiplying/summing their domain sizes.
Is it correct to say type constructors convey composition?
I think this is a bit misleading-
I think it'd be better to say-
I wouldn't use the word composition here since that seems a bit uncommon. I've never quite seen that used in the context of type constructors (or data constructors, in the case of Haskell)
Promise is not a product type
Similar to the other nitpick. It's difficult to encode javascript promises in type algebra. You could stretch the actual definition of the
Promise
class to make it a sum type......sort of? At a very, very basic level, it'd be the most typical sum type of all - theEither
type.The
Promise
type itself does not encode async behavior, that's a runtime thing - abstracted over callbacks. A Promise type simply wraps either failure or success. The catamorphism (haha fancy names) on this would be either. It is equivalent to a.then
that handles both failure and success.Left
andRight
would bereject
andresolve
respectively.At a higher, more useful level that truly models async behavior - we probably wouldn't use promises at all.
Records are not necessarily product types
Ok so records are more or less just maps. An idiomatic
Map
in Haskell looks like this-Isn't it funny how that isn't even a map, just an association list.
That was a half joke, half not. This is the thing about type algebra that bothers me. Encoding actually practical stuff is not idiomatic. Ok, so the
Data.Map
generally used by Haskell (fromcontainers
) is implemented a size balanced tree. This is a low level detail and doesn't actually say much about type algebra. When we're concerned about type algebra - we just pretend that a map is an association list.Anyway, an assoc list is a sum type of product types. Since list is a sum type and its elements are pairs, a classic product type. This gives a much, much clearer notion on how to calculate the domain size of a record/map. Except you know, a list is a recursive sum type with one branch as a product type anyway, so good luck with that.
By the way I totally see the rationale behind calling records product types. But the key detail we're missing here is that records don't look like this (singleton)-
(where
k
is a value of typeK
andt
is a value of typeT
.)Instead, it looks something like this-
that is, it may not hold a singular value - but many, in completely variable length.
Notice how the domain size intuition of
K x T
only makes sense in the first case. Where one of theRecord<K, T>
is exactly{ k: t }
, another value of that type could be{ k1: t1 }
and so on for all combinations of K and T.But in the actual case, the
Record<K, T>
is not like that. Product types must give a notion about the domain size. That's all broken when variable length types are involved. In reality, the domainRecord<K, T>
has infinitely more values thanK x T
.Monoids are really cool, thanks for including them!
Great observation on monoids. Requoting something I've heard many others say- "Monoids are gentle sea creatures that appear everywhere in programming, you just have to notice them!". You could expand the definition by introducing semigroups first too! It's even more common to stumble upon semigroups in one's daily venture into coding, monoids are just semigroups with an identity element.