Understanding Simplicity: implementing a smart contract language in 30 lines of Haskell

Source: Medium

Last Monday, Chain released our open-source smart contract language for Bitcoin, Ivy. Ivy is a higher-level language that can compile to Bitcoin Script, the low-level language used by the Bitcoin protocol to determine whether a transaction is authorized. Ivy makes it easier for humans to read, write, and use Bitcoin smart contracts. But since it is designed to be compatible with Bitcoin today, its features are limited by the current capabilities of Bitcoin’s virtual machine.

What if those limits could be expanded? What if it were possible to get almost all of the safety, simplicity, and efficiency of a small, non-Turing-complete execution language like Bitcoin Script, while gaining much of the flexibility and power of a more expressive programming language?

For this post, we’re going to take a peek at a distant possible future of the Bitcoin protocol, in the form of an exciting paper from Russell O’Connor of Blockstream. O’Connor’s paper proposes a possible replacement for Bitcoin Script, called Simplicity. Like Bitcoin Script, Simplicity is a low-level execution language, intended to be a compilation target for higher-level languages like Ivy. (To learn more about the possible advantages of Simplicity over Bitcoin Script and alternative low-level languages, you can see Blockstream’s announcement, or this review from Philip Wadler.)

O’Connor’s paper lays out everything you need to implement a Simplicity interpreter, but doesn’t provide an executable reference implementation (although he is working to release an SDK in the near future). And following along with the paper can be a little daunting if, like me, you’re not very familiar with the syntax of Coq or sequent calculus.

This post attempts to address both these issues, by demonstrating how to implement core Simplicity using Haskell. Our implementation will use only elementary Haskell — specifically, material from chapters 2, 4, 7, and 11 of Haskell Programming from First Principles. If you’ve ever used Haskell or another functional language like Elm, F#, PureScript, or OCaml, you’ll probably be able to follow along.

You can try out this implementation of Simplicity in your browser, and see the full code, at https://repl.it/@danrobinson/core-simplicity.

In future posts, I might take a look at implementing some of the other features of Simplicity, embedding Simplicity in a less hospitable (but more popular) language, and even building part of a compiler from Ivy to Simplicity.

Goals

We’ll be implementing core Simplicity as a Haskell module. This module exports functions that allows you to write and execute Simplicity expressions as Haskell code, with almost no translation. For example, this Simplicity program from section 2.5 of the paper:

Can be translated into the following Haskell code:

not :: Bit -> Bit
not = comp (pair iden unit) (match (injr unit) (injl unit))

Our implementation will be a shallow embedding. This means every Simplicity combinator and expression will correspond to a Haskell function, and every Simplicity value will correspond to a Haskell value. This allows us to piggyback on Haskell’s AST, as well as its type inference and type checker. If an expression is a valid Haskell expression, we know it is also a valid Simplicity expression.

In this post, we’ll only implement the nine basic combinators of core Simplicity (described in section 2 of O’Connor’s paper). We won’t touch the more sophisticated features of Simplicity, such as sub-expression sharing, merklized abstract syntax trees, and jets.

Types and values

Simplicity has a unit type, 𝟙. (A “unit type” means a type with only one inhabitant.) To represent this type, we will use Haskell’s built-in unit type, ()(pronounced “unit”).

In Simplicity, the sole value of the unit type looks like ⟨⟩. In Haskell, the value has the same name as the type itself, ().

Simplicity also has sum types, written A + B, where A and B are other Simplicity types. Each inhabitant of this type is a value of either type A or B, along with a tag indicating whether it is from the left or the right type. For sum types, we will use Haskell’s canonical sum type, Either, which is defined as data Either a b = Left a | Right b.

In Simplicity, the values of this type look like σᴸ(a) (when the contained value is a value of the first type, type A) or σᴿ(b) (when the contained value is a value of the second type, type B). In Haskell, they look like Left a and Right b. For example, the two values that inhabit type Either () () are Left () and Right (). The three inhabitants of type Either (Either () ()) () are Left (Left ()), Left (Right ()), and Right ().

Finally, Simplicity has product types, written A x B. Each inhabitant of such a type is a pair of values — one of type A, and one of type B. We will use Haskell’s built-in tuple type, (a, b).

In Simplicity, values of a product type look like ⟨a, b⟩. In Haskell, they look like (a, b). For example, the four inhabitants of type (Either () (), Either () ()) are (Left (), Left ()), (Left (), Right ()), (Right (), Left ()), and (Right (), Right ()).

Unlike in Haskell, sum and product types cannot be defined recursively. This means that every Simplicity type has a fixed, finite number of possible inhabitants.

Numbers and strings

You’ll notice that Simplicity does not have any built-in support for common primitive types like booleans, integers, or strings. Instead, the Simplicity paper defines such types as combinations of the types described above.

For example, the paper defines a “bit” — a type with two inhabitants, zero and one — as the sum type 𝟙 + 𝟙. In Haskell, this corresponds to the type Either () (), and the two values of it are Left () and Right (). Following the convention used in the paper, we will define these values to mean 0 and 1, respectively.

For our convenience, we can define Bit as a type synonym, along with more descriptive names for its two inhabitants:

type Bit = Either () ()

zero :: Bit
zero = Left ()

one :: Bit
one = Right ()

What if we want to represent not just one bit, but a two-bit “word” — something that can represent any number from 0 to 3? We can take a product type of two bits, (Bit, Bit).

type Word2 = (Bit, Bit)

This type can have 4 possible values: (zero, zero), (zero, one), (one, zero), and (one, one). We can interpret these as the integers 0, 1, 2, and 3, respectively. Alternatively, we could interpret them as bitstrings of length 2.

The Simplicity paper defines larger word sizes as products of two smaller word size types. A 4-bit word is defined as the product of two 2-bit words, an 8-bit word is defined as the product of two 4-bit words, and so on.

type Word4 = (Word2, Word2)
type Word8 = (Word4, Word4)

Expressions and combinators

A Simplicity “expression” is a function from one Simplicity value to another. This is a little unusual — you’re probably accustomed to expressions that can evaluate to single values. (Simplicity also does not have first-class functions — Simplicity expressions are not themselves Simplicity values.)

Simplicity expressions look like this:

pair iden unit

take (take iden)

comp (pair iden unit) (case (injr unit) (injl unit))

Simplicity expressions are composed of combinators. A combinator is a primitive in Simplicity that turns 0 or more expressions into a new expression. There are only nine combinators in core Simplicity.

iden :: a -> a
iden a = a

The iden combinator is not parameterized with any other expressions — iden by itself is a valid Simplicity expression. The expression iden is a function that takes a Simplicity value and returns the same value.

unit :: a -> ()
unit _ = ()

The unit combinator, similarly, does not take any other expressions. unit is a function that ignores its input and returns a single value of type unit, ().

comp :: (a -> b) -> (b -> c) -> a -> c
comp s t a = t (s a)

The comp combinator takes two expressions, s and t. Remember, each of these expressions is itself a function from one Simplicity value to another. When partially applied to s and t, comp returns an new expression that takes some value a, applies the s expression to it, and then applies the texpression to that result.

Note that the last argument to comp, a, is a Simplicity value. In contrast, all of the previous arguments are Simplicity expressions (i.e., Haskell functions). Because of Haskell’s automatic currying, you can also think of comp as a function that takes two expressions s and t, and returns a new expression (i.e., a new Haskell function from one value to another).

                                    [         expression          ]
combinator   expression  expression     input value   output value
             (* -> *)    (* -> *)    -> *          -> * 
comp ::      (a -> b) -> (b -> c)    -> a          -> c
comp         s           t              a          -> t (s a)

This is a useful perspective, and arguably the “correct” one, not only because of the different nature of the arguments but also because of the difference in when these arguments are passed. The expressions s and t are provided at the time the Simplicity program is written (i.e., by the programmer or compiler), and the value a is provided at runtime, when the program is run.

injl :: (a -> b) -> a -> Either b c
injl t a = Left (t a)

injr :: (a -> c) -> a -> Either b c
injr t a = Right (t a)

The injl and injr combinators each take one expression, t. When applied to a value, they first apply the expression t, and then inject the result into a sum type by adding a Left or Right tag, respectively.

match :: ((a, c) -> d) -> ((b, c) -> d) -> (Either a b, c) -> d
match s t (Left a, c) = s (a, c)
match s t (Right b, c) = t (b, c)

The match combinator (renamed from Simplicity’s case because case is a reserved word in Haskell) takes two expressions, s and t. When applied to a value, which must be a product (ab, c), it pattern-matches on ab, and applies either s or t to the contained value (along with the extra state c) based on whether the tag is a `Left` or `Right`, respectively.

pair :: (a -> b) -> (a -> c) -> a -> (b, c)
pair s t a = (s a, t a)

The pair combinator takes two expressions, s and t. When applied to a value, it creates a pair whose first value is the result of applying s to a, and whose second value is the result of applying t to a.

take :: (a -> c) -> (a, b) -> c
take f (a, _) = f a

drop :: (b -> c) -> (a, b) -> c
drop f (_, b) = f b

The take and drop combinators each take one expression, f. When applied to a value — which must be a product (a, b) — they apply the expression f to the first or second value, respectively, and throw the other value away.

Example Simplicity programs

Now we can use these combinators to define expressions that represent arbitrary functions from inputs to outputs.

The following examples are taken almost verbatim from the Simplicity paper (just translating the type names and type annotation syntax, and changing case to match).

not :: Bit -> Bit
not = comp (pair iden unit) (match (injr unit) (injl unit))

This function takes one bit (zero or one, i.e., Left () or Right ()) and converts it into the opposite bit.

How does this work? Suppose we apply this expression to the value Left (). We can use equational reasoning to gradually simplify this expression, one combinator at a time:

comp (pair iden unit) (match (injr unit) (injl unit)) (Left ())
-- comp s t a = t (s a)
match (injr unit) (injl unit) ((pair iden unit) (Left ()))
-- pair s t a = (s a, t a)
match (injr unit) (injl unit) (iden (Left ()), unit (Left ()))
-- iden a = a
match (injr unit) (injl unit) (Left (), unit (Left ()))
-- match s t (Left a, c) = s (a, c)
injr unit ((), unit (Left ()))
-- injr s a = Right (s a)
Right (unit ((), unit (Left ())))
-- unit _ = ()
Right ()

To try this out, you can use the online REPL at https://repl.it/@danrobinson/core-simplicity, or download the code from there and try it yourself in ghci:

> not zero
=> Right ()
> not zero == one
=> True
> not one == zero
=> True

This next function takes two one-bit numbers, adds them, and returns an overflow bit and a result:

halfAdder :: (Bit, Bit) -> (Bit, Bit)
halfAdder = match (drop (pair (injl unit) iden)) (drop (pair iden not))

You can use this half-adder to build a full-adder, which takes two one-bit numbers and an overflow bit and returns a new overflow bit as a result:

fullAdder1 :: ((Bit, Bit), Bit) -> (Bit, Bit)
fullAdder1 = comp (pair (take halfAdder) (drop iden))
            (comp (pair (take (take iden))
            (comp (pair (take (drop iden)) (drop iden))
              halfAdder))
            (pair (match (drop (take iden)) (injr unit))
              (drop (drop iden))))

You can try it out too:

> fullAdder1 ((one, one), zero)=> (Right (),Left ())
> fullAdder1 ((zero, zero), one)
=> (Left (),Right ())

These last two functions implement adders for 2-bit words and 4-bit words. (The Simplicity paper defines these recursively, but doing so in Haskell would require some type-level magic that’s beyond the scope of this post.)

fullAdder2 :: ((Word2, Word2), Bit) -> (Bit, Word2)
fullAdder2 = comp (pair (take (pair (take (take iden))
                                   (drop (take iden))))
            (comp (pair (take (pair (take (drop iden))
                                    (drop (drop iden))))
                        (drop iden))
                   fullAdder1))
            (comp (pair (drop (drop iden))
                  (comp (pair (take iden)
                              (drop (take iden)))
                        fullAdder1))
            (pair (drop (take iden))
                  (pair (drop (drop iden)) (take iden))))

fullAdder4 :: ((Word4, Word4), Bit) -> (Bit, Word4)
fullAdder4 = comp (pair (take (pair (take (take iden))
                                    (drop (take iden))))
            (comp (pair (take (pair (take (drop iden))
                                    (drop (drop iden))))
                        (drop iden))
                    fullAdder2))
            (comp (pair (drop (drop iden))
                  (comp (pair (take iden)
                              (drop (take iden)))
                        fullAdder2))
            (pair (drop (take iden))
                  (pair (drop (drop iden)) (take iden))))

Conclusion

Hopefully this post has made it easier to understand and try out core Simplicity. I also hope it demonstrated some of the power and flexibility that Haskell provides when working with domain-specific languages.

Thanks to Russell O’Connor, Boyma Fahnbulleh, Bob Glickstein, and Keith Rarick for reviewing a draft of this post.

Donate

Donate with Lightning