Parsing permutation phrases

Records in Haskell, as in many other languages, allow fields to be specified in any order, as long as they don’t appear more than once; a field may also be omitted.

Is there a principled way to add a corresponding feature to a parser combinator library? We’d want a combinator that takes parsers for each field and returns a parser for the whole record. It’s also important that the resulting parser be reasonably efficient. Such a combinator would be useful for XML, JSON, YAML, URL query parameters, and so on.

I’ll walk us through the implementation from the paper and the modified versions found in the Haskell ecosystem. These rely on lazy evaluation; I’ll show how the technique can be adapted for languages with strict evaluation.
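To make the goal concrete, here is a deliberately naive sketch (my own, not the paper’s efficient structure): a toy list-of-successes parser plus a two-field permutation combinator that simply tries both orders.

```haskell
import Control.Applicative (Alternative (..))

-- A toy list-of-successes parser, just to make the goal concrete.
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f (Parser p) = Parser (\s -> [ (f a, rest) | (a, rest) <- p s ])

instance Applicative Parser where
  pure a = Parser (\s -> [(a, s)])
  Parser pf <*> Parser pa =
    Parser (\s -> [ (f a, s'') | (f, s') <- pf s, (a, s'') <- pa s' ])

instance Alternative Parser where
  empty = Parser (const [])
  Parser p <|> Parser q = Parser (\s -> p s ++ q s)

char :: Char -> Parser Char
char c = Parser p
  where
    p (x : xs) | x == c = [(c, xs)]
    p _                 = []

-- Naive permutation of two fields: try both orders. Generalised to n
-- fields this way, the number of branches grows factorially; avoiding
-- that blow-up is exactly what the paper's data structure is for.
perm2 :: Parser a -> Parser b -> Parser (a, b)
perm2 pa pb = ((,) <$> pa <*> pb) <|> (flip (,) <$> pb <*> pa)
```

The point of the sketch is the API shape: the combinator takes one parser per field and yields a parser that accepts the fields in any order.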


Functional Pearl: Higher-kinded data

A trick that can give an algebraic data type extra powers is to add a parameter that’s not merely a type, but a type constructor.
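As a tiny illustration (a hypothetical Person type, not taken from any particular library), each field’s type is wrapped in the type-constructor parameter f, and choosing f unlocks different behaviours:

```haskell
import Data.Functor.Identity (Identity (..))

-- A hypothetical example: every field's type is wrapped
-- in the type-constructor parameter f.
data Person f = Person
  { name :: f String
  , age  :: f Int
  }

-- f = Identity recovers an ordinary record...
complete :: Person Identity
complete = Person (Identity "Ada") (Identity 36)

-- ...while f = Maybe gives a partially-filled builder for free.
partial :: Person Maybe
partial = Person (Just "Ada") Nothing
```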

What kinds of extra powers? Well, for instance, we can:

  • create type-safe builders, where it’s a compilation error if a field is not set — but you can customise which fields are mandatory
  • create “smart” records that we might use to back a form on a webpage, so that we can:
    • make all fields empty initially, or set them to default Text values — even for fields holding other types, such as Int
    • query to ask which fields were not set, or did not parse, or pass validation; we can then export (or import) that data as a map keyed by field name
    • finally convert to a record with standard, bona-fide fields of the expected types
  • replace the whole family of different abstract syntax trees typically used in different passes of a compiler pipeline with a single data structure
  • add additional safety guarantees: e.g. a table-printing library where it becomes a compilation error to provide rows of different lengths
  • enable automatic diffing of data structures
  • and in general, simplify patterns of processing data with similar structures.

I’ll compare and contrast several different approaches to working effectively with higher-kinded data.

First, I’ll build up a small library with lifted versions of familiar type classes like Functor, Foldable, Traversable that will work for our new parameterized types, along with a number of combinators and utility functions.

I’ll walk us through creating manual instances for these new type classes, and then discuss approaches using GHC Generics and Template Haskell.

Finally I’ll show a different approach based on type families, as employed in the higgledy package.


A monad for Set

Everyone knows it’s impossible to write a monad instance for Set in Haskell, because of the annoying Ord constraint that comes with Set, but which isn’t part of the definition of the standard Monad type class. But what if we ignored the naysayers, and just did it anyway?

That is to say, I’ll show you how we can write a genuine, law-abiding instance:

  • with no unusual language extensions
  • without modifying the Monad class, and
  • using the standard Set implementation from the containers package.

Using a set instead of another data structure such as a list can provide exponential speed-ups for certain algorithms, so this is of more than purely theoretical interest.

In fact, I’ll walk us through several implementations of set monads in Haskell, compare the potential drawbacks of each, and show how we might address these.
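One well-known construction (there are several in the wild, and this is not necessarily the one the talk builds — note that, unlike the talk’s claims, it does use the RankNTypes extension) is a continuation-style wrapper:

```haskell
{-# LANGUAGE RankNTypes #-}

import qualified Data.Set as Set

-- Bind never needs Ord, because we only ever record what we'll
-- eventually do with each element. Ord is demanded once, at the
-- end, when we extract a concrete Set.
newtype SetM a =
  SetM { unSetM :: forall r. Ord r => (a -> Set.Set r) -> Set.Set r }

instance Functor SetM where
  fmap f m = SetM (\k -> unSetM m (k . f))

instance Applicative SetM where
  pure a = SetM (\k -> k a)
  mf <*> ma = SetM (\k -> unSetM mf (\f -> unSetM ma (k . f)))

instance Monad SetM where
  m >>= f = SetM (\k -> unSetM m (\a -> unSetM (f a) k))

liftSet :: Ord a => Set.Set a -> SetM a
liftSet s = SetM (\k -> Set.unions (map k (Set.toList s)))

runSet :: Ord a => SetM a -> Set.Set a
runSet m = unSetM m Set.singleton
```

Duplicates are collapsed at every `liftSet`, which is where the promised speed-ups over a list-based monad come from.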


Languages and patterns

I’ll demo some tools I wrote to help myself learn German and Turkish, which create example sentences from a generative grammar model along with their translations into English. I’ll segue into a discussion of agglutinative languages and vowel harmony, and use this as an opportunity to talk about how some tricks with patterns in Haskell (and-patterns and or-patterns) can help us write cleaner code.

Links from the talk


Golden round-trip testing

In this talk I introduce golden round-trip testing.

Golden tests [1] and property-based round-trip testing [2] are two heavy-hitters that deserve to be in every programmer’s testing arsenal.

It turns out that a mashup of the two is even better!
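One way to read the combination (a sketch of my own, with hypothetical parse/render functions, not the talk’s actual code): the golden file pins down the concrete format, and the round trip checks that parsing and rendering agree on it.

```haskell
-- Hypothetical parse/render pair. The golden input doubles as the
-- expected output: render what we parsed and compare with the original.
roundTripsGolden :: Eq a => (String -> Maybe a) -> (a -> String) -> String -> Bool
roundTripsGolden parse render golden =
  fmap render (parse golden) == Just golden
```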


  [1] https://hackage.haskell.org/package/tasty-golden

  [2] https://hackage.haskell.org/package/hedgehog/docs/Hedgehog-Internal-Tripping.html


A taste of topos theory

For a computer programmer, graphs might seem commonplace and not particularly interesting. But in fact they secretly come with a lot of extra structure. For instance:

  • we can add and multiply graphs
  • the structure-preserving maps (graph homomorphisms) between any two graphs form another graph
  • for every graph G, there’s a “powergraph” of subgraphs of G
  • subgraphs of G correspond to predicates on G, except that in place of the booleans we use for predicates on sets, there’s a special graph Ω.
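To give a feel for the first two bullets, here is a rough sketch (naive list-based directed graphs, purely for illustration): the sum is disjoint union, and the categorical product is computed pointwise, so its edges are pairs of edges.

```haskell
-- Naive list-based directed graphs, just to make the bullets concrete.
data Graph v = Graph { vertices :: [v], edges :: [(v, v)] }

-- Sum: disjoint union of vertices and of edges.
gsum :: Graph a -> Graph b -> Graph (Either a b)
gsum g h =
  Graph (map Left (vertices g) ++ map Right (vertices h))
        ([ (Left x, Left y) | (x, y) <- edges g ]
          ++ [ (Right x, Right y) | (x, y) <- edges h ])

-- Product: pointwise, so an edge is a pair of edges.
gprod :: Graph a -> Graph b -> Graph (a, b)
gprod g h =
  Graph [ (a, b) | a <- vertices g, b <- vertices h ]
        [ ((a, b), (a', b')) | (a, a') <- edges g, (b, b') <- edges h ]
```

Notice that the product of two single-edge graphs has four vertices but only one edge, which already hints that products of graphs behave less like products of vertex sets than one might guess.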

So where does this structure come from? The answer is the mathematical field of topos theory.

There are many ways to approach topos theory. One view is that the concept of “topos” allows us to extract an interface from set theory. Now we can apply this interface to other domains — provided they also fit the definition of a topos. In other words, we can continue to use the language of sets in other, more complicated domains. Luckily, there are lots of other topoi!

This interface turns out to be a typed lambda calculus, and one with a lot of nice features: it has sums and products, it’s dependently typed, it has refinement types with intersection and union (i.e. we can form subtypes by applying a predicate), and very frequently it comes with other goodies like modalities.

By happy coincidence, some of the most common constructions in discrete maths and computer science, such as graphs (of multiple flavours) and automata, are topoi, or at least quasitopoi, a minor variation.

I’ll show how to derive the special graph Ω that serves as a replacement for the booleans when forming predicates on graphs.


Dependent types made difficult

In the spirit of Mathematics Made Difficult and Monads Made Difficult, let’s take a deeper look at dependent types to see if we can come to understand them a little better.

The field of categorical semantics tells us that the simply typed lambda calculus has a natural interpretation in any Cartesian closed category: it is their “internal language”.

What’s an internal language good for? On one hand, having a high-level type theory for talking about low-level properties can greatly simplify proofs.

On the other hand, this language can have very practical consequences. As he describes in his paper “Compiling to Categories” [1], Conal Elliott was able to put this correspondence to work in the form of a GHC plugin that allows the same program to be interpreted in multiple different CCCs — yielding applications including hardware circuits, automatic differentiation, incremental compilation and interval analysis.

What’s the story for dependent types? That is, is there a categorical semantics for dependent types too?

It turns out that dependent type theory is the internal logic of locally Cartesian closed categories: ones where the pullback functor has both left and right adjoints.

Pullback corresponds to substitution into a dependent type. These left and right adjoints are called dependent sum and product, and correspond to dependent sum and product types.
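In symbols: for a morphism f : A → B in a locally Cartesian closed category C, the three functors line up as an adjoint triple between slice categories,

```latex
\Sigma_f \;\dashv\; f^{*} \;\dashv\; \Pi_f,
\qquad
\Sigma_f,\ \Pi_f : \mathcal{C}/A \to \mathcal{C}/B,
\qquad
f^{*} : \mathcal{C}/B \to \mathcal{C}/A
```

where the left adjoint Σ_f is simply post-composition with f.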

I’ll explore what this looks like via explicit constructions in the category of sets (with pictures!).


  [1] http://conal.net/papers/compiling-to-categories/


Compile-time parsing

Sometimes, rather than constructing a domain object in code, it’s clearer to specify it as YAML or JSON. But at the same time, we don’t want to give up the compile-time guarantees we get from a programming language, as this would expose us to runtime errors if we made a mistake in the YAML/JSON.

Can we write in another language and then check it at compile time? We can, through the power of macros!

I’ll show how to build useful compile-time checking of arbitrary strings.

Key ingredients will be:

  • macros that produce macros
  • string interpolators, and
  • compile-time evaluation.


All of basic category theory

If you’ve learnt about lenses in functional programming, you may have been puzzled by the claim that

data Lens s a = Lens { get :: s -> a, set :: a -> s -> s }

is in some sense the same as

forall f. Functor f => (a -> f a) -> s -> f s

More to the point: how on Earth did someone ever figure this out in the first place?

Don’t know your Kan extensions from your co-ends, your pushouts from your presheaves? Join me for a scenic adventure tour through the magical land of category theory, stopping off at all the major sights.

We’ll learn the basic notions that form the conceptual backbone of category theory, and how they all fit together. I’ll also explain how the above equivalence of lens representations comes about.
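To preview the punchline, the two lens representations convert back and forth like this (a sketch using Const and Identity from base; the hard part the talk explains is why nothing is lost in translation):

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const    (Const (..))
import Data.Functor.Identity (Identity (..))

data Lens s a = Lens { get :: s -> a, set :: a -> s -> s }

type VanLaarhoven s a = forall f. Functor f => (a -> f a) -> s -> f s

-- One direction: read the focus, then rebuild the whole inside f.
toVL :: Lens s a -> VanLaarhoven s a
toVL (Lens g s) k whole = fmap (\a -> s a whole) (k (g whole))

-- The other: instantiate f at Const to read, and at Identity to write.
fromVL :: VanLaarhoven s a -> Lens s a
fromVL l =
  Lens (\whole   -> getConst (l Const whole))
       (\a whole -> runIdentity (l (const (Identity a)) whole))
```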

Category theory has made deep inroads into computer science theory, but in this talk we’ll be focused on computer science practice. We’ll explore the advantages category theory brings to programming in terms of

  • providing alternate representations for types
  • classifying solutions
  • simply providing a clarifying viewpoint and helping to organise our thinking.

In addition this should provide you with the right framework for further exploring category theory, should you so wish.

Outline/Structure of the Talk

I’ll cover the fundamentals of, and some applications of, the universal constructions of basic category theory:

  • initial objects
  • universal properties (universal arrows, universal elements)
  • representable functors
  • limits
  • adjunctions
  • monads
  • ends
  • Kan extensions

Learning Outcomes

Attendees will come away with a better idea of how the different concepts and tools of category theory fit together and how to use them.

Target Audience

Functional programmers who are curious about category theory, but feel overwhelmed when they turn to textbooks or resources such as the nLab to learn more.

Prerequisites for Attendees

Ideally, participants would already have seen the definition of a category and be familiar with Haskell syntax. I’m also going to assume participants have already been exposed to monads and comonads, and to free constructions (e.g. free monoid, free monad) as I will only touch on these lightly.


Conditional contexts

Wouldn’t it be nice if Scala would let you write a parametric function that could do different things depending on whether or not the type parameter was an instance of some type class?

For instance

  • a Set[A](a: A*) function that creates either a TreeSet or a HashSet, depending on whether an Ordering[A] is in scope.
  • a numerical algorithm doCalc[N : Numeric](n: Matrix[N]) that works for all numeric types N, using a (more expensive, but numerically stable) algorithm for floating-point numbers while skipping it for integers (which don’t need it).

I’ll run through the use and implementation of a Haskell library (ifcxt) which provides this functionality, then show the (surprisingly simple) Scala implementation.

Notably, this has to be an extension to the type system in Haskell, but practically comes for free in Scala.

Accompanying Scala code allows you to play with three different implementations.


Monadic matching mishaps

The following Scala code fails to compile:

  // Search for a file by filename glob
  def find(glob: String): Error \/ File

  // retrieve the first and last lines
  def firstLastLine(file: File): Error \/ (String, String)

  // summarize the first text file
  val summary = for {
    file          <- find("*.txt")
    (first, last) <- firstLastLine(file)
  } yield file + ": " + first.take(20) + "..." + last.takeRight(20)

with the error

could not find implicit value for parameter M: scalaz.Monoid[Error]

but it’s not immediately clear why, or why it does work when we replace \/ with Option.

I’ll discuss why this occurs, a simple fix to make it work, and compare with similar situations in other languages with pattern-matching and monads (Haskell, Idris, Purescript), ending by identifying a Scala feature that perhaps could be improved in future compiler versions.


How do functional programmers do dependency injection?

How should we think about dependency injection?

In the context of imperative languages there are many well-established frameworks for doing dependency injection. Is there a functional programming equivalent?

We often hear that the FP answer is “use a Reader monad”. However, I’ll argue this isn’t a good answer.

A functional programmer would instead turn the question around and ask “What operations would you do if you had these dependencies?”. By reframing the original problem we reveal a more fundamental outlook. Namely, what are the languages of operations these dependencies allow us to express, and what are the interpreters for those languages?

To achieve reuse, both languages and interpreters will need to be modular, i.e. support good notions of composition.

So how do we go about doing this?

I’ll compare and contrast two solutions of this problem. The first, popularised in Wouter Swierstra’s Functional Pearl Datatypes à la Carte (and familiar to some through Rúnar Bjarnason’s talk “Compositional application architecture with reasonably priced free monads”) builds up a data structure of abstract operations, which are then run by an interpreter.
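In miniature, the first approach looks something like this (a Haskell sketch with a hypothetical key-value operation set, omitting the à la carte coproduct machinery that makes the operation set extensible):

```haskell
{-# LANGUAGE DeriveFunctor #-}

import qualified Data.Map as M

-- A hypothetical language of operations.
data KV next
  = Get String (Maybe String -> next)
  | Put String String next
  deriving Functor

-- The free monad over any functor of operations.
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a  >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)

getKey :: String -> Free KV (Maybe String)
getKey k = Free (Get k Pure)

putKey :: String -> String -> Free KV ()
putKey k v = Free (Put k v (Pure ()))

-- An interpreter: tear the data structure down into a pure function.
run :: Free KV a -> M.Map String String -> a
run (Pure a)             _ = a
run (Free (Get k next))  m = run (next (M.lookup k m)) m
run (Free (Put k v next)) m = run next (M.insert k v m)
```

The program is an ordinary data structure of abstract operations; swapping interpreters (pure map, real database, logging wrapper) needs no change to the program itself.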

The second approach — popularised by Oleg Kiselyov as Typed tagless final interpreters — takes advantage of Scala’s support for higher-kinded types. Rather than building up and tearing down a data structure, we simply add a type class constraint to our program. “Interpretation” is now a compile-time operation: it’s just specialisation to a type that satisfies the constraints. This is both simpler and more efficient, and sufficient for almost all purposes.


Stop paying for free monads!

Free monads, as used in the Datatypes à la carte pattern, are a useful way to structure code, separating specification from implementation and enabling modularisation.

But they also come with a runtime performance penalty…

The complementary typed tagless final interpreters approach (popularised in the Haskell and OCaml worlds by Oleg Kiselyov) offers a performance boost over free monads.

Yet, frustratingly, available sample code centres around simple expression languages, and “real world” examples that you might be able to put to use in your day job are hard to come by.

In this talk, we’ll compare and contrast the typed tagless approach with datatypes à la carte and explore how the typed tagless approach can be used in real production code — with configuration, state, side effects and error handling.

Haskell code for the associated LambdaJam workshop.
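In miniature, the tagless-final style looks like this (a Haskell sketch with a hypothetical key-value effect; real production code would layer in configuration and error handling the same way). No operation data structure is ever built: the program is a polymorphic value, and each interpreter is just a type-class instance.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import qualified Data.Map as M
import Control.Monad.Trans.State (State, evalState, gets, modify)

-- The "language" is a type class of operations.
class Monad m => KeyValue m where
  getKey :: String -> m (Maybe String)
  putKey :: String -> String -> m ()

-- The abstract program: constrained, but building no data structure.
program :: KeyValue m => m (Maybe String)
program = do
  putKey "greeting" "hello"
  getKey "greeting"

-- One "interpreter": an instance for pure state over a Map.
instance KeyValue (State (M.Map String String)) where
  getKey k   = gets (M.lookup k)
  putKey k v = modify (M.insert k v)
```

“Running” the program is just choosing the monad: `evalState program M.empty` specialises it to the state interpreter at compile time.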


Macro-based smart constructors

Smart constructors are a nice pattern for building data types that need extra validation or calculation.

But in the case when we’re building our data type from statically known values, they leave us with two annoyances:

  • they’re clunkier
  • we have a run-time test for something that’s known at, and should be checked at, compile time.

We’ll investigate how Scala macros can help us address these problems and thus achieve both additional safety and greater convenience.

Accompanying Scala code demonstrating the techniques in action.


Seven trees in one

It’s a rather surprising fact that there’s an O(1) bijection between the datatype of unlabelled (planar) binary trees

data Tree = Leaf | Node Tree Tree

and 7-tuples of these trees. This presentation explores the fascinating story of how this bijection comes about, and the interesting connections to notions of isomorphism of types and distributive categories.
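A taste of where “seven” comes from: pretend the isomorphism Tree ≅ 1 + Tree² is an equation about a complex number T. Then

```latex
T = 1 + T^2
\;\Longrightarrow\;
T^2 - T + 1 = 0
\;\Longrightarrow\;
T = e^{\pm i\pi/3}
\;\Longrightarrow\;
T^6 = 1
\;\Longrightarrow\;
T^7 = T
```

so T is a primitive sixth root of unity. The surprise is that this heuristic, which uses subtraction and so makes no direct sense for types, can nevertheless be cashed out as an honest bijection.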

Haskell code containing a QuickCheck test of the bijection.


Getting higher: Higher-kinded and higher rank types in Scala

Some of us may have heard the terms “higher-kinded type” or “higher rank type” and wondered what they meant, or what the difference was.

In this talk I’ll attempt to demystify these terms and give a few examples of how they can be useful in practice.
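The distinction fits in a few lines (written here in Haskell, since the two ideas are language-independent): higher-kinded abstracts over a type constructor, higher-rank demands a polymorphic argument.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Higher-kinded: the parameter f is itself a type constructor
-- (it has kind * -> *), so Wrap abstracts over Maybe, [], IO, ...
newtype Wrap f a = Wrap (f a)

-- Higher-rank: the caller must supply a genuinely polymorphic
-- function, usable at both Int and String inside the body.
applyBoth :: (forall x. x -> x) -> (Int, String) -> (Int, String)
applyBoth g (n, s) = (g n, g s)
```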


Much ado about monoids

I’ll introduce the Monoid and Foldable type classes from Scala’s scalaz library and show how the compositionality property they satisfy means they can be used to do basic data analysis in a single pass over the data.
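The single-pass trick, in Haskell notation (scalaz’s Monoid and Foldable are direct analogues): a pair of monoids is a monoid, so one foldMap traversal accumulates several statistics at once.

```haskell
import Data.Monoid (Sum (..))

-- Sum and count accumulated together in a single traversal,
-- by folding into the product monoid (Sum Double, Sum Int).
mean :: [Double] -> Double
mean xs = total / fromIntegral count
  where
    (Sum total, Sum count) = foldMap (\x -> (Sum x, Sum (1 :: Int))) xs
```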


Spot an error or typo? Please let me know.