Hi, Barbie! Higher kinded records made easy(er)

Foreword

In part 2 of "There and Back Again", I demonstrated an interface for easily taking valid Haskell functions and turning them into references and definitions for an RPC system. We then briefly discussed the idea of registering these definitions into a Map so that we could look them up by name and invoke them. However, defining each function individually, picking a name, and registering it into a Map is a lot of boilerplate. In this article, we'll explore a way to automate this process using higher-kinded records and the barbies library. This will give us a way to define a single record type that contains all of our functions, and then automatically register them all at once.

Higher-kinded records

If you've used Haskell for a while, you may have run across the higher-kinded data pattern. The idea, if you haven't encountered it before, is a pretty neat trick, and you've probably situations in your own code where it would have been useful. Let's start with a simple example. Suppose we have a simple contact management CRUD app that we use to manage a list of our friends and their email addresses. We might start with a simple record type to represent a contact:

data Contact = Contact
  { name :: Text
  , email :: Text
  } deriving (Generic)

instance ToJSON Contact
instance FromJSON Contact

We use this type to create and return values of Contact to our web frontend via an API, and all is well. But one sorrowful day, we accidentally mix things up, writing our buddy Bob's name in the email field and leaving the name field blank. We realize that we need to add an endpoint to our API to update a contact's email address. We think about it for a moment, and decide that we'd also like to add validation to our create and update endpoints, so that we can make sure that we can't make the same mistake again. So, we make a new type:

data ContactValidation = ContactValidation 
  { name :: Either ValidationError Text
  , email :: Either ValidationError Text
  } deriving (Generic)
instance ToJSON ContactValidation
instance FromJSON ContactValidation

It might also be cool, we think, to support partial updates, so that instead of updating the entire contact, we can just update the fields that we want to change. So, we make another type:

data ContactUpdate = ContactUpdate
  { name :: Maybe Text
  , email :: Maybe Text
  } deriving (Generic)

Now, we have three different types that are all very similar, but not quite the same. This is... getting a little out of hand. A little bit tedious. We like the idea of Haskell, but this isn't really what we signed up for. How can we make this better?

This is where higher-kinded data comes in. We define our base type in mostly the same way, but we parametrize it such that it takes a "wrapper" type:

data Contact f = Contact
  { name :: f Text
  , email :: f Text
  } deriving (Generic)

Now, we can define our three types in terms of this base type:

type ValidContact = Contact Identity
type ContactValidation = Contact (Validation [ValidationError])
type ContactUpdate = Contact Maybe

Cool! This is a lot cleaner. Very generic. Such parameterization.

Now, at first this seems like a pretty nifty trick, but there a lot of little annoyances that often cause Haskellers to shy away from this pattern.

Let's look at a few of them.

Working with the plain record

You start to write a basic model:

testContact = Contact
  { name = "Bob"
  , email = "[email protected]"
  }

But you quickly realize that this isn't going to work. You can't just write a plain record, because Contact is a higher-kinded type. You need to use some kind of wrapper. Identity is the easy choice– it doesn't do anything, since it's defined as newtype Identity a = Identity a.

testContact = Contact
  { name = Identity "Bob"
  , email = Identity "[email protected]"
  }

Alright, that works, but... boilerplate. You have to wrap every field in Identity just to get it to compile. And then, when you want to actually use the data, you have to unwrap it again. Bah!

Converting between f types

Well, it can't be helped, you think. You decide to write your validation logic. It will turn a ContactValidation into a ValidContact if all of the fields are valid.

validateContact :: ContactValidation -> Validation [ValidationError] ValidContact
validateContact (Contact name email) = Contact <$> name <*> email

So... you have to reconstruct the entire record, field by field? Fine for two fields, but what if you have 10? 20? No thank you.

barbies to the rescue

Surely someone has already solved this problem, you think. You are in luck. barbies provides some higher-level classes for working with higher-kinded data.

{-# LANGUAGE DeriveAnyClass #-}
import Barbies

data Contact f = Contact
  { name :: f Text
  , email :: f Text
  } deriving (Generic, FunctorB, TraversableB, ApplicativeB, ConstraintsB)

Barbies provides a number of classes that look pretty much like the standard Functor, Traversable, and Applicative classes, but with a little bit of a twist– they operate on the shape of the record itself using the provided f type, rather than the type of the functor's contents. This gives us a way to transform / collapse / mash together the structure of the record itself, rather than the contents of the fields themselves. This might initially seem like its own limitation, but it largely works quite well as long as you can embed the information you need to work with into f itself. We'll work backwards through the previous problems and see how barbies helps us solve them.

Costumes, changed quickly

validateContact above is really just using Applicative to reconstruct the record. barbies provides a bsequence function that does exactly this, and a slighly nicer version when turning the base functor into Identity:

validateContact = bsequence'

If we want to do a straight natural transformation, we can use bmap– say for example we wanted to convert a valid Contact back into the validation version for some reason:

toValidation :: ValidContact -> ContactValidation
toValidation = bmap (\(Identity a) -> Valid a)

There a number of other functions in barbies that make working with higher-kinded data easier, so I won't go into all of them here. The documentation is quite thorough, so no reason for me to try to improve on it.

Solving our Identity crisis

If you've worked with HKD before, you may have seen type family tricks to get around the Identity problem we ran into when we first introduced the f parameter into our Contact type. barbies provides a module called Barbies.Bare that remediates this. It provides two things: a type family called Wear that can be used to convert a higher-kinded record into a plain record, and a set of utility functions and type classes to interchange between the a higher-kinded record and its plain version. First up, let's look at the type family that we're provided:

data Bare
data Covered

type family Wear t f a where
  Wear Bare    f a = a
  Wear Covered f a = f a

Type families are essentially functions that operate at the type level. In this case, you can more or less read this as "if t is Bare, then the resulting type is a, otherwise it's f a". In order to use this, we have to tweak our record definition a bit:

data Contact t f = Contact
  { name :: Wear t f Text
  , email :: Wear t f Text
  }
  deriving (Generic)

instance FunctorB (Contact Covered)
instance TraversableB (Contact Covered)
instance ApplicativeB (Contact Covered)
instance ConstraintsB (Contact Covered)
-- This is a new instance that provides conversion to and from the plain record
instance BareB Contact

Now, if we want to work with the plain record, we can pass in Bare as the first type parameter:

testContact :: Contact Bare Identity -- the second parameter doesn't actually matter here, but I like to use Identity for clarity.
testContact = Contact
  { name = "Bob"
  , email = "[email protected]"
  }

Now, to promote into a higher-kinded record, we can use the bcover / bcoverWith functions:

toValidation :: Contact Bare Identity -> Contact Covered (Validation [ValidationError])
toValidation = bcoverWith pure

Tidying up a little more via barbies-th

So we've mostly solved our Identity problem, but we still have a few more annoyances:

  • Barbies uses Generic to derive instances for the higher-kinded record, and Generic-based code is notoriously slow and memory hungry. This is especially problematic when working with large records, as we might be doing in our RPC system.
  • We still have to write a lot of instances out, and I'd rather all of that machinery is handled without me having to think about it.
  • When defining fields in a record, we have to write Wear t f every time, which I find harder to parse at a glance for the main type of the field.

The barbies-th package provides a Template Haskell-based solution to these problems. It provides a few Template Haskell functions that can be used to generate the boilerplate for us. I like the one called passthroughBareB:

passthroughBareB [d|
  data Contact = Contact
    { name :: Text
    , email :: Text
    }
  |]

This simpler definition generates the following code for us:

data ContactB t f = Contact
  { name :: Wear t f Text
  , email :: Wear t f Text
  }

instance BareB ContactB
instance FieldNamesB (ContactB Covered) where
  bfieldNames = ContactB (Const "name") (Const "email")
instance ProductB (ContactB Covered) where
  bprod (ContactB xname xemail) (ContactB yname yemail)
    = ContactB (Pair xname yname) (Pair xemail yemail)
instance FunctorB (ContactB Covered) where ...
instance TraversableB (ContactB Covered) where ...
instance ConstraintsB (ContactB Covered)
instance ProductBC (ContactB Covered)

type Contact = ContactB Bare Identity
type ContactH = ContactB Covered

Nice! This is a lot cleaner, and the instances are more efficient. This gives us everything we need to work with higher-kinded records in a way that is both efficient and not annoying to maintain.

One last thing to note before moving on– it's easy to miss, but there's a new instance in the above called FieldNamesB that provides a way to reflect the field names of a record into a Barbie type. This is useful for our RPC system, as we'll see in a moment.

A quick aside– analogues in other languages

If you've used TypeScript, you may have run across the concept of "mapped types". These essentially allow you to take a type and transform it into a new type by applying a function to each of its fields. This is what we're doing via barbies– retaining the structure of the record, but transforming each of the fields into something new.

So in TypeScript, we might have something like this:

type Contact = {
  name: string
  email: string
}

type ValidatedContact = {
  [K in keyof Contact]: Validation<Contact[K]>
}

type ContactUpdate = {
  [K in keyof Contact]?: Contact[K]
}

The Haskell version is a little more verbose, but the idea is the same. What I think is neat about the Haskell version despite the extra work needed is that we are able to retrofit this functionality in using a library even though the language doesn't have it built in.

Back to building our RPC system!

As I mentioned in the introduction, we want to use this technique to make RPC usage pleasant to work with. Let's start by defining a record type that represents as few basic RPC calls that we want a server to provide and a client able to invoke:

passthroughBareB [d|
  data MyRpcCalls = MyRpcCalls
    { echo :: Text -> IO Text
    , add :: Int -> Int -> IO Int
    }
  |]

rpcDefinitions :: MyRpcCalls
rpcDefinitions = MyRpcCalls
  { echo = \t -> pure t
  , add = \x y -> pure (x + y)
  }

Let's have a quick refresher on the types and functions we're working with from the previous article before moving on.

  1. We have a type called RpcDefinition that represents a single RPC definition. It has a name, and a function wrapper that converts and applies a sequence of untyped payloads to the underlying function that the server provides.
  2. We have a type called RpcRef that is a means for the client to invoke a function on the server. It has a name, a serialization codec, a type-level list of arguments, and a type-level representation of the return type.
  3. We have a type called RegisteredRpcCalls that is a map of RpcDefinitions, indexed by name.

So, first up, let's build our RpcDefinitions. Let's start by iterating over the fields of our MyRpcCalls record, and for each field, create an RpcDefinition.

We'll need to use a few tricks here– first, we'll need to use bfieldNames to get the field names of the record. Then, we'll need to use bzipWith to combine the field names with the functions themselves. Lastly, we have to prove to GHC that for a given codec, that all arguments and return types across each of the record field functions are compatible with that codec. We'll do this using the ConstraintsB class from barbies.

With regards to bfieldNames, I want to take a moment to highlight a trick that factors in heavily with the rest of the implementation: using Const as the functor.

Const is defined as follows:

newtype Const a b = Const a
instance Functor (Const a) where
  fmap _ (Const a) = Const a

The beauty of Const here is that we can ignore the actual value in each field of the record, and instead use a value of the same type in each field. Concretely, let's use bfieldNames as an example. The definition of it for MyRpcCalls would be as follows:

instance FieldNamesB MyRpcCalls where
  bfieldNames = MyRpcCalls 
    { echo = Const "echo" 
    , add = Const "add"
    }
  ...

So now we can materialize field names for each field of the record. What's next? Well, we need to figure out how to turn each field into an RpcDefinition.

The first tool we'll use is bzipWith. This is a function that takes two records of the same base type (with potentially different functors), and combines them into a single record. Waving our hands a little bit, we want to use it to combine the field names with the functions themselves:

implementationsToDefinitions :: forall codec rec. codec -> rec Bare f -> rec Covered (Const RpcDefinition)
implementationsToDefinitions c r = bzipWith 
  (\(Const name) (Identity f) -> Const $ rpcDef name c f)
  rNames
  r'
  where
    -- To use most of the functions in barbies, we have to be operating in a functor context rather than a bare record context.
    -- `bcover` wraps each field in the `Identity` functor for us.
    r' = bcover r
    -- `bfieldNames` gives us a record of the field names of the record.
    rNames :: rec Covered (Const Text)
    rNames = bfieldNames

This is pretty close to what we want, but it also doesn't compile! The compiler needs to that everything is compatible with the codec we're using. So we'll tweak the above to use bzipWithC instead, which provides machinery for proving that the constraints hold:

bzipWith :: ApplicativeB b => (forall a. f a -> g a -> h a) -> b f -> b g -> b h
bzipWithC :: forall c b f g h. (AllB c b, ConstraintsB b, ApplicativeB b) => (forall a. c a => f a -> g a -> h a) -> b f -> b g -> b h

So, we need to provide a single type class that encompasses the current series of constraints introduced in the definition of rpcDef.

To do this, we have to define a new type class that we'll call MkDef that iterates over the type-level structure of the function and proves that each argument and the return type are compatible with the codec. We've already done this in the previous article, but the "innovation" here is that we boil it all into a single type class so that it's easier to use as a constraint.

class MkDef' codec (f :: Type) (original :: Type) where
  mkDef :: Proxy f -> codec -> String -> original -> RpcDefinition

instance 
  ( FunctionSupportsCodec' IO codec original
  ) => MkDef' codec (IO result) original where
  mkDef _ codec name f = rpcDef name codec f

instance MkDef' codec b original => MkDef' codec (a -> b) original where
  mkDef _ codec name f = defFromFunction (Proxy @b) codec name f

-- This is a fun little trick that allows us to eliminate the duplication of `f` in the type signature.
class MkDef' codec f f => MkDef codec f
instance MkDef' codec f f => MkDef codec f

implementationsToDefinitions :: forall codec rec. (AllB (MkDef codec) rec, ConstraintsB rec, ApplicativeB rec) => codec -> rec Bare f -> rec Covered (Const RpcDefinition)
implementationsToDefinitions c r = bzipWithC @(MkDef codec) 
  (\(Const name) (Identity f) -> Const $ mkDef (pure f) name c f)
  rNames
  r'
  where
    -- To use most of the functions in barbies, we have to be operating in a functor context rather than a bare record context.
    -- `bcover` wraps each field in the `Identity` functor for us.
    r' = bcover r
    -- `bfieldNames` gives us a record of the field names of the record.
    rNames :: rec Covered (Const Text)
    rNames = bfieldNames

So now we can use that to generate our RegisteredRpcCalls in one go!

rpcServerHandlers :: (Traversable rec) => rec (Const RpcDefinition) -> RegisteredRpcCalls
rpcServerHandlers r = RegisteredRpcCalls $ bfoldMap (\(Const def@(RpcDefinition n _) -> Map.singleton n def)) r

myRpcServer :: RegisteredRpcCalls
myRpcServer = rpcServerHandlers $ implementationsToDefinitions rpcDefinitions

That's the server-side of things sorted.

Now how do we use this on the client side? Well, you'll need to tune in to the next article to find out. 😊

Further reading on "Higher-kinded data"

These are some of the earliest and best articles on the subject that I know of.

https://reasonablypolymorphic.com/blog/higher-kinded-data/ https://reasonablypolymorphic.com/blog/hkd-not-terrible/index.html