There and Back Again: Turning Haskell functions into RPC calls- Part 2

This post follows from There and Back Again: Turning Haskell functions into RPC calls- Part 1.

In the last post, we saw how to gather function arguments, serialize them, and use a remote reference to a Haskell function to send the data somewhere.

Let's see how to handle the server side. From the last article, we set out to call the following function from another process:

echo :: String -> IO String
echo = pure

For a remote reference to the function, we came up with the following interface:

data RpcRef (args :: [Type]) result = forall codec.
  ( Codec codec result
  , GatherArgs codec args
  ) => RpcRef
    { rpcRefName :: Text
    , rpcRefCodec :: codec
    }

rpcRef :: Text -> codec -> f -> RpcRef (ArgsOf f) (ResultOf IO f)
rpcRef name fmt = RpcRef name fmt

runActivity :: RpcRef args result -> (args :->: IO result)
runActivity = ...

echoRef :: RpcRef '[String] String
echoRef = rpcRef "echo" stringCodec echo

We know two things up-front about the design of the server-side handling mechanism:

  1. The function needs to have the same name and codec as the client-side caller.
  2. When we register multiple callable functions, the underlying argument and result types can be different.

Therefore, we need to encapsulate any differing type information into something opaque, but callable given a series of payloads.

data RpcDefinition = RpcDefinition
  { rpcDefName :: Text
  , rpcDefHandler :: Vector RawPayload -> Either String (IO RawPayload)
  }

The handler here returns an either outside of performing IO, because we can't actually call the underlying function in the case that the arguments are invalid, there aren't enough of them, etc. We'll see how to handle this later.

Let's figure out how to apply a series of payloads to a function. This is roughly the opposite problem to the GatherArgs typeclass we defined in the last post.

class ApplyPayloads codec (args :: [Type]) where
  applyPayloads 
    :: codec -- ^ We need to know how to decode the result
    -> Proxy args -- ^ We need to know the argument types
    -> Proxy result -- ^ We need to witness the result type here to avoid impredicativity issues when using :->:
    -> (args :->: result) -- ^ This is the actual RPC function implementation
    -> V.Vector RawPayload -- ^ The payloads to apply as arguments
    -> Either String result -- ^ The result of applying the payloads, or an error message

The base case is pretty simple: if there are no arguments we have the result.

instance ApplyPayloads codec '[] where
  applyPayloads _ _ _ f _ = Right f

The recursive case is pretty simple too. We decode the next payload, apply it to the function, and recurse.

instance (Codec codec ty, ApplyPayloads codec tys) => ApplyPayloads codec (ty ': tys) where
  applyPayloads codec _ resP f vec = case V.uncons vec of
    Nothing -> Left "Not enough arguments"
    Just (pl, rest) -> case decode codec pl of
      Right arg -> applyPayloads codec (Proxy @tys) resP (f arg) rest
      Left err -> Left err

This gives us what we need to implement a smart constructor for RpcDefinition:

rpcDef :: forall codec args result f.
  ( ApplyPayloads args (IO result)
  , Codec codec result
  , args ~ ArgsOf f
  , result ~ ResultOf IO f
  , f ~ (args :->: IO result) 
  -- ^ this one is a bit silly, but it helps GHC understand that f and the 
  -- reconstructed version of f using (:->:) are the same thing.
  ) => Text -> Codec -> f -> RpcDefinition
rpcDef name codec f = RpcDefinition name $ \vec -> do
  f' <- applyPayloads codec (Proxy @args) (Proxy @(IO result)) f vec
  fmap (encode codec) <$> f'

Now that we have a nicely packaged up RpcDefinition, invoking it is pretty simple. We keep a map of RpcDefinitions, and when we get a request, we look up the definition by name and apply the payloads.


data RegisteredRpcCalls = RegisteredRpcCalls
  { rpcCalls :: Map Text RpcDefinition
  }

handleRpcCall :: RegisteredRpcCalls -> Text -> Vector RawPayload -> Either String (IO RawPayload)
handleRpcCall (RegisteredRpcCalls calls) name payloads = case M.lookup name calls of
  Nothing -> Left $ "No RPC call registered with name " <> name
  Just def -> rpcDefHandler def payloads

The transport layer for this is really up to you. You could use a TCP socket, a web server, or even a message queue. The fun bit for this post is focused on serialization.

One last thing we can do to tidy this up and improve safety a bit is to provide a function that defines the reference and the definition at the same time. This helps us avoid typos in the name, and ensures that the codec is the same on both sides.

mkRpcCall ::  forall codec args result f.
  ( ApplyPayloads args (IO result)
  , GatherArgs codec args
  , Codec codec result
  , args ~ ArgsOf f
  , result ~ ResultOf IO f
  , f ~ (args :->: IO result) 
  -- ^ this one is a bit silly, but it helps GHC understand that f and the 
  -- reconstructed version of f using (:->:) are the same thing.
  ) => Text -> codec -> f -> (RpcRef args result, RpcDefinition)
mkRpcCall name codec f = (ref, def)
  where
    ref = rpcRef name codec f
    def = rpcDef name codec f

Now we can define our echo function and its reference in one go:

(echoRef, echoDef) = mkRpcCall "echo" stringCodec echo

Hopefully this gives you a sense of how to build a simple RPC system in Haskell. There are a few things we haven't covered here– like how to handle errors, how to clean up some of the noisy constraint sections of function definitions, and so forth. To see an actual implementation of the ideas here and how these problems are handled, take a look at the source code of hs-temporal-sdk.

Tune in next time for a discussion of how to integrate Haskell with large, async-heavy Rust libraries!