Integrating a large Tokio-based Rust library with Haskell
One of the hallmarks of Rust is its relative ease of interoperability with other languages. Python, Ruby, and Node.js all have fairly robust interoperability bridges with Rust. Haskell, on the other hand, currently has a fairly limited set of options for complex integration. Sure, there are examples of calling Rust from Haskell, but they tend to be small, synchronous libraries.
That level of support is great for projects that are already written in Haskell and just need to leverage some Rust code for narrow performance-sensitive problems like compression, encryption, or parsing. But what if you want to integrate with a large Rust library that uses more advanced features like async/await? That takes a bit more elbow grease. Let's take a look at some of the challenges and solutions for how to integrate a Tokio-based Rust library with Haskell.
In my case, I am working on a Haskell library for the Temporal workflow engine. Temporal is an orchestration engine that allows you to write long-running, stateful, distributed applications. The neat thing is that it is built so that you can use your language of choice to write workflow code, but the guarantees it provides make writing a client per language quite difficult.
In order to alleviate the burden of writing a client per language, Temporal provides an sdk-core project that abstracts away much of the complexity of the Temporal protocol. The sdk-core project is written in Rust and uses Tokio for its async runtime. Clients for each language both call into the Rust library and may be reactivated by the Rust library when a blocking call is resumed.
At a high level, the GHC multithreaded runtime and Tokio are very similar. Both use a thread pool to execute tasks. Both maintain a run queue of tasks that want to be executed, and have a method to suspend and resume tasks as they become blocked and unblocked. In theory, this should allow us to integrate the two runtimes fairly easily. In practice, we need to do a bit of work to make sure that cooperatively suspending and resuming between the two doesn't incur too much overhead.
On the Haskell side, the cheapest mechanism that we have for intentionally suspending a thread is an MVar. It is a synchronizing mutable variable: a thread that tries to take from it blocks while it is empty and is unblocked once it is full. MVars are often used to lock access to a shared resource even within native Haskell code.
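For readers more at home on the Rust side, the blocking behaviour can be approximated with a Mutex and a Condvar. This is only a sketch of the semantics, not how GHC implements MVar: the `MVarLike` name and its methods are made up for illustration, and the sketch parks a whole OS thread where GHC only parks a lightweight Haskell thread.

```rust
use std::sync::{Condvar, Mutex};

/// Hypothetical stand-in for a Haskell `MVar a`: a slot that is either empty or full.
pub struct MVarLike<T> {
    slot: Mutex<Option<T>>,
    changed: Condvar,
}

impl<T> MVarLike<T> {
    /// Analogous to `newEmptyMVar`.
    pub fn new_empty() -> Self {
        MVarLike { slot: Mutex::new(None), changed: Condvar::new() }
    }

    /// Analogous to `putMVar`: waits while the slot is full, then fills it.
    pub fn put(&self, value: T) {
        let mut slot = self.slot.lock().unwrap();
        while slot.is_some() {
            slot = self.changed.wait(slot).unwrap();
        }
        *slot = Some(value);
        self.changed.notify_all();
    }

    /// Analogous to `takeMVar`: waits while the slot is empty, then empties it.
    pub fn take(&self) -> T {
        let mut slot = self.slot.lock().unwrap();
        loop {
            if let Some(value) = slot.take() {
                self.changed.notify_all();
                return value;
            }
            slot = self.changed.wait(slot).unwrap();
        }
    }
}
```

A thread calling `take` on an empty `MVarLike` sleeps until some other thread calls `put`, which is the suspend-and-resume shape the rest of this post relies on.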
Naively, the easiest way to suspend and resume a Haskell thread from a foreign library would be to do something like this:
{-# LANGUAGE NamedFieldPuns #-}
import Control.Concurrent.MVar
import Foreign.Ptr
import Foreign.Storable

-- The shape of the function that the foreign library calls when it is done.
type ResultCallback a = Ptr a -> IO ()

-- "wrapper" imports turn a Haskell closure into a C function pointer.
foreign import ccall "wrapper"
  wrap_resumeHaskellWithResult :: ResultCallback a -> IO (FunPtr (ResultCallback a))

data BlockingThingResultStruct = BlockingThingResultStruct
  { result :: Int
  }

instance Storable BlockingThingResultStruct where
  sizeOf _ = 8
  alignment _ = 8
  peek ptr = do
    result <- peekByteOff ptr 0
    pure BlockingThingResultStruct { result }
  poke ptr BlockingThingResultStruct { result } = do
    pokeByteOff ptr 0 result

foreign import ccall "do_blocking_thing"
  do_blocking_thing :: FunPtr (ResultCallback BlockingThingResultStruct) -> IO ()

doBlockingThing :: IO BlockingThingResultStruct
doBlockingThing = do
  mvar <- newEmptyMVar
  -- One-shot callback: hand the result pointer back through the MVar.
  callback <- wrap_resumeHaskellWithResult $ \result -> do
    putMVar mvar result
  do_blocking_thing callback
  -- Suspend this Haskell thread until the callback fires.
  rawResult <- takeMVar mvar
  freeHaskellFunPtr callback
  peek rawResult
Here, we are allocating an MVar, creating a dynamic callback that will put the result into the MVar, and then passing the callback to the Rust library. The Rust library will call the callback when it has completed a blocked operation, and from there the Haskell code can continue on its merry way. This is a pretty common pattern for Haskell FFI code, and it works.
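The Rust half of this naive approach is not shown here. A minimal sketch of what it might look like follows, assuming the blocking work is handed to a background thread. The body of `do_blocking_thing`, the hard-coded result, and the decision to heap-allocate the struct are placeholders of mine rather than real library code, and a real export would also carry a `no_mangle` attribute.

```rust
use std::thread;

/// Mirrors the Haskell `BlockingThingResultStruct`: one 8-byte field at offset 0.
#[repr(C)]
pub struct BlockingThingResultStruct {
    pub result: i64,
}

/// The C-compatible function pointer that a Haskell "wrapper" import produces.
pub type ResultCallback = extern "C" fn(*const BlockingThingResultStruct);

/// Assumed shape of the exported function: return immediately, then invoke
/// the callback from another thread once the placeholder work has finished.
pub extern "C" fn do_blocking_thing(callback: ResultCallback) {
    thread::spawn(move || {
        // Placeholder for the real blocking operation.
        let value = BlockingThingResultStruct { result: 42 };
        // Heap-allocate so the pointer stays valid after the callback returns;
        // ownership passes to the caller, and a matching free function is omitted.
        let ptr = Box::into_raw(Box::new(value));
        callback(ptr);
    });
}
```

Every call to `callback` here re-enters the Haskell runtime, which is the per-operation cost discussed next.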
However, the bookkeeping that GHC needs to do when wrapping a callback is not free. In fact, it is quite expensive. Calling back into Haskell is pretty pricey, too. Creating a one-shot callback for every blocking operation is perhaps the easiest thing to do, but for frequent blocking operations, it is not the most efficient.
Luckily, GHC has a mechanism that lets us avoid using callbacks entirely. The GHC runtime provides a C function called hs_try_putmvar that allows a foreign library to resolve an MVar by placing () into it. This is a much cheaper operation, as it does not require any callback bookkeeping on the Haskell side. It simply enqueues the thread to be resumed on the next available RTS context switch. If we were to write the function signature in Haskell, it would be roughly MVar () -> IO ().
Okay, that's great, but how do we use it if the MVar doesn't carry any information in it? The key is that we only use the MVar as a synchronizing barrier. We can allocate memory in Haskell that the Rust library can write to, and then read the result out once the MVar is resolved:
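import Control.Concurrent (forkIO, myThreadId, threadCapability)
import Control.Concurrent.MVar (newEmptyMVar, takeMVar)
import Control.Exception (mask_, onException)
import Foreign (Ptr, mallocForeignPtr, peek, touchForeignPtr, withForeignPtr)
import Foreign.StablePtr (StablePtr)
import GHC.Conc (newStablePtrPrimMVar, PrimMVar)

-- `Result` is whatever Storable type the foreign call fills in.
makeExternalCall = mask_ $ do
  mvar <- newEmptyMVar
  sp <- newStablePtrPrimMVar mvar
  fp <- mallocForeignPtr
  withForeignPtr fp $ \presult -> do
    -- threadCapability returns (capability number, is the thread pinned).
    (cap, _) <- threadCapability =<< myThreadId
    scheduleCallback sp cap presult
    -- If we are interrupted, keep the result memory alive until the
    -- foreign side has finished writing to it.
    takeMVar mvar `onException`
      forkIO (do takeMVar mvar; touchForeignPtr fp)
    peek presult

foreign import ccall "scheduleCallback"
  scheduleCallback :: StablePtr PrimMVar
                   -> Int
                   -> Ptr Result
                   -> IO ()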
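The barrier-plus-slot idea can be tried out in plain Rust without GHC in the picture. In the sketch below, a channel of `()` stands in for the `MVar ()` and `send(())` stands in for `hs_try_putmvar`; `schedule_callback`, `make_external_call`, `SlotPtr`, and the value 7 are all hypothetical names and data chosen for the example.

```rust
use std::sync::mpsc;
use std::thread;

/// Wrapper so the address of the caller-owned slot can move to another thread.
struct SlotPtr(*mut i64);
unsafe impl Send for SlotPtr {}

impl SlotPtr {
    /// Safety: the slot must still be alive and not accessed concurrently.
    unsafe fn write(&self, value: i64) {
        *self.0 = value;
    }
}

/// Hypothetical callee: fills the caller's slot first, then sends the unit signal.
pub fn schedule_callback(done: mpsc::Sender<()>, slot: *mut i64) {
    let slot = SlotPtr(slot);
    thread::spawn(move || {
        unsafe { slot.write(7) }; // the payload travels through the slot
        let _ = done.send(());    // the signal itself carries no data
    });
}

/// Hypothetical caller: owns the memory, waits on the barrier, then reads.
pub fn make_external_call() -> i64 {
    let mut result: i64 = 0;
    let (tx, rx) = mpsc::channel();
    schedule_callback(tx, &mut result);
    rx.recv().expect("worker dropped the signal"); // plays the role of takeMVar
    result
}
```

The ordering matters: the write to the slot happens before the signal, so the caller never reads a half-filled result.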
Sweet, now we have a way to call into Rust without using callbacks. How do we wire this up on the Rust side? Let's start by defining the basic constructs that mirror the Haskell side:
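use std::os::raw::c_int;

// Opaque handle: zero-sized, and only ever used behind a pointer.
#[repr(C)]
pub struct MVar {
    _data: [u8; 0],
    _marker:
        core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>,
}

// Transparent wrapper, so it is passed across the FFI exactly like a c_int.
#[repr(transparent)]
pub struct Capability {
    pub cap_num: c_int,
}

#[link(name = "HSrts", kind = "dylib")]
extern "C" {
    pub fn hs_try_putmvar(capability: Capability, mvar: *mut MVar);
}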
Here, we're defining a Rust struct that is an opaque representation of an MVar on the Haskell side. For the purposes of Rust integration, we don't need to know anything about the MVar other than its address. We also define a Capability struct that is a newtype of a c_int. When we make a call into Rust from Haskell that is going to block, we pass in the Capability that the MVar is going to suspend on. A capability in GHC RTS parlance is a thing that holds all the state an OS thread/task needs to run Haskell code. Passing it along gives the GHC runtime a hint about which capability the blocked Haskell thread is running on, so the thread can be resumed there once the MVar is resolved.
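As a quick sanity check on the two layout claims (the handle is opaque, and the capability is just a C int), here is a standalone sketch. The helper functions are hypothetical, and marking the wrapper `repr(transparent)` is an assumption of this sketch about how to express "newtype of a c_int".

```rust
use std::marker::{PhantomData, PhantomPinned};
use std::mem::size_of;
use std::os::raw::c_int;

/// Opaque handle: no fields Rust code can read, so only its address is useful.
#[repr(C)]
pub struct MVar {
    _data: [u8; 0],
    _marker: PhantomData<(*mut u8, PhantomPinned)>,
}

/// Newtype over `c_int` with the same size and ABI as the integer it wraps.
#[repr(transparent)]
pub struct Capability {
    pub cap_num: c_int,
}

/// Size of the opaque handle type in bytes.
pub fn opaque_mvar_size() -> usize {
    size_of::<MVar>()
}

/// True when the wrapper occupies exactly as much space as a C int.
pub fn capability_matches_c_int() -> bool {
    size_of::<Capability>() == size_of::<c_int>()
}
```

The raw pointer inside the `PhantomData` marker also keeps the handle from being `Send` or `Sync` by accident, which suits a value owned by another runtime.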
Now that we have the basic constructs, let's think about how to map a Rust Future to Haskell-side results. A Future is a type that represents a value that will be available at some point in the future. It is a bit like a lazy IO action in Haskell. It has the ability to resolve with a value or an error. Conceptually, as long as we know how to convert the value and error types to and from pointers, we can turn a future Result<A, E> into a Haskell Either E A. To make it easier to carry around the MVar and the result/error pointers, let's make a happy little struct:
pub struct HsCallback<A, E> {
    pub cap: Capability,
    pub mvar: *mut MVar,
    pub result_slot: *mut *mut A,
    pub error_slot: *mut *mut E,
}
Now, we have a small issue to resolve: some of the types we want to pass into Haskell are going to be data that we want to turn into a Haskell equivalent, such as numbers, strings, boolean values, etc., but other types are going to be opaque pointers / handles to Rust data structures that we use to operate the sdk-core library via FFI calls. We need a way to convert both of these kinds of types into a pointer that we can pass into Haskell, but they have fairly different memory management needs.
Sonos has a really handy Rust library called ffi_convert that provides a number of utilities that make it easy to convert Rust types to and from C-compatible structures. Here, we leverage the RawPointerConverter trait to signify that as long as we know how to put a type into a pointer, we can pass it into Haskell:
use ffi_convert::*;
use std::future::Future;

impl<A, E> HsCallback<A, E> {
    // Fill the result slot, clear the error slot, then wake the Haskell thread.
    pub fn put_success(self, result: A)
    where
        A: RawPointerConverter<A>,
    {
        unsafe {
            *self.result_slot = result.into_raw_pointer_mut();
            *self.error_slot = std::ptr::null_mut();
            hs_try_putmvar(self.cap, self.mvar);
        }
    }

    // Fill the error slot, clear the result slot, then wake the Haskell thread.
    pub fn put_failure(self, error: E)
    where
        E: RawPointerConverter<E>,
    {
        unsafe {
            *self.error_slot = error.into_raw_pointer_mut();
            *self.result_slot = std::ptr::null_mut();
            hs_try_putmvar(self.cap, self.mvar);
        }
    }

    pub fn put_result(self, result: Result<A, E>)
    where
        A: RawPointerConverter<A>,
        E: RawPointerConverter<E>,
    {
        match result {
            Ok(result) => self.put_success(result),
            Err(error) => self.put_failure(error),
        }
    }
}

impl Runtime {
    pub fn future_result_into_hs<F, T, E>(&self, callback: HsCallback<T, E>, fut: F)
    where
        F: Future<Output = Result<T, E>> + Send + 'static,
        T: RawPointerConverter<T>,
        E: RawPointerConverter<E>,
    {
        // Enter the Tokio runtime context and drive the future to completion.
        let handle = self.core.tokio_handle();
        let _guard = handle.enter();
        let result = handle.block_on(fut);
        // Write the outcome into the slots and resolve the MVar.
        callback.put_result(result);
    }
}
That's us sorted on the Rust side. When our Rust library wants to call into Haskell, it can use the future_result_into_hs method on the Runtime struct to convert the outcome of a future into a success or failure pointer, and then resolve the MVar once we have some data to work with.
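The two-slot convention can be exercised in isolation. The sketch below is a cut-down, hypothetical `Slots` type rather than the real `HsCallback`: it drops the capability and MVar fields, uses `Box::into_raw` where the real code uses `into_raw_pointer_mut`, and leaves the `hs_try_putmvar` call as a comment so that it runs without the GHC runtime.

```rust
use std::ptr;

/// Hypothetical stand-in for `HsCallback`, minus the capability and MVar.
pub struct Slots<A, E> {
    pub result_slot: *mut *mut A,
    pub error_slot: *mut *mut E,
}

impl<A, E> Slots<A, E> {
    /// Leaves exactly one slot non-null, like `put_success` / `put_failure`.
    ///
    /// Safety: both slots must point to writable pointer-sized cells.
    pub unsafe fn put_result(self, result: Result<A, E>) {
        match result {
            Ok(value) => {
                *self.result_slot = Box::into_raw(Box::new(value));
                *self.error_slot = ptr::null_mut();
            }
            Err(error) => {
                *self.error_slot = Box::into_raw(Box::new(error));
                *self.result_slot = ptr::null_mut();
            }
        }
        // The real code would call `hs_try_putmvar` at this point.
    }
}

/// Runs `put_result` against two local cells and reports which one was filled.
pub fn demo(outcome: Result<u32, String>) -> (Option<u32>, Option<String>) {
    let mut ok: *mut u32 = ptr::null_mut();
    let mut err: *mut String = ptr::null_mut();
    unsafe {
        Slots { result_slot: &mut ok, error_slot: &mut err }.put_result(outcome);
        // Take ownership back so nothing leaks, as the reader of the slots must.
        let ok = if ok.is_null() { None } else { Some(*Box::from_raw(ok)) };
        let err = if err.is_null() { None } else { Some(*Box::from_raw(err)) };
        (ok, err)
    }
}
```

Whoever reads the slots only has to test each pointer for null to decide between the error and success cases.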
Now, let's take a look at how we can use this on the Haskell side. Similarly to RawPointerConverter on the Rust side, we need to introduce a notion of a mapping between the raw data that we get from Rust and the Haskell types that we want to use.
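{-# LANGUAGE FlexibleInstances, TypeFamilies #-}
import Control.Exception (mask_)
import Data.ByteString (ByteString)
import qualified Data.ByteString as ByteString
import Data.Kind (Type)
import Data.Word (Word8)
import Foreign (Ptr, castPtr, peek)
import Foreign.C.Types (CSize)

class ManagedRustValue r where
  type RustRef r :: Type
  type HaskellRep r :: Type
  fromRust :: proxy r -> RustRef r -> IO (HaskellRep r)

-- A Storable instance for CArray is assumed to exist elsewhere.
data CArray a = CArray
  { cArrayPtr :: Ptr a
  , cArrayLen :: CSize
  }

-- Assumed import of the Rust function that frees the array.
foreign import ccall "rust_drop_byte_array"
  rust_drop_byte_array :: Ptr (CArray Word8) -> IO ()

instance ManagedRustValue (CArray Word8) where
  type RustRef (CArray Word8) = Ptr (CArray Word8)
  type HaskellRep (CArray Word8) = ByteString
  fromRust _ rustPtr = mask_ $ do
    (CArray bytes len) <- peek rustPtr
    -- Copy the bytes into Haskell-managed memory, then free the Rust side.
    bs <- ByteString.packCStringLen (castPtr bytes, fromIntegral len)
    rust_drop_byte_array rustPtr
    pure bs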
Why do we need this? In the case of plain data like strings and byte arrays, we want to copy the data out of the Rust memory and into memory that is managed by the Haskell runtime. After that, we go ahead and free the Rust memory in the fromRust function so that we don't leak memory.
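On the Rust side, the copy-then-free contract for byte arrays could be supported by something like the following. This is a sketch under assumptions: the field layout matches the Haskell `CArray` record, `byte_array_into_raw` is a hypothetical helper, and the body of `rust_drop_byte_array` is a guess at what such a function needs to do, not the SDK's actual code.

```rust
/// Mirrors the Haskell `CArray Word8` record: a pointer followed by a length.
#[repr(C)]
pub struct CArray {
    pub ptr: *mut u8,
    pub len: usize,
}

/// Hypothetical producer: moves a byte vector onto the heap behind a `CArray`.
pub fn byte_array_into_raw(bytes: Vec<u8>) -> *mut CArray {
    let boxed: Box<[u8]> = bytes.into_boxed_slice();
    let len = boxed.len();
    let ptr = Box::into_raw(boxed) as *mut u8;
    Box::into_raw(Box::new(CArray { ptr, len }))
}

/// Assumed shape of `rust_drop_byte_array`: frees the bytes and the header.
///
/// Safety: `array` must come from `byte_array_into_raw` and be freed only once.
pub unsafe extern "C" fn rust_drop_byte_array(array: *mut CArray) {
    if array.is_null() {
        return;
    }
    let header = Box::from_raw(array);
    let bytes = std::ptr::slice_from_raw_parts_mut(header.ptr, header.len);
    drop(Box::from_raw(bytes));
}
```

Because the caller copies the bytes before calling the drop function, there is never a moment where both runtimes believe they own the same buffer.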
In the case of handle-esque types, we need to keep the value on the Rust side alive to do anything meaningful. For these types, we instead want the ability to manage the result using a ForeignPtr. ForeignPtr allows us to associate finalizers with a pointer, so that when the pointer is garbage collected in Haskell, we can call the Drop implementation on the Rust side. This requires us to write a Rust FFI wrapper for each type that we want to use in Haskell that we then have to import on the Haskell side, but it greatly simplifies the memory management story.
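A per-type FFI wrapper of this kind usually boils down to a constructor that leaks a Box and a free function that reclaims it. The sketch below uses a hypothetical `Client` type and hypothetical `client_new` / `client_free` names; the `DROPPED` counter exists only to make the effect of the finalizer observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts `Drop` calls so the finalizer's effect can be observed.
pub static DROPPED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical opaque handle; the other side only ever sees `*mut Client`.
pub struct Client {
    pub name: String,
}

impl Drop for Client {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hands ownership of a new handle to the caller as a raw pointer.
pub extern "C" fn client_new() -> *mut Client {
    Box::into_raw(Box::new(Client { name: "example".to_string() }))
}

/// The function a ForeignPtr finalizer would point at: runs `Drop` exactly once.
///
/// Safety: `client` must come from `client_new` and must not be used afterwards.
pub unsafe extern "C" fn client_free(client: *mut Client) {
    if !client.is_null() {
        drop(Box::from_raw(client));
    }
}
```

On the Haskell side, the address of a function like `client_free` would be imported as a `FunPtr` and attached to the handle when the ForeignPtr is created.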
This brings us to the final piece of the puzzle, which is initiating a call from Haskell that uses Tokio:
newtype TokioResult a = TokioResult (Ptr (RustRef a))

withTokioResult :: (RustRef a ~ Ptr a) => (TokioResult a -> IO b) -> IO b
withTokioResult f = alloca $ \ptr -> do
  poke ptr nullPtr
  f (TokioResult ptr)

peekTokioResult :: (ManagedRustValue a, RustRef a ~ Ptr a) => TokioResult a -> (RustRef a -> IO b) -> IO (Maybe b)
peekTokioResult (TokioResult ptr) f = do
  inner <- peek ptr
  if inner == nullPtr
    then return Nothing
    else Just <$> f inner

type TokioCall e a = StablePtr PrimMVar -> Int {- the capability -} -> TokioResult e -> TokioResult a -> IO ()

-- | Dropping can't be done automatically if the result is returned without async exceptions
-- intervening, because we don't want to drop things like `Client` while they're still in use.
-- So we should return ForeignPtrs for things that need to stay alive, and then we can drop when we're done.
makeTokioAsyncCall :: (ManagedRustValue e, RustRef e ~ Ptr e, ManagedRustValue a, RustRef a ~ Ptr a)
  => TokioCall e a
  -> (RustRef e -> IO f)
  -> (RustRef a -> IO b)
  -> IO (Either f b)
makeTokioAsyncCall call readErr readSuccess = mask_ $ do
  mvar <- newEmptyMVar
  sp <- newStablePtrPrimMVar mvar
  withTokioResult $ \errorSlot -> withTokioResult $ \resultSlot -> do
    let peekEither = do
          e <- peekTokioResult errorSlot readErr
          case e of
            Nothing -> do
              r <- peekTokioResult resultSlot readSuccess
              case r of
                Nothing -> error "Both error and result are null"
                Just r -> return (Right r)
            Just e -> return (Left e)
    (cap, _) <- threadCapability =<< myThreadId
    call sp cap errorSlot resultSlot
    () <- takeMVar mvar `onException`
      forkIO (takeMVar mvar >> void peekEither)
    peekEither
There we have it! We can now call into Rust from Haskell, free memory at the appropriate times, and cooperatively interoperate between the GHC and Tokio runtimes. Hopefully the ideas here provide a useful starting point for your own Rust-Haskell interop adventures. If you want to see a more complete example, check out the Temporal Haskell SDK. It also includes a custom Setup.hs configuration that uses cargo to build the Rust library and link against it, as well as a basic Nix configuration.