This blog post introduces the Haskey project, an ACID-compliant embedded key-value store written entirely in Haskell and based on an MVCC B+-tree implementation. It was developed as part of the Summer of Haskell 2017. We will take a look at the two libraries we wrote this summer.
Before introducing the libraries, I must say I’m extremely grateful that I got the opportunity to be part of this year’s Summer of Haskell. I had really great fun working on this project for the last two months, even though they were incredibly challenging for me, as I had very little experience with the internals of database systems. This is entirely thanks to Steven Keuchel, who not only helped and guided me during the challenging moments, but also made sure we were able to bond outside of the project. Furthermore, I’d like to thank the people who organized SoH 2017, not only for the opportunities they have created for me, but also for all the work they have done. I especially want to thank Jasper, who helped me a lot with the necessary paperwork.
Disclaimer: Haskey is not production ready yet. We are still actively making changes to the public API and the internals, as well as the binary format.
EDIT (9-Apr-2018): Haskey is now considered stable.
EDIT (9-Apr-2018): Note that this document gives a good overview on how the library is structured, but might use outdated APIs. See this up to date tutorial, which covers a lot of the content in this article, to get started with Haskey.
Haskey’s goal and design
The goal of Haskey is to deliver a space-efficient, fast and scalable key-value store to the Haskell community, one that can be embedded into Haskell projects. To achieve this, we modeled Haskey after LMDB. Our design is based on an MVCC B+-tree, which allows us to provide non-blocking concurrent read access and serialized write access, such that readers and writers do not block each other.
Over the past two months we have implemented this design, which resulted in two libraries, both available on Hackage. The first is haskey-btree, which implements both a copy-in-place and a copy-on-write B+-tree. The copy-on-write variant abstracts over an allocator, which reads and writes B+-tree nodes on demand. The second package is called haskey. This package implements such an allocator: it allocates and manages the pages the nodes are written to, and it provides transaction support, concurrent access and data integrity. In short, haskey-btree provides the operations that modify and query the tree, while haskey provides the machinery that makes those operations possible on disk.
The B+-tree implementation
The haskey-btree package includes two separate implementations of B+-trees. The first one can be found in the Data.BTree.Pure module, which contains a pure, standard, textbook implementation of an in-memory B+-tree of a certain minimum degree. It is a standalone implementation, not used anywhere else in the code base, but still a nice thing to have. More interesting, however, is the copy-on-write B+-tree implementation in Data.BTree.Impure.
If we look at some functions in Data.BTree.Impure that are used to modify and query the tree, we notice the following type signatures:
lookupTree :: (AllocReaderM m, Key key, Value val)
           => key
           -> Tree key val
           -> m (Maybe val)

insertTree :: (AllocM m, Key key, Value val)
           => key
           -> val
           -> Tree key val
           -> m (Tree key val)

deleteTree :: (AllocM m, Key key, Value val)
           => key
           -> Tree key val
           -> m (Tree key val)
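Because these functions are polymorphic in the allocator monad, tree logic can be written once and reused with any allocator that provides it. As a minimal sketch, here is how they compose: insertIfAbsent is a hypothetical helper, not part of haskey-btree, and it assumes the functions above plus the Key and Value classes are in scope.

-- A sketch: only insert the value when the key is not present yet. The
-- lookup needs AllocReaderM, which is a superclass of AllocM (defined just
-- below), so the whole helper runs in any AllocM monad.
insertIfAbsent :: (AllocM m, Key key, Value val)
               => key -> val -> Tree key val -> m (Tree key val)
insertIfAbsent k v tree = do
    existing <- lookupTree k tree
    case existing of
        Just _  -> return tree           -- key already present: keep the current root
        Nothing -> insertTree k v tree   -- key absent: insert, yielding a new root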
We can see that the functions operate in either the AllocReaderM or the AllocM monad. Let’s take a closer look at the definitions of these type classes:
class Monad m => AllocReaderM m where
    readNode :: (Key key, Value val)
             => Height height
             -> NodeId height key val
             -> m (Node height key val)

    readOverflow :: Value val => OverflowId -> m val

    -- Not all functions included...

class AllocReaderM m => AllocM m where
    allocNode :: (Key key, Value val)
              => Height height
              -> Node height key val
              -> m (NodeId height key val)

    freeNode :: Height height -> NodeId height key val -> m ()

    allocOverflow :: Value val => val -> m OverflowId

    freeOverflow :: OverflowId -> m ()
We can see that the AllocReaderM type class supports operations to read B+-tree nodes and overflow values with a certain ID, while the AllocM type class supports operations to write and free B+-tree nodes and overflow values. How do we get such an allocator? Well, the Data.BTree.Alloc.Debug module contains an in-memory allocator based on Data.Map. It is very inefficient, and solely intended for testing and debugging purposes, but it provides a good example of how to start implementing an allocator from scratch.
import Data.BTree.Alloc.Debug (Pages, runDebugT, emptyPages)
import Data.BTree.Impure (Tree, empty, insertTree)

debugAllocatorExample :: (Tree ByteString ByteString, Pages)
debugAllocatorExample = runDebugT emptyPages $ do
    return empty
        >>= insertTree "key1" "wrong value"
        >>= insertTree "key1" "correct value"
        >>= insertTree "key2" "other value"
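To check what ended up in the tree, we can feed the resulting pages back into the debug allocator and run a query. This is only a sketch: debugLookupExample is a hypothetical name, and it assumes lookupTree from Data.BTree.Impure is imported and the OverloadedStrings extension is enabled for the ByteString literal.

-- A sketch: reuse the pages produced by debugAllocatorExample as the
-- allocator state and look up "key1" in the resulting tree. The second
-- insert above overwrote the first, so this evaluates to Just "correct value".
debugLookupExample :: Maybe ByteString
debugLookupExample = fst $ runDebugT pages (lookupTree "key1" tree)
  where
    (tree, pages) = debugAllocatorExample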
The code listings above show the debug allocator in action. For a more robust, space-efficient and thread-safe allocator, we have to turn to the haskey library.
Robust, thread-safe allocator
The haskey library builds upon the haskey-btree library by providing a robust, thread-safe page allocator with transaction support. It allows for multiple concurrent readers which do not block each other, and serialized write access. Furthermore, readers do not block writers, and vice versa.
The actual allocator is implemented in an internal module called Database.Haskey.Alloc.Concurrent.Monad, but you don’t really need to know about it. The exported functions in Database.Haskey.Alloc.Concurrent use this monad internally, and they are sufficient to use the allocator. The following functions are particularly interesting:
-- Simplified type signatures...
createConcurrentDb :: (ConcurrentMetaStoreM m, Key key, Value val)
                   => ConcurrentHandles
                   -> m (ConcurrentDb key val)

openConcurrentDb :: (ConcurrentMetaStoreM m, Key key, Value val)
                 => ConcurrentHandles
                 -> m (Maybe (ConcurrentDb key val))

transact :: (ConcurrentMetaStoreM m, Key key, Value val)
         => (forall n. AllocM n => Tree key val -> n (Transaction key val a))
         -> ConcurrentDb key val
         -> m a

transactReadOnly :: (ConcurrentMetaStoreM m, Key key, Value val)
                 => (forall n. AllocReaderM n => Tree key val -> n a)
                 -> ConcurrentDb key val
                 -> m a
The openConcurrentDb and createConcurrentDb functions are used to open existing databases and to create new ones, while the transact and transactReadOnly functions are used to start read-write and read-only transactions, respectively.
We can see that the transact and transactReadOnly functions take a function that receives a tree root and can perform AllocM and AllocReaderM actions. Exactly what we need to run the tree modifications and queries from Data.BTree.Impure! Excellent!
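To make that concrete, here is a sketch of a transaction body that could be passed to transact. The renameKey name is hypothetical; it assumes the Data.BTree.Impure functions from earlier are in scope, together with Transaction and the commit_ helper that the full example at the end of this post imports from Database.Haskey.Alloc.Concurrent.

-- A sketch of a read-write transaction body: it only uses the AllocM
-- interface, so transact can run it against whatever tree root the current
-- transaction hands us. commit_ wraps the final root into a Transaction.
renameKey :: (AllocM n, Key key, Value val)
          => key -> key -> Tree key val -> n (Transaction key val ())
renameKey old new tree = do
    mVal <- lookupTree old tree
    case mVal of
        Nothing -> commit_ tree  -- nothing to rename: keep the current root
        Just v  -> deleteTree old tree >>= insertTree new v >>= commit_

Calling transact (renameKey oldKey newKey) db would then perform the rename atomically in a single read-write transaction.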
But there is one more catch: we need to run these functions in a ConcurrentMetaStoreM! Where can we find such a monad?
The storage back-end
Let’s take a closer look at the ConcurrentMetaStoreM type class. The functions in the type class aren’t all that interesting, but the superclass is! It’s the StoreM class:
class StoreM FilePath m => ConcurrentMetaStoreM m where
    -- Omitted type class body...

-- Simplified type signatures, not all functions included...
class Monad m => StoreM hnd m | m -> hnd where
    putNodePage :: (Key key, Value val)
                => hnd
                -> Height height
                -> NodeId height key val
                -> Node height key val
                -> m ()

    getNodePage :: (Key key, Value val)
                => hnd
                -> Height height
                -> NodeId height key val
                -> m (Node height key val)

    putOverflow :: Value val => hnd -> val -> m ()

    getOverflow :: Value val => hnd -> m val
We can see that the StoreM type class simply abstracts over a storage back-end that can read and write pages and overflow values. There are two built-in storage back-ends. One is the InMemory storage back-end, which we’ll let you discover by yourself. The other one is the on-disk store in Database.Haskey.Store.File. Let’s look at the interesting functions.
newFileStore :: IO (Files fp)

runFileStoreT :: FileStoreT fp m a  -- ^ Database actions
              -> FileStoreConfig    -- ^ Database configuration
              -> Files fp           -- ^ Open database files
              -> m a
The newFileStore function creates a new, uninitialized state for the file storage back-end. This is a simple IORef that can safely be accessed by concurrent threads, and those threads can safely call runFileStoreT with that state.
The runFileStoreT function takes a FileStoreT action. Lucky for us, there are AllocM and ConcurrentMetaStoreM instances for FileStoreT! This allows us to run our beloved transactions.
Full code example
We now have all the pieces to stitch together an example. Let the code do the talking! Example code is also available on GitHub.¹
module Main where

import Control.Concurrent.Async (async, wait)
import Control.Monad (void, replicateM)

import Data.BTree.Impure (toList, insertTree)
import Data.ByteString (ByteString)
import Data.Int (Int32)
import Data.Text.Encoding (encodeUtf8)
import qualified Data.Text as Text

import Database.Haskey.Alloc.Concurrent (ConcurrentDb,
                                         ConcurrentHandles,
                                         concurrentHandles,
                                         openConcurrentDb,
                                         createConcurrentDb,
                                         transact_,
                                         transactReadOnly,
                                         commit_)
import Database.Haskey.Store.File (FileStoreT, Files, newFileStore,
                                   runFileStoreT, defFileStoreConfig)

main :: IO ()
main = do
    store   <- newFileStore
    db      <- openOrCreate store
    writers <- mapM (async . writer store db) [1..100]
    readers <- replicateM 100 $ async (reader store db)
    mapM_ wait writers
    mapM_ wait readers
    putStrLn "Done"

writer :: Files FilePath
       -> ConcurrentDb Int32 ByteString
       -> Int32
       -> IO ()
writer store db i =
    runDatabase store $ transact_ tx db
  where
    bs = encodeUtf8 $ Text.pack (show i)

    tx tree = insertTree i bs tree >>= commit_

reader :: Files FilePath
       -> ConcurrentDb Int32 ByteString
       -> IO ()
reader files db = void $ replicateM 100 $ runDatabase files $
    transactReadOnly toList db

openOrCreate :: Files FilePath
             -> IO (ConcurrentDb Int32 ByteString)
openOrCreate store = runDatabase store $ do
    maybeDb <- openConcurrentDb handles
    case maybeDb of
        Nothing -> createConcurrentDb handles
        Just db -> return db

runDatabase :: Files FilePath
            -> FileStoreT FilePath m a
            -> m a
runDatabase files action = runFileStoreT action defFileStoreConfig files

handles :: ConcurrentHandles
handles = concurrentHandles "example-database.haskey"
¹ Note that at the moment concurrent readers are disabled because of a race condition in the code. However, the code example will still work.