Software Transactional Memory
Matthew Wraith
@wraith_m
PeerTrader
peertrader.com
@Peer_Trader
TVars, TChans, and more
Chicago Haskell
chicagohaskell.com
@ChicagoHaskell
#chicagohaskell @ Freenode
Please ask questions!
Lots of inspiration from
Simon Peyton Jones' talks on STM
Poll
- Do you know Haskell?
- Software Transactional Memory (STM)?
- Heard of it?
- Written code with it?
- Both?
Outline
- Parallelism vs concurrency
- How STM fits
- IORefs (little detour)
- TVars
- TChans
- Publish-Subscribe with STM
- Summary
Parallelism
Concurrency
- The problem is naturally expressed as computations happening at the same time
- Threads, lots of IO
- Non-deterministic
- Many techniques:
- STM
- MVars
- Cloud Haskell
- Speed up sequential computations
- Think GPUs, big matrix multiplication, multicore
- Deterministic
- Many techniques:
- Par
- Accelerate
- Repa
Haskell makes a distinction between
Software Transactional Memory
- Shared memory across multiple threads
- Works with or without multi-core
- Was invented in Java land
- Really clever in Haskell because types
- Typical solution for similar problems (Java/C++):
- Thread pools
- Event handlers
- Locks and condition variables
- Usually, an absurdly hard problem:
- Race conditions
- Deadlocks
- Lost wake-ups
- Error handling
IO and IORefs
- Millions of threads all doing IO
- IO is explicit in the type system
putStrLn :: String -> IO ()
IO Refs
newIORef :: a -> IO (IORef a)
readIORef :: IORef a -> IO a
writeIORef :: IORef a -> a -> IO ()
- Like a pointer in C
- Little bit clumsier
main :: IO ()
main = do
counterRef <- newIORef 0
incRef counterRef
counter <- readIORef counterRef
print counter
incRef :: IORef Int -> IO ()
incRef r = do
x <- readIORef r
writeIORef r (x + 1)
Can't do (r + 1) directly
(+) :: Num a => a -> a -> a
r :: IORef Int
Concurrency with IORefs
- How do threads coordinate?
- IORef is really tricky
- Locks, condition variables
- Same problems
- Races, deadlock, lost wake-up
- Errors?
- Need an abstraction!
forkIO :: IO () -> IO ThreadId
main :: IO ()
main = do
counterRef <- newIORef 0
forkIO (incRef counterRef)
incRef counterRef
counter <- readIORef counterRef
print counter
Race!
main :: IO ()
main = do
counterRef <- newIORef 0
forkIO (atomically (incRef counterRef))
atomically (incRef counterRef)
counter <- atomically (readIORef counterRef)
print counter
atomically :: IO a -> IO a
- All or nothing commit
- Basically, write sequential code, wrap atomically around it
- Errors are simple again
- What stops a programmer from using incRef outside of an atomic block?
Atomic Transactions in Haskell
Problem!
Transactions must be reversible!
Typical solution: Social contract
-- Aww yiss
newTVar :: a -> STM (TVar a)
readTVar :: TVar a -> STM a
writeTVar :: TVar a -> a -> STM ()
atomically :: STM a -> IO a
- All or nothing commit
- Can't deadlock! No locks!
- Errors are simple again
- Cannot execute STM code outside of atomic block
- Cannot execute arbitrary IO inside of atomic block
Atomic Transactions with STM
main :: IO ()
main = do
counterRef <- atomically (newTVar 0)
forkIO $ atomically (incRef counterRef)
count <- atomically (readTVar counterRef)
print count
incRef :: TVar Int -> STM ()
incRef r = do
x <- readTVar r
writeTVar r (x + 1)
Composes beautifully
type Amount = Int
type Account = TVar Amount
withdraw :: Amount -> Account -> STM ()
deposit :: Amount -> Account -> STM ()
transfer :: Amount -> Account -> Account -> STM ()
transfer amt acct1 acct2 = do
withdraw amt acct1
deposit amt acct2
Simply do a bunch of STM stuff, then wrap atomically around it at the end
Composes beautifully
type Amount = Int
type Account = TVar Amount
modifyTVar :: (a -> a) -> TVar a -> STM ()
modifyTVar f t = do
x <- readTVar t
writeTVar t (f x)
withdraw :: Amount -> Account -> STM ()
withdraw amt = modifyTVar (\oldAmt -> oldAmt - amt)
deposit :: Amount -> Account -> STM ()
deposit amt = modifyTVar (+ amt)
transfer :: Amount -> Account -> Account -> STM ()
transfer amt acct1 acct2 = do
withdraw amt acct1
deposit amt acct2
Simply do a bunch of STM stuff, then wrap atomically around it at the end
Composes beautifully
- Transactions are first class
- retry, orElse, always
- Condition variables, but much easier and happier
- Forms a MonadPlus
- Read: Beautiful Concurrency by SPJ
Being abstract is something profoundly different from being vague... The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise. -Edsger Dijkstra
Composes beautifully
always :: STM Bool -> STM ()
orElse :: STM a -> STM a -> STM a
retry :: STM a
instance MonadPlus STM where
mzero = retry
mplus = orElse
So let's abstract even more! TChans
newTChan :: STM (TChan a)
readTChan :: TChan a -> STM a
writeTChan :: TChan a -> a -> STM ()
- Unbounded FIFO queue
- STM Linked-list of TVars
receiveRequests :: Conn -> TChan Request -> IO ()
receiveRequests conn requests = forever $ do
req <- acceptRequest conn
atomically (writeTChan requests req)
processRequests :: TChan Request -> IO ()
processRequests requests = forever $ do
req <- atomically (readTChan requests)
respond req
main :: IO ()
main = do
conn <- newConnection
requests <- atomically newTChan
replicateM_ 10 $ forkIO (processRequests requests)
receiveRequests conn requests
Publish-Subscribe with STM
newBroadcastTChan :: STM (TChan a)
dupTChan :: TChan a -> STM (TChan a)
Publish-Subscribe with STM
type In t a = TChan (t, a)
type Out a = TChan a
newFeed :: IO (In t a)
publish :: t -> a -> In t a -> IO ()
subscribe :: Eq t => t -> In t a -> IO (Out a)
readFeed :: Out a -> IO (Maybe a)
- Topic-based Publish-Subscribe
- Topics are named logical channels
- Subscribers receive all messages published to their subscribed topics
- All subscribers receive the same messages
- Broadcast
Publish-Subscribe with STM
type In t a = TChan (t, a)
type Out a = TChan a
newFeed :: IO (In t a)
newFeed = atomically newBroadcastTChan
publish :: t -> a -> In t a -> IO ()
publish topic x chan = atomically (writeTChan chan (topic, x))
subscribe :: Eq t => t -> In t a -> IO (Out a)
subscribe topic chan = do
dupedChan <- atomically (dupTChan chan)
out <- newTChanIO
_ <- forkIO . forever . atomically $ do
(t, x) <- readTChan dupedChan
when (topic == t) $ writeTChan out x
return out
-- tryReadTChan :: TChan a -> STM (Maybe a)
readFeed :: Out a -> IO (Maybe a)
readFeed out = atomically (tryReadTChan out)
Publish-Subscribe with STM
data Topic = Fizz | Buzz | Bazz
deriving (Show, Eq)
executeTen :: IO () -> IO ThreadId
executeTen = forkIO . replicateM_ 10
main :: IO ()
main = do
inChan <- newFeed
out1 <- subscribe Fizz inChan
out2 <- subscribe Buzz inChan
executeTen (publish Fizz "Fizz" inChan)
executeTen (publish Buzz "Buzz" inChan)
executeTen (publish Bazz "Bazz" inChan)
forever $ do
x1 <- readFeed out1
x2 <- readFeed out2
print (x1, x2)
type In t a = TChan (t, a)
type Out a = TChan a
newFeed :: IO (In t a)
publish :: t -> a -> In t a -> IO ()
subscribe :: Eq t => t -> In t a -> IO (Out a)
readFeed :: Out a -> IO (Maybe a)
subscribeMultiple :: Eq t => [t] -> In t a -> IO (Out a)
(Just "Fizz",Just "Buzz")
(Just "Fizz",Just "Buzz")
(Just "Fizz",Just "Buzz")
(Just "Fizz",Just "Buzz")
(Just "Fizz",Just "Buzz")
(Just "Fizz",Just "Buzz")
(Just "Fizz",Just "Buzz")
(Just "Fizz",Just "Buzz")
(Just "Fizz",Just "Buzz")
(Just "Fizz",Just "Buzz")
(Nothing,Nothing)
(Nothing,Nothing)
(Nothing,Nothing)
(Nothing,Nothing)
(Nothing,Nothing)
...
Publish-Subscribe with STM
newtype In t a = In (Map t (TChan a))
newtype Out a = Out (TChan a)
newFeed :: Ord t => [t] -> IO (In t a)
addTopic :: Ord t => t -> In t a -> IO (In t a)
publish :: Ord t => t -> a -> In t a -> IO (In t a)
subscribe :: Ord t => t -> In t a -> IO (Maybe (Out a))
subscribeMultiple :: Ord t => [t] -> In t a -> IO (Maybe (Out a))
readFeed :: Out a -> IO (Maybe a)
(Closer to what I actually do)
Publish-Subscribe with STM
import Data.Map as M
newtype In t a = In (Map t (TChan a))
newtype Out a = Out (TChan a)
newFeed :: Ord t => [t] -> IO (In t a)
newFeed = foldM (flip addTopic) (In M.empty)
addTopic :: Ord t => t -> In t a -> IO (In t a)
addTopic topic (In m) = do
tchan <- newBroadcastTChanIO
return (In (M.insert topic tchan m))
publish :: Ord t => t -> In t a -> a -> IO (In t a)
publish topic x i@(In m) = go (M.lookup topic m)
where
go (Just tchan) = do
atomically $ writeTChan tchan x
return i
go Nothing = do
newi <- addTopic topic i
publish topic newi x
(Closer to what I actually do)
Publish-Subscribe with STM
import Data.Set as S
newtype In t a = In (Map t (TChan a))
newtype Out a = Out (TChan a)
subscribe :: Ord t => t -> In t a -> IO (Maybe (Out a))
subscribe topic (In m) =
maybe (return Nothing) giveOut (M.lookup topic m)
where
giveOut tchan = do
outChan <- atomically (dupTChan tchan)
return (Just (Out outChan))
subscribeMultiple :: Ord t => [t] -> In t a -> IO (Maybe (Out a))
subscribeMultiple topics (In m) = ifAllTopicsExist $ do
out <- newTChanIO
dts <- dupedtchans
void . forkIO . void $ mapConcurrently (interleaveTChan out) dts
return (Just (Out out))
where
allTopicsExist = S.fromList topics `S.isSubsetOf` keysSet m
ifAllTopicsExist k =
if allTopicsExist
then k
else return Nothing
dupedtchans = mapM (atomically . dupTChan) tchans
tchans = M.foldlWithKey' getTChan [] m
getTChan oldchans topic newchan
| topic `elem` topics = newchan : oldchans
| otherwise = oldchans
interleaveTChan out tchan = forever . atomically $ do
x <- readTChan tchan
writeTChan out x
(Closer to what I actually do)
Summary
- Common pitfalls
- Long transactions
- Many short transactions
- Still way harder than sequential code
- Massive improvement over locks and condition vars
- Abstractions work
- Publish-subscribe is just a bunch of TVars
- So many different paradigms in Haskell
- STM is only one example of concurrency
The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.
-Edsger Dijkstra/Gordon Freeman
Thank you!
Questions?
Chicago Haskell
chicagohaskell.com
@ChicagoHaskell
#chicagohaskell @ Freenode
- Beautiful Concurrency
- http://research.microsoft.com/en-us/um/people/simonpj/papers/stm/beautiful.pdf
- The Future is Parallel, the Future of Parallel is Declarative
- https://yow.eventer.com/events/1004/talks/1055
- Haskell and Transactional Memory
- http://research.microsoft.com/en-us/um/people/simonpj/papers/stm/STMTokyoApr10.pdf
Resources
STM Tutorial
By wraithm
STM Tutorial
- 5,359