Building a web library using super hard Haskell

Marcin Rzeźnicki

  • (disillusioned) Scala programmer till the beginning of 2020
  • started learning Haskell in January
  • in the trenches since almost the beginning of the century :-)
  • three libs on Hackage/Stackage: libjwt-typed, stackcollapse-ghc and hspec-tables

What is hard Haskell? Opposite of simple Haskell!

The Simple Haskell initiative (https://www.simplehaskell.org)


- There are as many opinions on what constitutes Simple Haskell as there are people who have them.

- (...) the value that Haskell98 has to offer through features like:

  • Pure Functions
  • Sum and Product Types
  • Type Classes
  • Polymorphism

You get most of the benefit just using newtypes, data, and functions - Don Stewart

What is hard Haskell? Opposite of simple Haskell!

Hard Haskell must therefore constitute of:

  • GADTs
  • Higher-rank polymorphism
  • Generics
  • Type applications (only non-ambiguous types!)
  • Type kinds and datatype promotions
  • Type-level programming of any kind: type/data families, type-level literals, equality constraints (!)
  • Kind polymorphism
  • FFI (!)

 

Are these that hard? 🤔

(Thinking with) types got you

You can honestly stop watching this talk and go read it

HList in Scala vs Haskell

sealed trait HList extends Product with Serializable

final case class ::[+H, +T <: HList](head : H, tail : T) extends HList

sealed trait HNil extends HList {
  def ::[H](h: H): H :: HNil = new ::(h, this)
}

case object HNil extends HNil
data HList (l::[*]) where
    HNil  :: HList '[]
    HCons :: e -> HList l -> HList (e ': l)

Problem: JWT

Problem: JWT

Problem: JWT

Problem: JWT

Why is (Map Text Value) bad?

  • dependency on Aeson
  • wrong abstraction - raw JSON instead of logical value
  • I know what's in there!
  • I know the types!
  • I have to write the code to decode it ...
  • and deal with the possible errors

Problem: JWT

Design a solution

  • replace Map with more expressive type(s)
  • use FFI for low level stuff (encryption, JSON)
  • eliminate low-level details from user's code

 

You can't do that in "Simple" Haskell.

FFI == impure; no  Map == type-level hackery

 

Is it too hard/fancy? 🤔

Problem: No FFI

If you stick to "no FFI", you won't be able to:

  •  use a lot of battle-tested libraries,
  •  gain performance in crucial parts

 

Instead:

  •  use libjwt (hard stuff is coded for me - it uses GnuTLS/OpenSSL),
  •  use jsmn (world's fastest JSON parser).

 

Calling C from Haskell is easy and quite seamless!

Cheatsheet: FFI

library
  import:              common-options
  hs-source-dirs:      src
  include-dirs:        src/cbits/jsmn
  c-sources:           src/cbits/HsJsonTokenizer.c
  extra-libraries:     jwt

Cheatsheet: FFI / bindings

{-# LANGUAGE ForeignFunctionInterface #-}
import           Foreign
import           Foreign.C.Types
import           Foreign.C.String

-- | Wrapped pointer to /jwt_t/ with managed lifetime
newtype JwtT = JwtT (ForeignPtr JwtT)

type PJwtT = Ptr JwtT

foreign import ccall unsafe "jwt.h jwt_new" c_jwt_new :: Ptr PJwtT -> IO CInt
foreign import ccall unsafe "jwt.h &jwt_free" p_jwt_free :: FunPtr (PJwtT -> IO ())
foreign import ccall unsafe "jwt.h jwt_add_grant" c_jwt_add_grant :: PJwtT -> CString -> CString -> IO CInt
foreign import ccall unsafe "jwt.h jwt_add_grant_bool" c_jwt_add_grant_bool :: PJwtT -> CString -> CInt -> IO CInt
foreign import ccall unsafe "jwt.h jwt_add_grant_int" c_jwt_add_grant_int :: PJwtT -> CString -> CLong -> IO CInt
foreign import ccall unsafe "jwt.h jwt_add_grants_json" c_jwt_add_grants_json :: PJwtT -> CString -> IO CInt
foreign import ccall unsafe "jwt.h jwt_get_grant" c_jwt_get_grant :: PJwtT -> CString -> IO CString
foreign import ccall unsafe "jwt.h jwt_get_grant_bool" c_jwt_get_grant_bool :: PJwtT -> CString -> IO CInt
foreign import ccall unsafe "jwt.h jwt_get_grant_int" c_jwt_get_grant_int :: PJwtT -> CString -> IO CLong
foreign import ccall unsafe "jwt.h jwt_get_grants_json" c_jwt_get_grants_json :: PJwtT -> CString -> IO CString
foreign import ccall unsafe "jwt.h jwt_set_alg" c_jwt_set_alg :: PJwtT -> JwtAlgT -> CString -> CInt -> IO CInt
foreign import ccall unsafe "jwt.h jwt_add_header" c_jwt_add_header :: PJwtT -> CString -> CString -> IO CInt
foreign import ccall unsafe "jwt.h jwt_encode_str" c_jwt_encode_str :: PJwtT -> IO CString
foreign import ccall unsafe "jwt.h jwt_get_alg" c_jwt_get_alg :: PJwtT -> IO JwtAlgT
foreign import ccall unsafe "jwt.h jwt_get_header" c_jwt_get_header :: PJwtT ->  CString -> IO CString
foreign import ccall unsafe "jwt.h jwt_decode" c_jwt_decode :: Ptr PJwtT -> CString -> CString -> CInt -> IO CInt

wrapPtr :: PJwtT -> IO JwtT
wrapPtr = coerce . newForeignPtr p_jwt_free

Cheatsheet: FFI / hsc

{-# LANGUAGE CPP, ForeignFunctionInterface #-}
{-# LANGUAGE DerivingStrategies #-}

import Foreign.C.Types

newtype JwtAlgT = JwtAlg CInt
  deriving stock Eq

#include <jwt.h>

#{enum JwtAlgT, JwtAlg
 , JWT_ALG_NONE
 , JWT_ALG_HS256
 , JWT_ALG_HS384
 , JWT_ALG_HS512
 , JWT_ALG_RS256
 , JWT_ALG_RS384
 , JWT_ALG_RS512
 , JWT_ALG_ES256
 , JWT_ALG_ES384
 , JWT_ALG_ES512
 , JWT_ALG_TERM 
 }

Cheatsheet: FFI / glue

addHeader :: String -> ByteString -> JwtT -> JwtIO ()
addHeader h val (JwtT pjwt_t) =
  JIO $ useAsCString val $ \p_val -> withForeignPtr pjwt_t $ \jwt ->
    withCAString h $ \ph ->
      throwIfNonZero_ "jwt_add_header" $ c_jwt_add_header jwt ph p_val

jwtEncode :: JwtT -> JwtIO ByteString
jwtEncode (JwtT pjwt_t) =
  JIO
    $   withForeignPtr pjwt_t
    $   unsafePackMallocCString
    <=< throwErrnoIfNull "jwt_encode_str"
    .   c_jwt_encode_str
    
jwtDecode :: ByteString -> ByteString -> JwtIO JwtT
jwtDecode key token = JIO
  $ (if null key then ($ (nullPtr, 0)) else unsafeUseAsCStringLen key) doDecode
 where
  doDecode (p_key, key_len) = useAsCString token $ \p_token ->
    mkJwtT_ "jwt_decode"
      $ \ret -> c_jwt_decode ret p_token p_key $ fromIntegral key_len

mkJwtT_ :: String -> (Ptr PJwtT -> IO CInt) -> IO JwtT
mkJwtT_ loc ctr = alloca $ \ptr -> do
  res <- ctr ptr
  if res == 0 then wrapJwtPtr =<< peek ptr else throwLibjwt loc $ Errno res

Haskell - the world's finest imperative language

Cheatsheet: FFI / custom IO type

Even better - wrap it in a "custom" IO so that nobody calls putStrLn in the inappropriate places

-- | IO restricted to calling /libjwt/ and /jsmn/
newtype JwtIO a = JIO (IO a)
  deriving newtype (Functor, Applicative, Monad, MonadThrow, MonadCatch)
  
unsafePerformJwtIO :: JwtIO a -> a
unsafePerformJwtIO (JIO io) = unsafePerformIO io

unsafePerformIO 🤔

Cheatsheet: unsafePerform(Jwt)IO

signJwt it = MkEncoded $ unsafePerformJwtIO signTokenJwtIo
 where
  signTokenJwtIo = _ -- call C
    
{-# NOINLINE signJwt #-} -- prevent escaping

We know that whatever is going on in JwtIo is pure. The only side-effects it will perform are local (allocating memory). Those effects won't escape the code.

unsafePerformIO exists precisely to enable Haskell to call C

Highlights: RULES pragma

Let's use our domain knowledge: we'll be dealing with lots of constant string (key names). And type-level literals, which we'll use later, are also "constants". And so it happens that they're valid C strings 🤔 ... 💡

addHeader :: String -> ByteString -> JwtT -> JwtIO ()
addHeader h val (JwtT pjwt_t) =
  JIO $ useAsCString val $ \p_val -> withForeignPtr pjwt_t $ \jwt ->
    withCAString h $ \ph -> do
      res <- c_jwt_add_header jwt ph p_val
      if res == 0 then return () else throwLibjwt "jwt_add_jeader" res
      
getGrant :: String -> JwtT -> JwtIO (Maybe ByteString)
getGrant grant (JwtT pjwt_t) = JIO $ withForeignPtr pjwt_t $ \jwt ->
  withCAString grant $ \p_grant -> do
    val <- c_jwt_get_grant jwt p_grant
    if val == nullPtr then return Nothing else Just <$> packCString val

Highlights: RULES pragma

Write the "unsafe" versions that can only work with an already valid C-pointer (Addr# is the GHC term for a valid C pointer) and rewrite rules that tell the re-writer when to use them

unsafeAddHeader :: Addr# -> ByteString -> JwtT -> JwtIO ()
unsafeAddHeader p_header val (JwtT pjwt_t) =
  JIO $ useAsCString val $ \p_val -> withForeignPtr pjwt_t $ \jwt -> do
    res <- c_jwt_add_header jwt (Ptr p_header) p_val
    if res == 0 then return () else throwLibjwt "jwt_add_header" res
    
unsafeGetGrant :: Addr# -> JwtT -> JwtIO (Maybe ByteString)
unsafeGetGrant p_grant (JwtT pjwt_t) = JIO $ withForeignPtr pjwt_t $ \jwt -> do
  val <- c_jwt_get_grant jwt $ Ptr p_grant
  if val == nullPtr then return Nothing else Just <$> packCString val

{-# RULES
"getGrant/unsafeGetGrant" forall s . getGrant (unpackCString# s) = unsafeGetGrant s
"addHeader/unsafeAddHeader" forall s . addHeader (unpackCString# s) = unsafeAddHeader s #-}

{-# INLINE [0] getGrant #-}
{-# INLINE [0] addHeader #-} -- these are needed to reliably fire rules

Cheatsheet: RULES pragma

How to inspect if rules are fired? I use -ddump-simpl-stats GHC option

 

238 RuleFired
...
  4 getGrant/unsafeGetGrant
  3 addGrant/unsafeAddGrant
  3 addGrantInt64/unsafeAddGrantInt64
  3 getGrantInt64/unsafeGetGrantInt64

 

Problem: No type-level programming

If you eliminate extensions enabling type-level programming (TypeFamilies, DataKinds, TypeApplications ...) , you (often) won't be able to:

  •  write the most correct solution (locked in the Worse Is Better),
  •  use your domain knowledge fully (introducing runtime burden)

 

Instead:

  •  encode solution in types,
  •  make sure ergonomics is first class.

 

Problem: Type-level equivalent of Map

{-# LANGUAGE ExistentialQuantification #-}

data Payload pc = ClaimsSet { iss :: Iss
                            , sub :: Sub
                            , aud :: Aud
                            , exp :: Exp
                            , nbf :: Nbf
                            , iat :: Iat
                            , jti :: Jti
                            , privateClaims :: PrivateClaims pc
                            }
                            
data Any = forall a . Any a

newtype PrivateClaims ts = PrivateClaims { unsafeClaimsMap :: HashMap String Any }

Let's start with this: PrivateClaims is parametrized by the precise type of what it contains. This is just an "untyped" Map - the premise is that we can recover from this type all information needed to safely access the map

Cheatsheet: Types, kinds and promotion

Let's imagine that type-system is (a very modest) programming language on its own. In this language, values are Haskell types and types are kinds. E.g:

> :kind! Int
Int :: *
> :kind! Either
Either :: * -> * -> *
> :kind Either Int
Either Int :: * -> *

Compare with:

> :type (5 :: Int)
(5 :: Int) :: Int
> :type Right
Right :: b -> Either a b

Cheatsheet: Types, kinds and promotion

Unfortunately, this language is similar to Python. Untyped. The kind system (as opposed to Haskell's rich type system) has too few kinds: *, *->* and so on.

The remedy is the DataKinds extension, which can extend Haskell's kind system with new kinds (just as the Haskell type system can be extended with new types). It works via datatype promotion - a data type becomes a new kind, its constructors become new types of this kind.

> data Pair a b  = MkPair a b
> :kind! ['MkPair Int Int, 'MkPair Int String, 'MkPair String String]
['MkPair Int Int, 'MkPair Int String, 'MkPair String String] :: [Pair * *]
= '[ 'MkPair Int Int, 'MkPair Int String, 'MkPair String String]
> :kind! ['MkPair Int Int, String]
<interactive>:1:1: error:
    • Expected kind ‘Pair * *’, but ‘String’ has kind ‘*’
    • In the type ‘['MkPair Int Int, String]’

Cheatsheet: Types, kinds and promotion

GHC automatically promotes lists (we saw this in the previous example) but also strings - giving rise to type-level  literals!

 

> :kind! "abc"
"abc" :: GHC.Types.Symbol = "abc"
> :kind! ['MkPair "abc" Int, 'MkPair "def" String]
['MkPair "abc" Int, 'MkPair "def" String] :: [Pair GHC.Types.Symbol *]
= '[ 'MkPair "abc" Int, 'MkPair "def" String]

 

How-to: Type-level equivalent of Map

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeOperators #-}

data ClaimType (a :: Type) = Grant Symbol a

type name ->> a = 'Grant name a

newtype PrivateClaims (ts :: [ClaimType Type]) = PrivateClaims { unsafeClaimsMap :: HashMap String Any }

DataKinds to the rescue!

Information about claims can be encoded as a (kind) list of "pairs" - Symbol (type-level literal) to a regular type

> :kind! PrivateClaims '[ "name" ->> String, "userId" ->> Int ]
PrivateClaims '[ "name" ->> String, "userId" ->> Int ] :: *
= PrivateClaims
    ((':)
       ((->>) "name" String)
       ((':) ((->>) "userId" Int) ('[] )))

Problem: Connecting values and types

First, a teaser: how can you create a value of type ("userId" ->> Int :: ClaimType Int) ?

.

.

.

You can't (because there is no value of ("userId" :: Symbol) )😭.

How-To: Connecting values and types

We need a way to talk about Claims at the term-level - to provide values to be stored in the map. One way to do it:

data ClaimName (name :: Symbol) = MkName

instance (name ~ name') => IsLabel name (ClaimName name') where -- this is for nicer syntax
  fromLabel = MkName

newtype ClaimWitness (name :: Symbol) a = Witness a

(->>) :: ClaimName name -> a -> ClaimWitness name a
_ ->> a = Witness a

And now:

>:set -XOverloadedLabels
> :type #userId ->> (5 :: Int)
#userId ->> (5 :: Int) :: ClaimWitness "userId" Int
> :type (MkName @"userId") ->> (5 :: Int)
(MkName @"userId") ->> (5 :: Int) :: ClaimWitness "userId" Int

How-To: Type-level equivalent of Map

> :t #userId ->> (5 :: Int) .: #name ->> "John Doe" .: nullClaims
#userId ->> (5 :: Int) .: #name ->> "John Doe" .: nullClaims
  :: PrivateClaims
       ((':)
          ((->>) "userId" Int)
          ((':) ((->>) "name" [Char]) Empty))
type Empty = ('[] :: [ClaimType Type])

nullClaims :: PrivateClaims Empty
nullClaims = PrivateClaims HashMap.empty

(.:) :: forall name a ts
      . KnownSymbol name -- this is how GHC does value-type (typeclass equivalent of ClaimName)
     => ClaimWitness name a 
     -> PrivateClaims ts 
     -> PrivateClaims (name ->> a : ts)
(Witness a) .: (PrivateClaims store) = PrivateClaims
  $ HashMap.insert claimName (Any a) store
  where claimName = symbolVal (Proxy :: Proxy name)

How-To: Type-level equivalent of Map

type family UniqueName (name :: Symbol) (ts :: [ClaimType Type]) :: Bool where
  UniqueName _ '[]           = 'True
  UniqueName n (n ->> _ : _) = 'False
  UniqueName n (_ : rest)    = UniqueName n rest

type family RestrictedName (name :: Symbol) :: Bool where
  RestrictedName "iss" = 'True
  RestrictedName "sub" = 'True
  RestrictedName "aud" = 'True
  RestrictedName "exp" = 'True
  RestrictedName "nbf" = 'True
  RestrictedName "iat" = 'True
  RestrictedName "jti" = 'True
  RestrictedName _     = 'False
> :kind! UniqueName "name" ["userId" ->> Int, "name" ->> String]
UniqueName "name" ["userId" ->> Int, "name" ->> String] :: Bool = 'False
> :kind! UniqueName "name" ["userId" ->> Int, "sth" ->> Int]
UniqueName "name" ["userId" ->> Int, "sth" ->> Int] :: Bool = 'True
> :kind! RestrictedName "name"
RestrictedName "name" :: Bool = 'False

Highlights: constraints and custom errors

type family RequireUniqueName (isUnique :: Bool) (name :: Symbol) :: Constraint where
  RequireUniqueName 'True  _  = ()
  RequireUniqueName 'False n  = TypeError ( 'Text "Claim " 
                                            ':<>: 
                                            'ShowType n 
                                            ':<>: 
                                            'Text " is not unique in this claim set" )

type family DisallowRestrictedName (isRestricted :: Bool) (name :: Symbol) :: Constraint where
  DisallowRestrictedName 'False _ = ()
  DisallowRestrictedName 'True  n = TypeError 
    ( 'ShowType n
      ':<>:
      'Text " is the name of the registered claim (it is exposed as a field that must be set directly)")

type family CanAdd n ns :: Constraint where
  CanAdd n ns = ( KnownSymbol n
                , DisallowRestrictedName (RestrictedName n) n
                , RequireUniqueName (UniqueName n ns) n )

How-To: Type-level equivalent of Map

> :type #userId ->> (5 :: Int) .: #userId ->> (6 :: Int) .: nullClaims
<interactive>:1:1: error:
    • Claim "userId" is not unique in this claim set
    • In the expression:
        #userId ->> (5 :: Int) .: #userId ->> (6 :: Int) .: nullClaims
> :t #sub ->> (6 :: Int) .: nullClaims
<interactive>:1:1: error:
    • "sub" is the name of the registered claim (it is exposed as a field that must be set directly)
    • In the expression: #sub ->> (6 :: Int) .: nullClaims
(.:) :: forall name a ts
      . CanAdd name ts
     => ClaimWitness name a 
     -> PrivateClaims ts 
     -> PrivateClaims (name ->> a : ts)
(Witness a) .: (PrivateClaims store) = PrivateClaims
  $ HashMap.insert claimName (Any a) store
  where claimName = symbolVal (Proxy :: Proxy name)

How-To: Type-level equivalent of Map

type family NameExists (name :: Symbol) (ts :: [ClaimType Type]) :: Bool where
  NameExists _ '[]           = 'False
  NameExists n (n ->> _ : _) = 'True
  NameExists n (_ : rest )   = NameExists n rest

type family RequireExists (exists :: Bool) (name :: Symbol) :: Constraint where
  RequireExists 'True  _ = ()
  RequireExists 'False n = TypeError ('Text "Claim " 
                                      ':<>: 
                                      'ShowType n 
                                      ':<>: 'Text " does not exist in this claim set")

type family CanGet n ns :: Constraint where
  CanGet n ns = (KnownSymbol n, RequireExists (NameExists n ns) n)

type family LookupClaimType (name :: Symbol) (ts :: [ClaimType Type]) :: Type where
  LookupClaimType n (n ->> a : _) = a
  LookupClaimType n (_ : rest)    = LookupClaimType n rest

(.!) :: forall name ts
      . CanGet name ts
     => PrivateClaims ts
     -> ClaimName name
     -> LookupClaimType name ts
pc .! name = unAny $ unsafeClaimsMap pc ! claimName
 where
  claimName = symbolVal (Proxy :: Proxy name)
  unAny (Any a) = unsafeCoerce a

How-To: Type-level equivalent of Map

> :set -XOverloadedLabels
> pc = #userId ->> 5 .: #name ->> "John Doe" .: nullClaims
> pc .! #userId
5
> :type pc .! #userId
pc .! #userId :: Integer
> pc .! #name
"John Doe"
> :type pc .! #name
pc .! #name :: [Char]
> :type pc .! #woooooot
<interactive>:9:1: error:
    • Claim "woooooot" does not exist in this claim set
    • In the expression: pc .! #woooooot

The final result

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE OverloadedLabels #-}
{-# LANGUAGE TypeOperators #-}

type MyPayload = '["user_name" ->> String, "is_root" ->> Bool, "user_id" ->> Int]
type MyJwt     = Jwt MyPayload 'NoNs

mkPayload :: IO (Payload MyPayload 'NoNs)
mkPayload = jwtPayload
  (withIssuer "myApp" <> withRecipient "https://myApp.com" <> setTtl 300)
  ( #user_name ->> "John Doe"
  , #is_root   ->> False
  , #user_id   ->> 12345
  )

token :: IO ByteString
token = getToken . sign hmac512 <$> mkPayload

decodeAndValidate :: IO (ValidationNEL ValidationFailure (Validated MyJwt))
decodeAndValidate = jwtFromByteString settings mempty hmac512 =<< token
  where settings = Settings { leeway = 5, appName = Just "https://myApp.com" }

The final result

Thank you!

On Hackage!

GitHub: https://github.com/marcin-rzeznicki/libjwt-typed

Catch me on Reddit: https://www.reddit.com/user/rzeznik

I do NOT use Twitter or Facebook

I am looking for a Haskell job

Building a web library using super hard Haskell

By Marcin Rzeźnicki

Building a web library using super hard Haskell

  • 798