Lecture 6

about RealWorld

Table of content

getChar

getchar :: Char
get2chars = [getchar, getchar]

Problememes?

1. Because the Haskell compiler treats all functions as pure, it can avoid «excessive» calls to `getchar` and use one returned value twice.

 

2. Even if it does make two calls, there is no way to determine which call should be performed first. Do you want to return the two chars in the order in which they were read, or in the opposite order? Nothing in the definition of `get2chars` answers this question.

Haskell is pure!

let f a = a + 1
f 1 = 2
f 1 = 2

Trying to write getChar

getchar   :: Int -> Char
get2chars :: Int -> String
 
get2chars _ = [getchar 1, getchar 2]
getchar :: Int -> (Char, Int)

get2chars i = [a,b] where (a,i1) = getchar i
                          (b,i2) = getchar i1

-- Problems?

getChar difference

getChar order

get4chars = [get2chars 1, get2chars 2]

order of 'get2chars' calls isn't defined

get4chars

get4chars :: Int -> String
get4chars i0 = (a++b) where (a,i1) = get2chars i0
                            (b,i2) = get2chars i1

Believe it or not, but we've just constructed the whole "monadic" Haskell I/O system.

get2chars :: Int -> (String, Int)
get2chars i0 = ([a,b], i2)  where (a,i1) = getchar i0
                                  (b,i2) = getchar i1

Welcome to RealWorld, baby

type IO a  =  RealWorld -> (a, RealWorld)

main :: RealWorld -> ((), RealWorld)
main :: IO ()

The 'main' Haskell function has the type:

getChar :: RealWorld -> (Char, RealWorld)
 
main :: RealWorld -> ((), RealWorld)
main world0 = 
    let (a, world1) = getChar world0
        (b, world2) = getChar world1
    in ((), world2)
  1. Can we skip getChar call?
  2. Can we swap getChar calls?
  3. Duplicate getChar calls?

Truth behind IO (1 / 2)

newtype IO a = IO {unIO :: State# RealWorld -> (State# RealWorld, a)}

Real implementation of IO type

{-# LANGUAGE MagicHash #-}  -- allows using # in names

data Mystery# a = Magic# a deriving (Show)
ghci> Magic# 3
Magic# 3
ghci> :t Magic# 3
Magic# 3 :: Num a => Mystery# a

Truth behind IO (2 / 2)

Now to State# and RealWorld

data State# s
data RealWorld
instance Monad IO where
  IO m >>= k = IO $ \ s -> 
     case m s of 
       (new_s, a) -> unIO (k a) new_s
  
  return x = IO (\s -> (s, x))

Monad IO

newtype IO a = IO {unIO :: State# RealWorld -> (State# RealWorld, a)}

do notation: desugaring >>

(>>) :: IO a -> IO b -> IO b
(action1 >> action2) world0 =
   let (_, world1) = action1 world0
       (b, world2) = action2 world1
   in (b, world2)
putStrLn :: String -> IO ()

main = do putStrLn "Hello!"
main = putStrLn "Hello!"
main = do putStrLn "What is your name?"
          putStrLn "How old are you?"
          putStrLn "Nice day!"

main = putStrLn "What is your name?" >>
       putStrLn "How old are you?"   >>
       putStrLn "Nice day!"

Sequence of actions

ioActions :: [IO ()]
ioActions = [ print "Hello!"
            , putStr "just kidding"
            , getChar >> return ()
            ]
main = do head ioActions
          ioActions !! 1
          last ioActions

Easy to execute them all! But ugly :(

sequence_ :: [IO a] -> IO ()

main = sequence_ ioActions

Nice

sequence_ :: [IO a] -> IO ()
sequence_ l =
sequence_ :: [IO a] -> IO ()
sequence_ []     = return ()
sequence_ (x:xs) = do x
                      sequence_ xs

do notation: desugaring <-

(>>=) :: IO a -> (a -> IO b) -> IO b
(action1 >>= action2) world0 =
   let (a, world1) = action1 world0
       (b, world2) = action2 a world1
   in (b, world2)
getLine :: IO String
main = do s <- getLine
          putStrLn s
(>>=) :: IO a -> (a -> RealWorld -> (b, RealWorld)) -> IO b
main = getLine >>= putStrLn
main = getLine >>= \s ->
       putStrLn s
ghci> s <- getLine
Hello, world
ghci> putStrLn $ s ++ "!"
Hello, world!

GHCi = infinite do-block inside IO

do exercise: ommit do syntax

main = do putStr "What is your name?"
          a <- readLn
          putStr "How old are you?"
          b <- readLn
          print (a,b)
main =    putStr "What is your name?" >>
          readLn >>= \a -> 
          putStr "How old are you?" >>
          readLn >>= \b -> 
          print (a,b)

return!

return :: a -> IO a
return a world0 =
return :: a -> IO a
return a world0 = (a, world0)
getReversedLine :: IO String
getReversedLine = do
    s <- getLine
    return $ reverse s

main :: IO ()
main = do
    rs <- getReversedLine
    putStrLn rs
main = do a <- readLn
          if a >= 0 then 
              return ()
          else do
              putStrLn "a is negative"

          putStrLn "a is positive"  -- is this executed?

I want to break free

let's do do

main :: IO ()
main = do
    s <- getLine
    let rs = reverse s
    putStrLn $ "Reversed input : " ++ rs

You can use let bindings inside do.

main :: IO ()
main =     getLine >>= \s -> 
           let rs = reverse s in
           putStrLn $ "Reversed input : " ++ rs

Desugars into:

Common errors

let s = getLine  -- !!! Doesn't read from console to `s`
rs <- reverse s  -- !!! `reverse s` is not a monadic action inside IO

do without Monad

pythagoras :: Int -> Int -> Int
pythagoras x y = do
    let x2 = x ^ 2
    let y2 = y ^ 2
    x2 + y2

You don't need Monad to use do-syntax

ghci> pythagoras 3 4 
25

Puzzledo

foo :: Int -> Int
foo = do
    a <- (+1)
    return (a * 2)

do-notation for partially applied functions

ghci> foo 3
8

Unexpected IO problems...

main = do
  fileContent <- readFile "foo.txt"
  writeFile "bar.txt" ('a':fileContent)
  readFile  "bar.txt" >>= putStrLn

Main.hs

foo
bar

foo.txt

ghci> :run main
afoo
bar
main = do
  fileContent <- readFile "foo.txt"
  writeFile "foo.txt" ('a':fileContent)
  readFile  "foo.txt" >>= putStrLn

After small change...

ghci> :run main
*** Exception: foo.txt: openFile: resource busy (file is locked)
main = do
  fileContent <- readFile "foo.txt"
  putStrLn fileContent
  writeFile "foo.txt" ('a':fileContent)

Lazy IO

-- | The 'readFile' function reads a file and
-- returns the contents of the file as a string.
-- The file is read lazily, on demand, as with 'getContents'.

readFile        :: FilePath -> IO String
readFile name   =  openFile name ReadMode >>= hGetContents

Golden programming rule:

* when something goes wrong — it's time to read documentation

Platinum programming rule:

* always read documentation first

import System.IO (withFile, hGetContents, IOMode (ReadMode))

main :: IO ()
main = do
    fileData <- withFile "foo.txt" ReadMode hGetContents
    putStr fileData

Another example

ghci> :run main
*** Exception: foo.txt: hGetContents: illegal operation 
                        (delayed read on closed handle)

How to live with lazy I/O?

Use some modern streaming and data-processing library:

* conduit

* pipes

* streaming

FFI: interaction with other worlds

/* clang -c simple.c -o simple.o */

int example(int a, int b)
{
  return a + b;
}
-- ghc simple.o simple_ffi.hs -o simple_ffi

{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types

foreign import ccall safe "example" 
    example :: CInt -> CInt -> CInt

main = print (example 42 27)

C

Haskell

Mutable data in pure lazy world

main = do let a0 = readVariable  varA
          let _  = writeVariable varA 1
          let a1 = readVariable  varA
          print (a0, a1)
import Data.IORef (newIORef, readIORef, writeIORef)

foo :: IO ()
foo = do 
    varA <- newIORef 0 
    a0   <- readIORef varA
    writeIORef varA 1
    a1   <- readIORef varA
    print (a0, a1)

Use IORef in IO

Problememes?

ghci> foo
(0,1)

Simple Mutable IO Arrays

import Data.Array.IO (IOArray, newArray, readArray, writeArray)

bar :: IO ()
bar = do 
    arr <- newArray (1,10) 37 :: IO (IOArray Int Int)
    a   <- readArray arr 1
    writeArray arr 1 64
    b   <- readArray arr 1
    print (a, b)
ghci> bar
(37,64)

IOArray is very simple but no so very fast. Use vector package for fast both mutable and immutable arrays. Vector package is really great!

IO Exceptions (throw, )

throwIO :: Exception e => e -> IO a
import           Control.Exception (ArithException (..), catch, throwIO)
import           Control.Monad     (when)

readAndDivide :: IO Int
readAndDivide = do
    x <- readLn
    y <- readLn
    when (y == 0) $ throwIO DivideByZero
    return $ x `div` y
ghci> readAndDivide 
7
3
2
ghci> readAndDivide 
3
0
*** Exception: divide by zero

IO Exceptions (catch, )

catch :: Exception e => IO a -> (e -> IO a) -> IO a
safeReadAndDivide :: IO Int
safeReadAndDivide = readAndDivide `catch` \DivideByZero -> return (-1)
ghci> safeReadAndDivide 
7
3
2
ghci> safeReadAndDivide 
3
0
-1

1. IO can throw ANY exception.

2. Not represented in type system.

This will be discussed later in course.

IO Exceptions (my exceptions, )

{-# LANGUAGE DeriveAnyClass     #-}
{-# LANGUAGE DeriveDataTypeable #-}

import           Control.Exception (Exception)
import           Data.Typeable     (Typeable)

data MyException = DummyException
    deriving (Show, Typeable, Exception)
ghci> throwIO DummyException 
*** Exception: DummyException

ghci> :{
ghci| throwIO DummyException `catch` \DummyException ->
ghci|     putStrLn "Dummy exception is thrown"
ghci| :}
Dummy exception is thrown

IO Exceptions (and company)

try     :: Exception e => IO a -> IO (Either e a)
tryJust :: Exception e => (e -> Maybe b) -> IO a -> IO (Either b a)

finally :: IO a	 -- computation to run first
        -> IO b	 -- computation to run afterward (even if an exception was raised)
        -> IO a

-- | Like 'finally', but only performs the final action 
-- if there was an exception raised by the computation.
onException :: IO a -> IO b -> IO a
bracket :: IO a         -- ^ computation to run first (\"acquire resource\")
        -> (a -> IO b)  -- ^ computation to run last (\"release resource\")
        -> (a -> IO c)  -- ^ computation to run in-between
        -> IO c         -- returns the value from the in-between computation

Idiomatic RAII implementation

Reminding of pattern guards

lookup :: FiniteMap -> Int -> Maybe Int

addLookup :: FiniteMap -> Int -> Int -> Maybe Int
addLookup env var1 var2
 | Just val1 <- lookup env var1
 , Just val2 <- lookup env var2
 = val1 + val2
{-...other equations...-}

This is not do-notation, just same syntax with different meaning in patterns. Enabled by default, no need of extra extensions.

strangeOperation :: [Int] -> Ordering
strangeOperation xs 
   | 7  <- sum xs
   , n  <- length xs
   , n  >= 5
   , n  <= 20 
   = EQ
   | 1  <- sum xs
   , 18 <- length xs
   , r  <- nub xs `compare` [1,2,3]
   , r /= EQ
   = r
   | otherwise
   = [3,1,2] `compare` xs

main = print $ strangeOperation ([5,7..21] ++ [20,19..4])

Welcome to Dark side of IO

import System.IO.Unsafe

foo :: ()
foo = unsafePerformIO $ putStrLn "foo"

bar :: String
bar = unsafePerformIO $ do
          putStrLn "bar"
          return "baz"

main = do let f = foo
          putStrLn bar
helper i = print i >> return i

main = do
    one <- helper 1
    two <- helper 2
    print $ one + two

Even more Unsafe*

import System.IO.Unsafe

helper i = unsafePerformIO $ print i >> return i

main = do
    let one = helper 1
    let two = helper 2
    print $ one + two
import System.IO.Unsafe

helper i = print i >> return i

main = do
    one <- helper 1
    let two = unsafePerformIO $ helper 2
    print $ one + two
  1. Run helper 1.

  2. Create a thunk to run helper 2.

  3. Evaluate one + two, forcing the helper 2 thunk to be evaluated in the process.

  4. Print the result of one + two, a.k.a. 3.

Ok?

  1. Create and evaluate the helper 2 thunk.

  2. Run helper 1.

  3. Print the result of one + two.

Also valid

State tokens

instance Monad IO where
  IO m >>= IO k = IO $ \ s -> 
     case m s of 
       (new_s, a) -> unIO (k a) new_s
  
  return x = IO (\s -> (s, x))
newtype IO a = IO {unIO :: State# RealWorld -> (State# RealWorld, a)}

Ignorance in Haskell

type S# = State# RealWorld  -- let's use this short alias

print 1 :: S# -> (S#, ())

print 1 >> print 2 = 
    \s0 -> case print 1 s0 of
               (s1, _ignored) -> print 2 s1
unsafePerformIO (IO f) =
    case f fakeStateToken of
        (_ignoredStateToken, result) -> result

Desugaring IO

import System.IO.Unsafe

helper i = print i >> return i

main = do
    one <- helper 1
    let two = unsafePerformIO $ helper 2
    print $ one + two
main s0 =
    case helper 1 s0 of
        (s1, one) ->
            case helper 2 fakeStateToken of
                (_ignored, two) ->
                    print (one + two) s1
main s0 =
    case helper 2 fakeStateToken of
        (_ignored, two) ->
            case helper 1 s0 of
                (s1, one) ->
                    print (one + two) s1

What to do?

1. Whenever possible, avoid using unsafe functions.

 

2.If you aren't in the IO monad at all, or it's acceptable if the action is performed before other IO actions, use unsafePerformIO.

Real life example of unsafe*

import Debug.Trace

trace     :: String -> a -> a
traceShow :: Show a => a -> b -> b
traceM    :: Applicative f => String -> f ()

Debugging pure functions

trace :: String -> a -> a
trace string expr = unsafePerformIO $ do
    traceIO string  -- slightly clever version of `putStrLn`
    return expr
fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = trace ("n: " ++ show n) $ fib (n - 1) + fib (n - 2)
ghci> putStrLn $ "fib 4:\n" ++ show (fib 4)
fib 4:
n: 4
n: 3
n: 2
n: 2
3

Text & ByteString

type String = [Char]
{-# LANGUAGE OverloadedStrings #-}

class IsString a where
    fromString :: String -> a
ghci> :t "foo"
"foo" :: [Char]

ghci> :set -XOverloadedStrings

ghci> :t "foo"
"foo" :: IsString a => a

Text

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text as T

-- From pack
myTStr1 :: T.Text
myTStr1 = T.pack ("foo" :: String)

-- From overloaded string literal.
myTStr2 :: T.Text
myTStr2 = "bar"

ByteString

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString       as S
import qualified Data.ByteString.Char8 as S8

-- From pack
bstr1 :: S.ByteString
bstr1 = S.pack ("foo" :: String)

-- From overloaded string literal.
bstr2 :: S8.ByteString
bstr2 = "bar"

ByteString and unsafePerformIO

data ByteString = PS (ForeignPtr Word8) Int Int

unsafeCreate :: Int -> (Ptr Word8 -> IO ()) -> ByteString
unsafeCreate l f = unsafePerformIO (create l f)

create :: Int -> (Ptr Word8 -> IO ()) -> IO ByteString
create l f = do
    fp <- mallocByteString l
    withForeignPtr fp $ \p -> f p
    return $ PS fp 0 l
    
-- | /O(1)/ 'splitAt' @n xs@ is equivalent to @('take' n xs, 'drop' n xs)@.
splitAt :: Int -> ByteString -> (ByteString, ByteString)

-- | /O(1)/ Extract the last element of a ByteString.
last :: ByteString -> Word8
last ps@(PS x s l)
    | null ps   = errorEmptyList "last"
    | otherwise = unsafePerformIO $
                    withForeignPtr x $ \p -> peekByteOff p (s+l-1)

What to use?

I. Binary:

    Packed:

        Lazy: Data.ByteString.Lazy

        Strict: Data.ByteString

What to use?

II. Text:

    1. ASCII or 8-bit:

        Packed and lazy: Data.ByteString.Lazy.Char8

        Packed and strict: Data.ByteString.Char8,               

                                         Data.CompactString.ASCII

                                    or Data.CompactString with Latin1

What to use?

II. Text:

    2. Unicode:

        UTF-32: 

            Unpacked and lazy: [Char]

        UTF-16:

            Packed and lazy: Data.Text.Lazy

            Packed and strict: Data.Text or Data.CompactString.UTF16

        UTF-8:

            Unpacked and lazy: Codec.Binary.UTF8.Generic contains generic

                                                operations that can be used to process [Word8].

            Packed and lazy: Data.ByteString.Lazy.UTF8

            Packed and strict: Data.CompactString.UTF8 or Data.ByteString.UTF8

Read list from real world

Lecture 06: IO

By ITMO CTD Haskell

Lecture 06: IO

Lecture about IO monad and whole system, do-notation (+ApplicativeDo, +RebindableSyntax), unsafePerformIO, efficient String representations.

  • 6,357