Advanced Concurrency Aspects

Written by: Igor Korotach

Fast recap!

  • What is a thread?

  • What is a process?

  • How are they different?

Types of concurrency

Synchronization

What is synchronization?

Synchronization is the coordination of threads so that their accesses to shared state happen in a controlled order, for example one after another.

Synchronization controls the access of multiple threads to shared resources. Without it, one thread can modify a shared variable while another thread is reading or updating the same variable, which leads to lost updates and other significant errors.

Data Races

public class Counter {

  protected int count = 0;

  public void add(int value){
    this.count = this.count + value;
  }
}

// this.count = 0;

// Thread A
// Thread B

// A:  Reads this.count into a register (0)
// B:  Reads this.count into a register (0)
// B:  Adds value 2 to register
// B:  Writes register value (2) back to memory. this.count now equals 2
// A:  Adds value 3 to register
// A:  Writes register value (3) back to memory. this.count now equals 3
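The interleaving traced above can be replayed deterministically in a single thread by modelling each thread's register as a local variable. This is only a sketch; the class and method names are illustrative and not part of the original example.

```java
// Single-threaded replay of the interleaving traced above.
// Each "register" is modelled as a local variable; all names are illustrative.
class LostUpdateReplay {
    static int count = 0; // stands in for this.count

    static int replay() {
        count = 0;
        int registerA = count;     // A: reads count (0)
        int registerB = count;     // B: reads count (0)
        registerB = registerB + 2; // B: adds 2 to its register
        count = registerB;         // B: writes 2 back
        registerA = registerA + 3; // A: adds 3 to its stale copy
        count = registerA;         // A: writes 3 back, overwriting B's update
        return count;
    }

    public static void main(String[] args) {
        // Prints 3, although add(2) followed by add(3) should give 5.
        System.out.println(replay());
    }
}
```

B's update is lost because A computed its result from a value read before B wrote.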

Mutex

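import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class Counter {

  // Fair lock: the longest-waiting thread acquires it next.
  protected final Lock _mutex = new ReentrantLock(true);
  protected int count = 0;

  public void add(int value){
    this._mutex.lock();
    try {
      this.count = this.count + value;
    } finally {
      this._mutex.unlock(); // always released, even if the body throws
    }
  }
}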
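To see the mutex at work, the sketch below runs several threads against a lock-protected counter and checks that no update is lost. The class name LockedCounterDemo and the run helper are hypothetical, introduced only for this illustration.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: a lock-protected counter hammered by several threads.
class LockedCounterDemo {
    private final Lock mutex = new ReentrantLock();
    private int count = 0;

    void add(int value) {
        mutex.lock();
        try {
            count = count + value; // critical section: one thread at a time
        } finally {
            mutex.unlock();        // released even if the body throws
        }
    }

    int get() {
        mutex.lock();
        try { return count; } finally { mutex.unlock(); }
    }

    // Starts `threads` workers that each call add(1) `perThread` times.
    static int run(int threads, int perThread) {
        LockedCounterDemo counter = new LockedCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) counter.add(1);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000)); // 4 threads x 1000 increments
    }
}
```

With the lock in place the total always equals threads x perThread; without it, the result could come out lower.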

Read Write Lock

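import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Data is a placeholder for the application's value type.
class RWDictionary {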
    private final Map<String, Data> m = new TreeMap<String, Data>();
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    private final Lock r = rwl.readLock();
    private final Lock w = rwl.writeLock();

    public Data get(String key) {
        r.lock();
        try { return m.get(key); }
        finally { r.unlock(); }
    }
    public String[] allKeys() {
        r.lock();
        try { return m.keySet().toArray(new String[0]); }
        finally { r.unlock(); }
    }
    public Data put(String key, Data value) {
        w.lock();
        try { return m.put(key, value); }
        finally { w.unlock(); }
    }
    public void clear() {
        w.lock();
        try { m.clear(); }
        finally { w.unlock(); }
    }
}
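The point of a read-write lock is that readers share it while a writer needs it exclusively. The sketch below probes this with tryLock from a second thread while the first thread holds the read lock. The class and method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical probe: what can another thread acquire while we hold the read lock?
class ReadWriteLockDemo {
    // Returns {secondReaderAdmitted, writerAdmitted}.
    static boolean[] probeWhileReading() {
        ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
        boolean[] result = new boolean[2];

        rwl.readLock().lock(); // first reader: the calling thread
        try {
            Thread other = new Thread(() -> {
                // A second reader may share the lock with the first one.
                if (rwl.readLock().tryLock()) {
                    result[0] = true;
                    rwl.readLock().unlock();
                }
                // A writer needs exclusive access, so this attempt fails
                // as long as any other thread still holds the read lock.
                if (rwl.writeLock().tryLock()) {
                    result[1] = true;
                    rwl.writeLock().unlock();
                }
            });
            other.start();
            other.join(); // join makes the other thread's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            rwl.readLock().unlock();
        }
        return result;
    }
}
```

The second reader is admitted and the writer is refused, which is why this lock suits read-heavy structures such as the dictionary above.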

Semaphore

import java.util.concurrent.Semaphore;

class LoginQueueUsingSemaphore {

    private final Semaphore semaphore;

    public LoginQueueUsingSemaphore(int slotLimit) {
        semaphore = new Semaphore(slotLimit);
    }

    boolean tryLogin() {
        return semaphore.tryAcquire();
    }

    void logout() {
        semaphore.release();
    }

    int availableSlots() {
        return semaphore.availablePermits();
    }

}
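A short usage sketch of the same idea, assuming a limit of two slots: the third login attempt is refused until someone logs out. It uses a Semaphore directly so that it stands alone; LoginQueueDemo is a hypothetical name.

```java
import java.util.concurrent.Semaphore;

// Hypothetical walk-through of a two-slot login queue.
class LoginQueueDemo {
    // Returns the outcome of four login attempts, in order.
    static boolean[] run() {
        Semaphore slots = new Semaphore(2);  // two permits = two login slots
        boolean first = slots.tryAcquire();  // takes slot 1
        boolean second = slots.tryAcquire(); // takes slot 2
        boolean third = slots.tryAcquire();  // no permit left, returns immediately
        slots.release();                     // one user logs out
        boolean retry = slots.tryAcquire();  // the freed slot is available again
        return new boolean[] {first, second, third, retry};
    }
}
```

Unlike a mutex, a semaphore has no owner: any thread may call release(), so pairing each successful tryLogin with exactly one logout is up to the caller.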

More niche synchronization primitives

  • Signal
  • Event
  • SpinLock
  • Barrier
  • WeightedSemaphore
  • Condition
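As one example from the list above, a barrier makes a group of threads wait until all of them have reached the same point. The sketch uses Java's CyclicBarrier; the class BarrierDemo and its method are hypothetical.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: workers publish partial results, then meet at a barrier.
class BarrierDemo {
    static int sumAfterBarrier(int workers) {
        int[] partial = new int[workers];
        AtomicInteger total = new AtomicInteger();
        // The barrier action runs once, after the last worker has arrived.
        CyclicBarrier barrier = new CyclicBarrier(workers, () -> {
            int sum = 0;
            for (int p : partial) sum += p;
            total.set(sum);
        });
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                partial[id] = id + 1; // this worker's partial result
                try {
                    barrier.await();  // wait for all other workers
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return total.get();
    }
}
```

With three workers the partial results are 1, 2 and 3, so the barrier action computes 6.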

Problems with the synchronized threads approach

  1. Hard to get it right (forgetting to lock/unlock)
  2. Doesn't offer a generic solution; requires tailoring for each specific task
  3. Hard to debug
  4. Hard to make either fail-fast or fail-safe
  5. Not very scalable (the 100k connection problem)

What are the alternatives? 

Let's start with underlying concepts

File Descriptors

In simple terms, when you open a file, the operating system creates an entry that represents the open file and stores information about it. If 100 files are open, there are 100 such entries somewhere in the kernel. Each entry is identified by an integer (..., 100, 101, 102, ...), and that integer is the file descriptor.

 

In C, stdin, stdout, and stderr are FILE*, which in UNIX respectively map to file descriptors 0, 1 and 2.

 

Similarly, when you open a network socket, it is also represented by an integer, called a socket descriptor.

Sockets

Linux I/O models

  1. Blocking I/O

  2. Non-blocking I/O

  3. I/O multiplexing

  4. Asynchronous I/O

Waiting

Poll()

Select()

Epoll()

Recommendations & tips

  • select() is the most portable; it is supported by all Unix-like OSes

  • poll() doesn't require a statically sized, known number of descriptors. It works better than select() for large-valued descriptors

  • epoll is more difficult to set up, but has the best performance. It is Linux-specific, so not portable
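I/O multiplexing can be sketched in Java with the NIO Selector, which wraps whatever multiplexing facility the platform offers (which one is a JDK implementation detail). The example registers the read end of an in-process pipe, blocks in select() until it is ready, and then reads. The class SelectorSketch is hypothetical, and it assumes messages shorter than 256 bytes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: wait for readiness with a Selector, then read.
class SelectorSketch {
    static String readWhenReady(String message) {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false); // required before register()
                source.register(selector, SelectionKey.OP_READ);

                sink.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));

                StringBuilder received = new StringBuilder();
                selector.select(); // blocks until a registered channel is ready
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isReadable()) {
                        ByteBuffer buffer = ByteBuffer.allocate(256);
                        int n = ((Pipe.SourceChannel) key.channel()).read(buffer);
                        if (n > 0) {
                            received.append(
                                new String(buffer.array(), 0, n, StandardCharsets.UTF_8));
                        }
                    }
                }
                selector.selectedKeys().clear(); // handled keys are cleared by hand
                return received.toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real server would register many sockets with one selector and call select() in a loop, which is the core of an event loop.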

 

Event loop

Advanced Communication models

Actor model

Actor Usage

defmodule Example do
  def listen do
    receive do
      {:ok, "hello"} -> IO.puts("World")
    end

    listen()
  end
end

iex> pid = spawn(Example, :listen, [])
#PID<0.108.0>

iex> send pid, {:ok, "hello"}
World
{:ok, "hello"}

iex> send pid, :ok
:ok
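The same idea can be approximated in Java with a thread that owns a mailbox queue. This is a rough sketch, not an actor library: MiniActor, its stop marker and its output list are hypothetical, and unlike Elixir's receive it drops messages it does not understand instead of leaving them in the mailbox.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical mini actor: one thread, one mailbox, no shared mutable state.
class MiniActor {
    private static final String STOP = "__stop__"; // hypothetical poison pill
    private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
    private final List<String> output = new ArrayList<>(); // touched only by the actor thread
    private final Thread worker = new Thread(this::listen);

    MiniActor() { worker.start(); }

    void send(String message) { mailbox.add(message); }

    private void listen() {
        try {
            while (true) {
                String message = mailbox.take(); // blocks until a message arrives
                if (message.equals(STOP)) return;
                if (message.equals("hello")) output.add("World");
                // Anything else is dropped in this sketch.
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stops the actor and returns what it produced.
    List<String> stopAndCollect() {
        send(STOP);
        try {
            worker.join(); // after join, reading output from this thread is safe
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return output;
    }

    static List<String> demo() {
        MiniActor actor = new MiniActor();
        actor.send("hello");
        actor.send("ok");
        actor.send("hello");
        return actor.stopAndCollect();
    }
}
```

Senders never touch the actor's state; they only enqueue messages, so no lock is needed around output.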

CSP

Communicating Sequential Processes

CSP Usage

package main
import "fmt"
func sendValues(myIntChannel chan int){

  for i:=0; i<5; i++ {
    myIntChannel <- i 
  }

}

func main() {
  myIntChannel := make(chan int)

  go sendValues(myIntChannel) // function sending value

  for i:=0; i<5; i++ {
    fmt.Println(<-myIntChannel) //receiving value
  }
}
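For comparison, an unbuffered Go channel behaves much like Java's SynchronousQueue: each put blocks until another thread takes the value. The sketch below mirrors the Go program under that assumption; ChannelSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.SynchronousQueue;

// Hypothetical Java counterpart of the Go example, using a rendezvous queue.
class ChannelSketch {
    static List<Integer> run() {
        SynchronousQueue<Integer> channel = new SynchronousQueue<>();

        Thread sender = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    channel.put(i); // blocks until the receiver takes the value
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sender.start();

        List<Integer> received = new ArrayList<>();
        try {
            for (int i = 0; i < 5; i++) {
                received.add(channel.take()); // blocks until the sender puts a value
            }
            sender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }
}
```

The two threads share no variables; the only interaction is the hand-off through the queue, so the values arrive in the order they were sent.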

STM

Software Transactional Memory

STM Usage

(def acc1 (ref 1000 :validator #(>= % 0)))
(def acc2 (ref 1000 :validator #(>= % 0)))
(defn transfer [from-acct to-acct amt]
  (dosync
    (alter to-acct + amt)
    (alter from-acct - amt)))
(dotimes [_ 1000]
  (future (transfer acc2 acc1 100)))
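Java has no built-in STM, but the commit-or-retry behaviour of dosync can be approximated with an optimistic compare-and-swap loop over an immutable snapshot. Note that this is a different, simpler technique than real STM, and it only works here because both balances live in one reference. All names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: optimistic retry with compareAndSet (not real STM).
class OptimisticTransfer {
    // Immutable snapshot of both balances; always replaced as a whole.
    static final class Balances {
        final int from, to;
        Balances(int from, int to) { this.from = from; this.to = to; }
    }

    private final AtomicReference<Balances> state;

    OptimisticTransfer(int from, int to) {
        state = new AtomicReference<>(new Balances(from, to));
    }

    // Returns false when the transfer would overdraw the source account.
    boolean transfer(int amount) {
        while (true) {
            Balances current = state.get();
            if (current.from - amount < 0) return false; // plays the validator's role
            Balances next = new Balances(current.from - amount, current.to + amount);
            if (state.compareAndSet(current, next)) return true; // commit
            // Another thread committed first: retry on a fresh snapshot.
        }
    }

    // 1000 transfers of 100 from an account holding 1000.
    static int[] demo() {
        OptimisticTransfer accounts = new OptimisticTransfer(1000, 1000);
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 1000; i++) {
            pool.execute(() -> accounts.transfer(100));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Balances end = accounts.state.get();
        return new int[] {end.from, end.to};
    }
}
```

Only ten transfers can succeed before the source account reaches zero; the rest are rejected, so the balances end at 0 and 2000 with no money created or lost.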

Advanced Execution Models

Fiber

A Fiber is a lightweight thread that uses cooperative multitasking instead of preemptive multitasking. A running fiber must explicitly "yield" to allow another fiber to run, which makes their implementation much easier than kernel or user threads.

 

Programming languages that use fibers: Ruby

Coroutine

A Coroutine is a component that generalizes a subroutine to allow multiple entry points for suspending and resuming execution at certain locations. Unlike subroutines, coroutines can exit by calling other coroutines, which may later return to the point where they were invoked in the original coroutine.

 

Programming languages that use coroutines: Lua, Python (asyncio), JavaScript (Node.js)

Green Thread

A Green Thread is a thread that is scheduled by a virtual machine (VM) instead of natively by the underlying operating system. Green threads emulate multithreaded environments without relying on any native OS capabilities, and they are managed in user space instead of kernel space, enabling them to work in environments that do not have native thread support.

 

Programming languages that use green threads: Go, Elixir/Erlang, Python (via greenlet library)

Thanks for your attention. You've been awesome!

Questions?

  • Presentation link: https://slides.com/emulebest/advanced-concurrency-aspects
