TPL, Async & Soluto

Agenda

  • TPL in brief
  • Tasks
  • Async
  • Promises and async/await
  • Practices and pitfalls
  • Soluto use cases

Task Parallel Library

  • Introduced in .NET 4.0
  • Aims to leverage multi-core processors
  • Provides better abstractions for designing concurrent applications
  • Designed to handle concurrency challenges better:
    • Synchronization - locking, deadlocks, race conditions
    • Debugging
    • Code Complexity

Task Parallel Library, a peek inside...

  • Provides data structures designed for concurrency
    • ConcurrentQueue<T>, ConcurrentDictionary<TKey, TValue>, Lazy<T>
  • Provides data-parallelism capabilities
    • AsParallel, PLINQ...
    • Partition the data
    • Run the same logic over the partitions on multiple threads or cores
  • Provides task-parallelism capabilities...
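As a minimal sketch of the data-parallelism bullet, the snippet below counts primes with PLINQ. The IsPrime helper and the range are assumptions chosen only for illustration; AsParallel() is what asks the runtime to partition the data and run the filter on several cores.

```csharp
using System;
using System.Linq;

// Hypothetical CPU-bound check, used only for illustration.
static bool IsPrime(int n)
{
    if (n < 2) return false;
    for (int i = 2; (long)i * i <= n; i++)
        if (n % i == 0) return false;
    return true;
}

// AsParallel() turns the query into a PLINQ query: the range is
// partitioned and IsPrime runs on multiple threads.
int primeCount = Enumerable.Range(1, 100_000)
    .AsParallel()
    .Count(IsPrime);

Console.WriteLine($"Primes up to 100,000: {primeCount}");
```

Removing the AsParallel() call gives the same result sequentially, which makes it easy to compare the two.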

TASK PARALLELISM

  • Break program into multiple operations
  • Run operations in parallel
  • Combine operations with continuation
  • Schedule and keep track of ongoing operations

TASKS


  • Task<T> - represents a T that is being computed asynchronously and will be available in the future, if it isn't already

  • Task - the same as Task<T>, but without a return value, a bit like a Task<void>

  • Based on the future pattern

How do we create tasks?

  • Using Task.Run 
    • Task.Run( ()=> DoOtherThing() )
    • Runs the code on the default TaskScheduler (the ThreadPool)
  • Task.FromResult()
  • From other tasks
  • From promises (TaskCompletionSource)
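A minimal sketch of the four ways listed above, side by side. The values are arbitrary and only illustrate that each approach ends up producing an ordinary Task<T>.

```csharp
using System;
using System.Threading.Tasks;

// 1. Task.Run: schedule work on the thread pool.
Task<int> computed = Task.Run(() => 6 * 7);

// 2. Task.FromResult: wrap an already-known value in a completed task.
Task<int> cached = Task.FromResult(42);

// 3. From other tasks: a continuation returns a new task.
Task<string> formatted = computed.ContinueWith(t => $"answer={t.Result}");

// 4. From a promise: the task completes when SetResult is called.
var promise = new TaskCompletionSource<int>();
Task<int> promised = promise.Task;
promise.SetResult(42);

Console.WriteLine(await computed);
Console.WriteLine(await cached);
Console.WriteLine(await formatted);
Console.WriteLine(await promised);
```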

HOW DO WE CONSUME TASKS?

  • Wait(), Result - blocking, should be avoided in most cases.
 Task<int> a = Task.Run(() => CountToTen());
 Launch(a.Result);
  • ContinueWith() - a continuation lets us schedule a new task that receives the completed original task as an argument.

Task<int> a = Task.Run(() => CountToTen());
a.ContinueWith(t => Launch(t.Result));

    HOW DO WE CONSUME TASKS?

    • A Task wraps not only the result, but also the exception if one has occurred
     var someTask = Task.Run(() => doSomethingDangerous());
     someTask.ContinueWith(t =>
     {
         if (t.IsFaulted)
             Log(t.Exception);
         else
             Launch(t.Result);
     });
    • Could also be written as:
     someTask.ContinueWith(t => Log(t.Exception), TaskContinuationOptions.OnlyOnFaulted);
     someTask.ContinueWith(t => Launch(t.Result), TaskContinuationOptions.OnlyOnRanToCompletion);
    • A Task can only resolve once; its completed state is final.

    Task continuation

     var firstTask = Task.Run(() => GetSomeData());
     var secondTask = firstTask.ContinueWith(t => ProcessSomeData(t.Result));
     var thirdTask = secondTask.ContinueWith(t => sendProcessedData(t.Result));

    • Every continuation wraps a function delegate and returns a new Task with the result of that function.

    TASK Continuation

    • Tasks can be passed around as objects
    void Click_GetDataButton(Object o, EventArgs e)
    {
        GetData().ContinueWith(t => showMessage(t.Result));
    }
    Task<string> GetData()
    {
        Task<IDbClient> firstTask = Task.Run(() => getDBClient());
        Task<IEnumerable<string>> secondTask = firstTask.ContinueWith(t => t.Result.query(someQuery));
        Task<string> thirdTask = secondTask.ContinueWith(t => someMerge(t.Result));
        return thirdTask;
    }

    Run Tasks in Parallel

    Task<int> firstParallelTask = Task.Run(() => ComputeOneThing());
    Task<int> secondParallelTask = Task.Run(() => ComputeOtherThing());
    Task<int[]> allTasks = Task.WhenAll(firstParallelTask, secondParallelTask);
    We can also use continuation
    Task<int> mergeTask = allTasks.ContinueWith(t => SomeMerge(t.Result[0], t.Result[1]));
    And we can still pass it between methods
    return mergeTask; // wraps the result of the merge operation

    Other "nice to have"



    • Cancellation - we can pass a cancellation token to abort an operation (for example, downloading a large file). Tokens can propagate along the continuation chain.

    • Progress - the IProgress<T> interface lets us notify the caller about progress updates (for example, while downloading a large file).
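    A minimal sketch of both features together. DownloadAsync is a hypothetical stand-in for a large download (ten delayed "chunks"), not a real API; the CancellationToken and IProgress<int> parameters are the parts that matter.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical long-running operation: "downloads" 10 chunks.
static async Task<int> DownloadAsync(IProgress<int> progress, CancellationToken token)
{
    int received = 0;
    for (int i = 1; i <= 10; i++)
    {
        token.ThrowIfCancellationRequested(); // cooperative cancellation point
        await Task.Delay(50, token);          // the token also flows into the awaited call
        received++;
        progress.Report(i * 10);              // notify the caller: percent done
    }
    return received;
}

var cts = new CancellationTokenSource();
var reporter = new Progress<int>(percent => Console.WriteLine($"{percent}%"));

Task<int> download = DownloadAsync(reporter, cts.Token);
cts.CancelAfter(120); // abort long before the ~500 ms the download needs

bool wasCanceled = false;
try
{
    await download;
}
catch (OperationCanceledException)
{
    wasCanceled = true;
}
Console.WriteLine($"Canceled: {wasCanceled}");
```

    Progress<T> invokes its callback asynchronously, so in a console program the percentage lines may be printed slightly after the report was made.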

    Async programming

    Threads and IO

    • Threads are memory-expensive
    • Context switches are expensive
    • Using threads, we trade efficiency for responsiveness
    • IO operations are long; if we create a thread per IO operation, we waste lots of memory and CPU on thread management.
    • In common web application servers, we have at least one thread per incoming request, and it simply waits for the DB.
    • So it doesn't scale that well.

    Threads and IO

    Why is it the default programming model for the web?

      • For years, the web has been one of the lowest levels of programming.
      • It's simpler and safer.
      • Writing asynchronous code is more difficult, and you can easily find yourself in callback hell.
      • The DB was believed to be the bottleneck in most web applications, so the benefit of this optimization was simply not worth it in most cases.
      • The web consisted mainly of request-response operations,
        with no long-lasting operations

    Threads and IO

    This is no longer the case...

    C10k problem - allowing servers to handle 10K concurrent connections.

    Solutions to this problem that are based on async IO:
    Nginx, Node.js, Play, Netty, EventMachine, Lighttpd, Tornado, OTP, IIS...
    They span many different languages and OS environments,
    and all of them use async IO.

    WhatsApp managed to scale to 2 million connections on a single machine back in 2012 using Erlang and FreeBSD.

    So, how does async IO work?

    • The idea is that instead of sending an IO request and waiting for the response, we put the request in a queue and get a notification with the data when it's ready.
    • We can think of it simply as a callback; the underlying implementation can differ based on the runtime,
      OS and/or hardware drivers.

    So, how does async IO look?

    In the past, .NET had two patterns for async IO:

    • APM - Asynchronous Programming Model
      • BeginXXX, EndXXX, IAsyncResult, AsyncDelegate

    • EAP - Event-based Asynchronous Pattern
      • WebClient.DownloadStringCompleted += ()...


    Since .NET 4.5, we have a new pattern, TAP

    Task-based asynchronous pattern

    • Introduced in .NET 4.5, based entirely on Tasks
    • IO classes/clients have methods with an Async/TaskAsync suffix.
    • All of those methods return a Task or Task<T>.
    • For example: WebClient.DownloadStringTaskAsync(url)
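    A minimal sketch of calling a TAP method. To keep it runnable without a network, it swaps the WebClient call for a local file read: StreamReader.ReadToEndAsync follows the same naming and return-type convention. CountCharsAsync and the temp file are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper: reads a file through the Task-based API.
static async Task<int> CountCharsAsync(string filePath)
{
    using var reader = new StreamReader(filePath);
    // ReadToEndAsync returns Task<string>; await resumes when the read completes.
    string text = await reader.ReadToEndAsync();
    return text.Length;
}

string tempPath = Path.GetTempFileName();
await File.WriteAllTextAsync(tempPath, "hello TAP");

int length = await CountCharsAsync(tempPath);
File.Delete(tempPath);

Console.WriteLine(length);
```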

    TASK-BASED ASYNCHRONOUS PATTERN

  • There are many benefits to this approach:
    • Tasks have a final state, so you can attach a continuation at any time.
    • They wrap the result of the operation, so they can be passed around as objects.
    • Tasks are composable: as we saw, we can join tasks together and create new tasks
    • Tasks wrap exceptions, and are easier to debug
    • Tasks support cancellation
    • And a bit more...

    FUTURES And promises

    • TaskCompletionSource - lets us create an object that can resolve an underlying task.

    public Task<double> CheckWeatherInTheLastMinute()
    {
        var weatherTask = new TaskCompletionSource<double>();
        var list = new List<int>();
        Timer timer = null; // declared first so the callback can dispose it
        timer = new Timer(_ =>
        {
            list.Add(MeasureCurrentTemperature());
            if (list.Count == 60)
            {
                timer.Dispose();
                weatherTask.SetResult(list.Average()); // Average() returns a double
            }
        }, null, 0, 1000);
        return weatherTask.Task;
    }

    Futures and promises are powerful

    and they are everywhere...

    • JavaScript - jQuery Deferred, ES6 Promises, Q...
    • Java - Future, Java 8 CompletableFuture, Google Guava ListenableFuture, SettableFuture
    • .NET - Tasks & TaskCompletionSource
    • C++ - std::future, std::promise
    • There are implementations for almost every language.

      http://en.wikipedia.org/wiki/Futures_and_promises

    C# ASYNC-AWAIT

    Futures are cool but...

    Continuation can be a pain

    • Writing in a synchronous manner is still simpler
    • Even with continuations and named methods, async code can still be ugly and less readable
    ReadString()
        .ContinueWith(t => ConnectToDb(t.Result)).Unwrap()
        .ContinueWith(t => RetrieveData(t.Result)).Unwrap()
        .ContinueWith(t => HandleResponse(t.Result));
    • Exception handling is less intuitive and can be confusing.
    • But since .NET 4.5, we have a better option: async/await!

    An example

      public Task<int> CountSomething()
      {
          Task<string> conStringTask = ReadConnectionStringAsync();
          Task<IDbProvider> dbTask = conStringTask.ContinueWith(
              t => ConnectToDbAsync(t.Result),
              TaskContinuationOptions.OnlyOnRanToCompletion).Unwrap();
          Task<int> queryTask = dbTask.ContinueWith(
              t => t.Result.QueryAsync("count * from something"),
              TaskContinuationOptions.OnlyOnRanToCompletion).Unwrap();
          return queryTask;
      }

    C# ASYNC/AWAIT

    • Implements continuations in a synchronous-looking style using code generation.
    • Simple, with two magic keywords: async and await.
    • Same example:
      public async Task<int> CountSomething()
      {
         IDbProvider db; // declared outside the try so the query below can use it
         try
         {
              string conString = await ReadConnectionStringAsync();
              db = await ConnectToDbAsync(conString);
         }
         catch (Exception e)
         {
            // handle exception, then rethrow: without a connection we cannot continue
            throw;
         }
         int someCalc = await db.QueryAsync("count * from something");
         return someCalc;
      }

    What does async mean?

    Putting the async modifier on a method or a lambda means that:

    • We can use await in that method body
    • We are going to return a Task (async void is the exception)
    • The Task object that wraps the result is generated automatically by the compiler

    What does await mean?

    We can use await only:
  • In async methods and lambdas
  • On Task objects*
  • When we await a Task,
          we extract the result of the completed task.
  •   Color favoriteColor = await GetUserFavoriteColorFromDb(userId);
  • When we use await, the method yields execution (without blocking the thread) until the awaited task completes.

    * Objects that implement the method GetAwaiter()
    can also be awaited.
    For example, AsyncLazy and Rx.NET IObservable.

    Parallel async-await

    public async Task<int> SumSomeData(string key1, string key2) // key types are assumed to be string
    {
       IDbProvider db; // declared outside the try so the reads below can use it
       try
       {
         string conString = await ReadConnectionStringAsync();
         db = await ConnectToDbAsync(conString);
       }
       catch (Exception e)
       {
          // handle exception, then rethrow: without a connection we cannot continue
          throw;
       }
       Task<int> someDataTask = db.ReadAsync(key1);
       Task<int> otherDataTask = db.ReadAsync(key2);
       // Both tasks run in parallel!
       int sum = (await someDataTask) + (await otherDataTask);
       return sum; // use var instead of explicit types in real code
    }

    So, should we use it?

    DEFINITELY yes!

    • async-await removes most of the shortcomings of writing async code.
    • It makes the flow clearer and makes exception handling much easier.
    • Using "async all the way" significantly improves scalability and, in some cases, performance.
    • Tasks and futures are a really good abstraction for writing highly concurrent apps, and async/await works with them perfectly.
    • The only exceptions are heavily optimized code, or legacy code where migration won't bring enough benefit.

    Practices and pitfalls

    • In an async context we must not use blocking code: a Task represents an async operation, and since we didn't use Task.Run, we share the execution context with other async methods.

    Practices and pitfalls

    • Which means:
      • Thread.Sleep -> use await Task.Delay()
      • Locks -> if you have to lock, use an async-friendly mechanism such as SemaphoreSlim
      • Parallel.For/ForEach -> use await Task.WhenAll
      • Lazy<T> -> use AsyncLazy<T>
      • IO operations -> use the async versions
      • MVC filters -> use Web API async filters
      • Task.WaitAll/WaitAny -> use await Task.WhenAll/WhenAny
    • Any other blocking operation that doesn't have an async implementation should run with await Task.Run()
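    A minimal sketch of three of these replacements in one place: SemaphoreSlim.WaitAsync instead of a lock, Task.Delay instead of Thread.Sleep, and Task.WhenAll instead of a blocking wait. The counter and IncrementAsync are assumptions made for the example.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var gate = new SemaphoreSlim(1, 1); // async-friendly mutual exclusion
int counter = 0;

async Task IncrementAsync()
{
    await gate.WaitAsync();       // instead of lock (...) { }
    try
    {
        int current = counter;
        await Task.Delay(10);     // instead of Thread.Sleep(10)
        counter = current + 1;    // safe: only one caller is inside the gate
    }
    finally
    {
        gate.Release();
    }
}

// instead of Parallel.For or Task.WaitAll
await Task.WhenAll(Enumerable.Range(0, 5).Select(_ => IncrementAsync()));

Console.WriteLine(counter);
```

    Without the semaphore, the five read-delay-write sequences would interleave and some increments would be lost.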

      PRACTICES AND PITFALLS

      • APIs and interfaces that deal with asynchronous operations should always work with Tasks.
      • The async keyword is only relevant for implementations, not for interface declarations.
      • A Task-based API should have only one return value, wrapped in the Task object.
      • No 'out' arguments.
      • We should avoid mutating objects as much as possible.
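      One possible way to follow the "single return value, no out arguments" rule is to wrap everything in one value, for example a tuple. This is a sketch under that assumption; TryGetSettingAsync and the settings dictionary are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Instead of "bool TryGet(string key, out string value)", return one
// value that carries both pieces of information inside the Task.
static async Task<(bool Found, string Value)> TryGetSettingAsync(
    IReadOnlyDictionary<string, string> store, string key)
{
    await Task.Yield(); // stands in for real async IO
    if (store.TryGetValue(key, out string stored))
        return (true, stored);
    return (false, null);
}

var settings = new Dictionary<string, string> { ["region"] = "eu-west" };

var hit = await TryGetSettingAsync(settings, "region");
var miss = await TryGetSettingAsync(settings, "zone");

Console.WriteLine($"{hit.Found} {hit.Value}, {miss.Found}");
```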

      Practices and pitfalls

      • Async methods are the simplest way to create tasks
      • TaskCompletionSource<T> -> can be used to translate EAP, timer-based, or any other asynchronous mechanism to Tasks; mainly used for IO operations.
      • Task.Run -> should be used for CPU-bound operations.
      • Task.Run -> can also be used as a temporary solution when migrating older code.
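      A minimal sketch of translating an event-based API into a Task with TaskCompletionSource<T>. It uses the Elapsed event of System.Timers.Timer as the event source; NextTickAsync is a hypothetical helper written for the example.

```csharp
using System;
using System.Threading.Tasks;
using System.Timers;

// Wraps the next Elapsed event of a timer in a Task<DateTime>.
static Task<DateTime> NextTickAsync(System.Timers.Timer timer)
{
    var tcs = new TaskCompletionSource<DateTime>();
    ElapsedEventHandler handler = null;
    handler = (sender, args) =>
    {
        timer.Elapsed -= handler;          // one-shot: unsubscribe first
        tcs.TrySetResult(args.SignalTime); // resolve the task from the event
    };
    timer.Elapsed += handler;
    return tcs.Task;
}

var ticker = new System.Timers.Timer(100); // fires every 100 ms
Task<DateTime> tick = NextTickAsync(ticker);
ticker.Start();

DateTime signaled = await tick;            // consume the event like any other task
ticker.Stop();
ticker.Dispose();

Console.WriteLine($"Timer fired at {signaled:O}");
```

      TrySetResult is used rather than SetResult so that a second event arriving before the handler is removed cannot throw.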

      Practices and pitfalls

      • Task objects mustn't be null.
      • Remember! Async methods always return a Task.
      public async Task<string> GetSomeString()
      {
          return null; // still returns a Task that wraps the value null
      }
      • Non-async methods that return a Task should never return a null Task.

      public Task<string> GetSomeString()
      {
          // return null; -> don't use!
          return Task.FromResult<string>(null);
      }

      Practices and pitfalls

      • Don't use the Task constructor.
      • Async all the way - avoid consuming tasks in a blocking way. Don't use Task.Result, Task.Wait(), Task.WaitAll() or Task.WaitAny(): in simple cases they degrade performance, and in the worst case they can cause deadlocks due to the SynchronizationContext.
      • Unit tests are the exception.

      PRACTICES AND PITFALLS

      • Use FakeItEasy in testing - TypeMock doesn't play well with Task objects.
      • Tasks with a result can easily be faked via Task.FromResult
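      A minimal sketch of faking an async dependency with Task.FromResult. To stay self-contained it uses a plain delegate as the dependency instead of a mocking library; GetOrderTotalAsync and the price lookup are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical code under test: depends on an async price lookup.
static async Task<decimal> GetOrderTotalAsync(
    Func<string, Task<decimal>> getPriceAsync, string symbol, int quantity)
{
    decimal price = await getPriceAsync(symbol);
    return price * quantity;
}

// In a test, the fake returns an already-completed task: no IO, no threads.
Func<string, Task<decimal>> fakePrices = _ => Task.FromResult(10.5m);

decimal total = await GetOrderTotalAsync(fakePrices, "MSFT", 4);
Console.WriteLine(total);
```

      The same idea applies with a mocking library: configure the faked method to return Task.FromResult(value).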

      practices and pitfalls

      • Async can also be used in lambdas
      double avg = (await Task.WhenAll(StocksLists.Select(async stock => await GetCurrentPriceAsync(stock)))).Average();
      • Or even simpler, with a custom WhenAll() extension method:
      double avg = (await StocksLists.Select(async stock => await GetCurrentPriceAsync(stock)).WhenAll()).Average();

      
      

      Soluto use cases

      • TableDataContext -> has a task-based API that wraps the Azure Storage async methods.
      • LiveRole -> queues messages to the agent; was made async to improve scalability.
      • LiveStateWeb -> checks the live state of an agent; created as an async ASP.NET MVC role. We also use TaskCompletionSource to wrap the event-based async messaging API.
      • AnalyticsListenerRole -> async ASP.NET MVC role that handles incoming analytics events.
      • Mobile View -> all mobile view endpoints were converted to async; it's still not optimal, though, because of the heavy use of await Task.Run on older, non-task-based APIs.

      QUESTIONS?


      TPL, Async & Soluto

      By yshayy
