Agenda
- TPL in brief
- Tasks
- Async
- Promises and async/await
- Practices and pitfalls
- Soluto use cases
Task Parallel Library
- Introduced in .NET 4.0
- Aimed to leverage multi-core processors
- Better abstractions to design concurrent applications.
- Designed to handle concurrency challenges better
- Synchronization - locking, deadlocks, race conditions
- Debugging
- Code Complexity
Task Parallel Library, a peek inside...
- Provides data structures designed for concurrency
- ConcurrentQueue<T>, ConcurrentDictionary<TKey, TValue>, Lazy<T>
- Provides data-parallelism capabilities
- AsParallel, PLINQ...
- Partition the data
- Distribute the work on the data across multiple threads or cores
- Task parallelism capabilities ...
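The data-parallelism and concurrent-collection bullets above could be sketched roughly as follows. The class and method names (DataParallelismSketch, SumOfSquares, CountByLastDigit) are made up for illustration; only AsParallel, Parallel.For and ConcurrentDictionary come from the library.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class DataParallelismSketch
{
    // PLINQ: AsParallel() partitions the range and squares the items on several cores.
    public static long SumOfSquares(int count)
    {
        return Enumerable.Range(1, count)
            .AsParallel()
            .Select(n => (long)n * n)
            .Sum();
    }

    // ConcurrentDictionary: many threads update shared counters without explicit locks.
    public static ConcurrentDictionary<int, int> CountByLastDigit(int count)
    {
        var counts = new ConcurrentDictionary<int, int>();
        Parallel.For(0, count, i =>
            counts.AddOrUpdate(i % 10, 1, (key, current) => current + 1));
        return counts;
    }
}
```

Calling DataParallelismSketch.SumOfSquares(10) sums 1..100 squares in parallel partitions, and CountByLastDigit(100) ends up with ten entries of ten each, regardless of how the iterations were scheduled.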
TASK PARALLELISM
- Break program into multiple operations
- Run operations in parallel
- Combine operations with continuation
- Schedule and keep track of ongoing operations
TASKS
- Task<T> - represents a T that is being computed asynchronously and will be available in the future, if it isn't already
- Task - same as Task<T>, but without a return value, a bit like Task<void>
How do we create tasks?
- Using Task.Run
- Task.Run( ()=> DoOtherThing() )
- Run code on TaskScheduler (ThreadPool)
- Task.FromResult()
- From other tasks
- From promises (TaskCompletionSource)
HOW DO WE CONSUME TASKS?
- Wait(), Result - blocking; should be avoided in most cases.
var a = Task.Run( ()=> CountToTen());
Launch(a.Result);
- ContinueWith() - continuation; lets us schedule a new task that gets the original completed task as an argument.
var a = Task.Run( ()=> CountToTen());
a.ContinueWith( t => Launch(t.Result));
HOW DO WE CONSUME TASKS?
- A task wraps not only the result, but also the exception, if one has occurred.
var someTask = Task.Run(()=>doSomethingDangerous());
someTask.ContinueWith((t)=> {
    if (t.IsFaulted)
        Log(t.Exception);
    else
        Launch(t.Result);
});
- Could also be written as:
someTask.ContinueWith( t=> Log(t.Exception), TaskContinuationOptions.OnlyOnFaulted);
someTask.ContinueWith( t=> Launch(t.Result), TaskContinuationOptions.OnlyOnRanToCompletion);
- A task can complete only once; after that, its state is final.
Task continuation
var firstTask = Task.Run(()=>GetSomeData());
var secondTask = firstTask.ContinueWith((t) => ProcessSomeData(t.Result));
var thirdTask = secondTask.ContinueWith((t)=>sendProcessedData(t.Result));
- Every continuation wraps a Func delegate and returns a new Task holding the result of that function.
Task continuation
- Tasks can be passed around as objects
void Click_GetDataButton(Object o, EventArgs e)
{
    GetData().ContinueWith((t)=>showMessage(t.Result));
}
Task<string> GetData()
{
    Task<IDbClient> firstTask = Task.Run(()=>getDBClient());
    Task<IEnumerable<String>> secondTask = firstTask.ContinueWith((t)=>t.Result.query(someQuery));
    Task<string> thirdTask =
        secondTask.ContinueWith((t)=> someMerge(t.Result));
    return thirdTask;
}
Run tasks in parallel
Task<int> firstParallelTask = Task.Run( ()=> ComputeOneThing() );
Task<int> secondParallelTask = Task.Run( ()=> ComputeOtherThing() );
Task<int[]> allTasks = Task.WhenAll(firstParallelTask, secondParallelTask);
We can also use a continuation
Task<int> mergeTask = allTasks.ContinueWith( (t) => SomeMerge(t.Result[0], t.Result[1]));
And we can still pass it between methods
return mergeTask; // wraps the result of the merge operation
Other "nice to have" features
- Cancellation - we can pass a cancellation token to abort an operation. Tokens can propagate along the continuation chain (for example, when downloading a large file).
- Progress - the IProgress<T> interface lets us notify the caller about progress updates (for example, when downloading a large file).
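A minimal sketch of how cancellation and progress reporting might be combined. DownloadSketch and DownloadAsync are hypothetical names, and Task.Delay stands in for the real network IO of a large download.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DownloadSketch
{
    // Simulates downloading `totalChunks` chunks; returns how many chunks completed.
    public static async Task<int> DownloadAsync(
        int totalChunks, IProgress<int> progress, CancellationToken token)
    {
        for (int chunk = 1; chunk <= totalChunks; chunk++)
        {
            token.ThrowIfCancellationRequested();        // cooperative cancellation point
            await Task.Delay(10, token);                 // stands in for real network IO
            progress?.Report(chunk * 100 / totalChunks); // percent done so far
        }
        return totalChunks;
    }
}
```

A caller would typically create a CancellationTokenSource, pass its Token together with a new Progress<int>(percent => ...), and call Cancel() on the source to abort; the returned task then ends in the canceled state instead of producing a result.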
Threads and IO
- Threads are memory expensive
- Context switches are expensive
- Using threads, we trade efficiency for responsiveness
- IO operations are long; if we create a thread per IO operation, we waste lots of memory and CPU on thread management.
- And in common web application servers, we have at least one thread per incoming request that simply waits for the database.
- So, it doesn't scale that well.
Threads and IO
Why is it the default programming model for the web?
- For years, web development was considered one of the lowest levels of programming.
- It's simpler and safer.
- Writing asynchronous code is more difficult, and you can easily find yourself in callback hell.
- The DB was believed to be the bottleneck in most web applications, so the benefits of this optimization were simply not worth it in most cases.
- The web mainly used request-response operations, with no long-lasting operations.
Threads and IO
This is no longer the case...
C10k problem - allowing a server to handle 10K concurrent connections.
Solutions to this problem are based on async IO:
Nginx, Node.js, Play, Netty, EventMachine, Lighttpd, Tornado, OTP, IIS...
They span many different languages and OS environments, and all of them use async IO.
WhatsApp managed to scale to 2 million connections on a single machine back in 2012 using Erlang and FreeBSD.
So, how does async IO work?
- The idea is that instead of sending an IO request and waiting for the response, we put the request in a queue and get a notification with the data when it's ready.
- We can think of it simply as a callback; the underlying implementation can differ based on the runtime, OS and/or hardware drivers.
So, what does async IO look like?
In .NET, we used to have two patterns for async IO:
- APM - Asynchronous Programming Model
- BeginXXX, EndXXX, IAsyncResult, AsyncDelegate
- EAP - Event-based Asynchronous Pattern
- WebClient.DownloadStringCompleted += ()...
Since .NET 4.5, we have a new pattern, TAP
Task-based Asynchronous Pattern
- Introduced in .NET 4.5, based entirely on Tasks
- IO classes/clients have methods with an Async/TaskAsync suffix.
- All those methods return Task<T>.
- For example: WebClient.DownloadStringTaskAsync(url)
TASK-BASED ASYNCHRONOUS PATTERN
There are many benefits to this approach:
- Tasks have a final state, so you can attach a continuation at any time.
- They wrap the result of the operation, making them passable as objects.
- Tasks are composable; as we saw, we can join tasks together to create other tasks.
- Tasks wrap exceptions, and are easier to debug
- Tasks support cancellation
- And a bit more...
Futures and promises
- TaskCompletionSource - lets us create an object that can complete an underlying task.
public Task<double> CheckWeatherInTheLastMinute()
{
    var weatherTask = new TaskCompletionSource<double>();
    var list = new List<int>();
    Timer timer = null; // declared before the lambda so the callback can dispose it
    timer = new Timer( (_) => {
        list.Add(MeasureCurrentTemperature());
        if (list.Count == 60)
        {
            timer.Dispose();
            weatherTask.SetResult(list.Average()); // Average() returns a double
        }
    }, null, 0, 1000);
    return weatherTask.Task;
}
Futures and promises are powerful
and they are everywhere...
- JavaScript - jQuery Deferred, ES6 Promises, Q...
- Java - Future, Java 8 CompletableFuture, Google Guava ListenableFuture, SettableFuture
- .NET - Tasks & TaskCompletionSource
- C++ - std::future, std::promise
- There are implementations for almost every language.
http://en.wikipedia.org/wiki/Futures_and_promises
Futures are cool but...
Continuations can be a pain
- Writing in a synchronous manner is still simpler
- Even with continuations and named methods, async code can still be ugly and less readable
ReadString()
    .ContinueWith((t)=> ConnectToDb(t.Result)).Unwrap()
    .ContinueWith((t)=> RetrieveData(t.Result)).Unwrap()
    .ContinueWith((t)=> HandleResponse(t.Result));
- Exception handling is less intuitive, and can be confusing.
- But since .NET 4.5, we have a better option: async/await!
An example
public Task<int> CountSomething()
{
    Task<string> conStringTask = ReadConnectionStringAsync();
    Task<IDbProvider> dbTask = conStringTask.ContinueWith((t)=>
        ConnectToDbAsync(t.Result), TaskContinuationOptions.OnlyOnRanToCompletion).Unwrap();
    Task<int> queryTask = dbTask.ContinueWith((t)=>
        t.Result.QueryAsync("count * from something"), TaskContinuationOptions.OnlyOnRanToCompletion).Unwrap();
    return queryTask;
}
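For comparison, the same flow could be written with async/await roughly like this. The helpers are not defined in the slides, so IDbProvider, FakeDbProvider, ReadConnectionStringAsync and ConnectToDbAsync below are stand-in stubs assumed only to make the sketch self-contained.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-ins for the helpers used on the slide.
public interface IDbProvider
{
    Task<int> QueryAsync(string query);
}

public class FakeDbProvider : IDbProvider
{
    public Task<int> QueryAsync(string query) => Task.FromResult(42);
}

public static class CountSketch
{
    static Task<string> ReadConnectionStringAsync() =>
        Task.FromResult("fake-connection-string");

    static Task<IDbProvider> ConnectToDbAsync(string connectionString) =>
        Task.FromResult<IDbProvider>(new FakeDbProvider());

    // Same flow as the continuation chain: no ContinueWith, no Unwrap.
    public static async Task<int> CountSomething()
    {
        string connectionString = await ReadConnectionStringAsync();
        IDbProvider db = await ConnectToDbAsync(connectionString);
        return await db.QueryAsync("count * from something");
    }
}
```

Each await replaces one ContinueWith(...).Unwrap() pair, and a failure in any step propagates through the returned task without extra continuation options.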
What does async mean?
Putting the async modifier on a method or a lambda means that:
- We can use await in that method body
- We are going to return a Task (async void is the exception)
- The task object that wraps the result is generated automatically by the compiler
What does await mean?
We can use await only:
- In async methods and lambdas
- On Task objects*
When we await a Task, we can extract the result of the completed task.
Color favoriteColor = await GetUserFavoriteColorFromDb(userId);
When we use await, our method yields execution until the awaited task completes.
* Objects that implement the method GetAwaiter() can also be awaited.
For example, AsyncLazy and Rx.NET IObservable.
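To illustrate the GetAwaiter() footnote, here is a small sketch that makes a TimeSpan awaitable through an extension method. TimeSpanAwaiterSketch and WaitThenGreetAsync are hypothetical names; the awaiter itself is simply borrowed from Task.Delay.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class TimeSpanAwaiterSketch
{
    // Extension GetAwaiter(): makes any TimeSpan awaitable by delegating to Task.Delay.
    public static TaskAwaiter GetAwaiter(this TimeSpan delay)
    {
        return Task.Delay(delay).GetAwaiter();
    }

    public static async Task<string> WaitThenGreetAsync()
    {
        // Compiles only because the GetAwaiter() extension above is in scope.
        await TimeSpan.FromMilliseconds(20);
        return "done";
    }
}
```

The compiler only looks for an accessible GetAwaiter() method (instance or extension) that returns a suitable awaiter, which is why types other than Task can take part in await.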
Parallel async-await
public async Task<int> SumSomeData(string key1, string key2)
{
    IDbProvider db;
    try
    {
        string conString = await ReadConnectionStringAsync();
        db = await ConnectToDbAsync(conString);
    }
    catch (Exception e)
    {
        // handle exception, then rethrow so db is never used unassigned
        throw;
    }
    Task<int> someDataTask = db.ReadAsync(key1);
    Task<int> otherDataTask = db.ReadAsync(key2);
    // Both tasks run in parallel!
    int sum = (await someDataTask) + (await otherDataTask);
    return sum; // use var instead of explicit types in real code
}
So, should we use it?
DEFINITELY yes!
- async-await removes most of the shortcomings of writing async code.
- It makes the flow clearer and provides much easier exception handling.
- Using "async all the way" significantly improves scalability, and in some cases performance.
- Tasks & Futures are a really good abstraction for writing highly concurrent apps, and async/await plays with them perfectly.
- The only exceptions are heavily optimized code, or legacy code in which migration won't produce enough benefits.
Practices and pitfalls
- When we are in an async context, we must not use blocking code. A Task represents an async operation, and since we didn't use Task.Run, we are running on a context shared with other async methods.
Practices and pitfalls
- Which means that instead of:
- Thread.Sleep -> use await Task.Delay()
- Locks -> if you have to lock, use an async-compatible mechanism such as SemaphoreSlim
- Parallel.For/ForEach -> use await Task.WhenAll
- Lazy<T> -> use AsyncLazy<T> instead
- IO operations -> use async versions.
- MVC Filters -> WebApi async Filters
- Task.WaitAll/WaitAny -> await Task.WhenAll/WhenAny
- Any other blocking operation that doesn't have async implementation should run with await Task.Run()
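A few of the replacements above, sketched together: SemaphoreSlim.WaitAsync() instead of a lock, Task.Delay instead of Thread.Sleep, and Task.WhenAll instead of a blocking wait. NonBlockingSketch, IncrementAsync and RunAsync are illustrative names, not part of any library.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class NonBlockingSketch
{
    static readonly SemaphoreSlim Gate = new SemaphoreSlim(1, 1);
    static int _counter;

    // Async-friendly critical section: WaitAsync() instead of lock { }.
    static async Task IncrementAsync()
    {
        await Gate.WaitAsync();
        try
        {
            int current = _counter;
            await Task.Delay(1);      // instead of Thread.Sleep(1)
            _counter = current + 1;   // safe: only one caller is inside the gate
        }
        finally
        {
            Gate.Release();
        }
    }

    // Starts `count` increments concurrently and awaits them all without blocking a thread.
    public static async Task<int> RunAsync(int count)
    {
        _counter = 0;
        await Task.WhenAll(Enumerable.Range(0, count).Select(_ => IncrementAsync()));
        return _counter;
    }
}
```

A plain lock statement would not even compile here, because await is not allowed inside a lock block; the semaphore gives the same mutual exclusion while letting waiting callers release their threads.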
PRACTICES AND PITFALLS
- APIs and interfaces that deal with asynchronous operations should always work with Tasks.
- The async keyword is only relevant for implementations, not for interface declarations.
- A Task-based API should have only one return value, wrapped in the task object.
- No 'out' arguments.
- We should avoid mutating objects as much as possible.
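One way these API guidelines might look in practice. IUserStore, LookupResult and InMemoryUserStore are hypothetical types made up for this sketch, and Task.Yield() stands in for real asynchronous IO.

```csharp
using System;
using System.Threading.Tasks;

// One result object instead of `out` parameters (async methods cannot have out/ref parameters).
public class LookupResult
{
    public bool Found { get; set; }
    public string Value { get; set; }
}

public interface IUserStore
{
    // No `async` keyword here: it is an implementation detail, not part of the contract.
    Task<LookupResult> TryGetNameAsync(int userId);
}

public class InMemoryUserStore : IUserStore
{
    public async Task<LookupResult> TryGetNameAsync(int userId)
    {
        await Task.Yield(); // stands in for real asynchronous IO
        return userId == 1
            ? new LookupResult { Found = true, Value = "Ada" }
            : new LookupResult { Found = false };
    }
}
```

The interface exposes only the Task-returning signature, so an implementation is free to use async/await, a TaskCompletionSource, or Task.FromResult without callers noticing.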
Practices and pitfalls
- Async methods are the simplest way to create tasks
- TaskCompletionSource<T> -> can be used to translate EAP, timer-based, or any other asynchronous mechanism into Tasks; mainly used for IO operations.
- Task.Run -> should be used for CPU-bound operations.
- Task.Run -> can also be used as a temporary solution when migrating older code.
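As a rough sketch of the TaskCompletionSource<T> bullet, an event-based API can be wrapped in a Task like this. LegacyClient is a made-up event-based class defined only for the demo, and StartAsync is a hypothetical extension method.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical event-based (EAP-style) client, defined here only for the demo.
public class LegacyClient
{
    public event Action<string> Completed;
    public event Action<Exception> Failed;

    public void Start(string input)
    {
        // Raise the event later from a thread-pool thread, as an event-based API might.
        ThreadPool.QueueUserWorkItem(_ =>
        {
            if (input == null) Failed?.Invoke(new ArgumentNullException(nameof(input)));
            else Completed?.Invoke(input.ToUpperInvariant());
        });
    }
}

public static class LegacyClientExtensions
{
    // Wraps the event pair in a Task: the TaskCompletionSource plays the "promise" role.
    public static Task<string> StartAsync(this LegacyClient client, string input)
    {
        var tcs = new TaskCompletionSource<string>();
        client.Completed += result => tcs.TrySetResult(result);
        client.Failed += error => tcs.TrySetException(error);
        client.Start(input);
        return tcs.Task;
    }
}
```

This sketch never unsubscribes its handlers, so it assumes one operation per client instance; a production wrapper would detach them once the task completes.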
Practices and pitfalls
- Task objects mustn't be null.
- Remember! Async methods always return a Task.
public async Task<string> GetSomeString()
{
    return null; // still returns a Task that wraps the value null
}
- Non-async methods that return a Task should never return a null Task.
public Task<string> GetSomeString()
{
    // return null -> don't use!
    return Task.FromResult<string>(null); // type argument needed: it can't be inferred from null
}
Practices and pitfalls
- Don't use the Task constructor.
- Async all the way - avoid consuming tasks in a blocking way. Don't use Task.Result, Task.Wait(), Task.WaitAll(), Task.WaitAny(). In simple cases this degrades performance; in the worst case it can cause deadlocks due to the SynchronizationContext.
- Unit tests are the exception.
PRACTICES AND PITFALLS
- Use FakeItEasy in testing - TypeMock doesn't play well with Task objects.
- Tasks with a result can easily be faked via Task.FromResult
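Faking a Task-returning dependency with Task.FromResult might look like the sketch below. It uses a hand-written fake rather than FakeItEasy to stay dependency-free; IPriceService, FakePriceService and PortfolioSketch are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public interface IPriceService
{
    Task<decimal> GetCurrentPriceAsync(string symbol);
}

// Hand-written fake: returns an already-completed task, no real IO involved.
public class FakePriceService : IPriceService
{
    public Task<decimal> GetCurrentPriceAsync(string symbol) => Task.FromResult(10m);
}

public static class PortfolioSketch
{
    // Code under test: depends only on the Task-based interface.
    public static async Task<decimal> TotalValueAsync(
        IPriceService prices, string symbol, int shares)
    {
        decimal price = await prices.GetCurrentPriceAsync(symbol);
        return price * shares;
    }
}
```

Because the fake's task is already completed, awaiting it continues synchronously, which keeps the test fast and deterministic.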
Practices and pitfalls
- Async can also be used in lambdas
double avg = (await Task.WhenAll(StocksLists.Select( async stock => await GetCurrentPriceAsync(stock)))).Average();
// or, with a custom WhenAll() extension method over IEnumerable<Task<T>>:
double avg = (await StocksLists.Select( async stock => await GetCurrentPriceAsync(stock)).WhenAll()).Average();
Soluto use cases
- TableDataContext -> has a Task-based API that wraps the Azure Storage async methods.
- LiveRole -> queues messages to the agent; was turned async to improve scalability.
- LiveStateWeb -> checks the live state of an agent; created as an async ASP.NET MVC role. We also use TaskCompletionSource to wrap the event-based async messaging API.
- AnalyticsListenerRole -> async ASP.NET MVC role, aimed at handling incoming analytics events.
- Mobile View -> all mobile view endpoints were converted to async; it's still not optimal, though, because of the heavy use of await Task.Run on older, non-Task-based APIs.