Written by: Igor Korotach
Thanks for the idea to: Alexander Dolgarev
How is it different from multithreading?
Async vs Multithreading
Reactor Pattern
Linux: epoll
BSD: kqueue
Windows: overlapped IO (completion ports)
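The reactor idea can be sketched with Python's standard `selectors` module, which picks an efficient readiness mechanism for the current OS (for example epoll on Linux or kqueue on BSD). This is only an illustration: `run_reactor` and `on_readable` are hypothetical names, and local socket pairs stand in for real network connections.

```python
import selectors
import socket


def run_reactor(messages):
    """Send each message over a local socket pair and collect what the handlers read."""
    selector = selectors.DefaultSelector()
    received = []
    pairs = []

    def on_readable(sock):
        # Handler called by the loop once the OS reports the socket as readable.
        received.append(sock.recv(1024))
        selector.unregister(sock)

    for message in messages:
        reader, writer = socket.socketpair()
        reader.setblocking(False)
        # Register interest in "readable" events and attach the handler as data.
        selector.register(reader, selectors.EVENT_READ, on_readable)
        writer.sendall(message)
        pairs.append((reader, writer))

    # The reactor loop: wait for events, then dispatch each one to its handler.
    while selector.get_map():
        for key, _events in selector.select(timeout=1):
            key.data(key.fileobj)

    for reader, writer in pairs:
        reader.close()
        writer.close()
    selector.close()
    return sorted(received)
```

The loop itself never blocks on a single socket; it only sleeps inside `select()` until at least one registered socket has something to deliver.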
Polling model. Non-blocking IO
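A minimal sketch of the polling model, assuming a local socket pair instead of a real connection: a non-blocking `recv()` raises `BlockingIOError` when no data is ready, so the caller can do other work and try again. The function names `poll_until_ready` and `demo` are made up for this illustration.

```python
import socket


def poll_until_ready(reader, writer, payload):
    """Poll a non-blocking socket until data arrives; return (data, attempts)."""
    reader.setblocking(False)
    attempts = 0
    sent = False
    while True:
        attempts += 1
        try:
            # Returns immediately: either with data or with BlockingIOError.
            return reader.recv(1024), attempts
        except BlockingIOError:
            # Nothing to read yet, so the program is free to do other work.
            # Here the "other work" is sending the payload exactly once.
            if not sent:
                writer.sendall(payload)
                sent = True


def demo():
    reader, writer = socket.socketpair()
    try:
        return poll_until_ready(reader, writer, b"ping")
    finally:
        reader.close()
        writer.close()
```

The first `recv()` always fails here because nothing has been sent yet, which is exactly the situation polling is meant to handle without blocking the thread.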
Multiplexing IO is even cooler!
Some Python code, at last!
async def sock_recv(self, sock, n):
    """Receive data from the socket.

    The return value is a bytes object representing the data received.
    The maximum amount of data to be received at once is specified by
    nbytes.
    """
    if self._debug and sock.gettimeout() != 0:
        raise ValueError("the socket must be non-blocking")
    try:
        return sock.recv(n)
    except (BlockingIOError, InterruptedError):
        pass
    fut = self.create_future()
    fd = sock.fileno()
    self.add_reader(fd, self._sock_recv, fut, sock, n)
    fut.add_done_callback(
        functools.partial(self._sock_read_done, fd))
    return await fut
An example in action:
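import asyncio
import socket
import ssl


class MySocket:
    def __init__(self):
        self.__host = 'mysite.com'
        self.__port = 4637
        self.__recv_handler = None
        sock = socket.socket(socket.AF_INET)
        context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
        self.__conn = context.wrap_socket(sock, server_hostname=self.__host)

    def connect(self):
        # Blocking connect, then start the receive loop as a background task.
        self.__conn.connect((self.__host, self.__port))
        self.__recv_handler = asyncio.ensure_future(self.__recive())

    def send(self, data):
        self.__conn.write(data.encode())

    async def __recive(self):
        # Note: newer asyncio versions raise TypeError when loop.sock_recv()
        # is given an ssl.SSLSocket; a plain non-blocking socket or
        # asyncio.open_connection(ssl=...) may be needed there.
        while True:
            data = await asyncio.get_event_loop().sock_recv(self.__conn, 256)
            data = data.decode('utf-8')
            print('<< ' + data)


async def main():
    my_sock = MySocket()
    my_sock.connect()
    my_sock.send('ping')
    # Without "await" the sleep coroutine is created but never runs.
    await asyncio.sleep(0.2)


asyncio.run(main())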
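For readers without access to the remote host, here is a self-contained variant of the same idea. It is a sketch under stated assumptions: a local `socket.socketpair()` replaces the TLS connection, and `echo_once` is a hypothetical helper name. It uses the real `loop.sock_sendall()` and `loop.sock_recv()` coroutines on plain non-blocking sockets.

```python
import asyncio
import socket


async def echo_once(message: bytes) -> bytes:
    """Send a message through a local socket pair and await it on the other end."""
    loop = asyncio.get_running_loop()
    client, server = socket.socketpair()
    # The loop.sock_* methods expect non-blocking sockets.
    client.setblocking(False)
    server.setblocking(False)
    try:
        await loop.sock_sendall(client, message)
        # Suspends this coroutine until the event loop sees the socket readable.
        return await loop.sock_recv(server, 256)
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(asyncio.run(echo_once(b"ping")))
```

While `echo_once` is suspended in `sock_recv`, the event loop is free to run any other scheduled coroutine.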
Okay, that's easy. Let's take another example, this time with file system operations:
import asyncio
from aiofile import AIOFile


async def main():
    async with AIOFile("/tmp/hello.txt", 'w+') as afp:
        await afp.write("Hello ")
        await afp.write("world", offset=7)
        await afp.fsync()
        print(await afp.read())


loop = asyncio.get_event_loop()
loop.run_until_complete(main())
import asyncio
from functools import partial


def run_in_thread(func, *args, **kwargs) -> asyncio.Future:
    # Run a blocking function in the event loop's default thread pool executor.
    loop = kwargs.pop('loop')  # type: asyncio.AbstractEventLoop
    assert not loop.is_closed(), "Event loop is closed"
    assert loop.is_running(), "Event loop is not running"
    return loop.run_in_executor(None, partial(func, *args, **kwargs))
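The same thread-offloading technique can be applied directly to an ordinary blocking file read. The sketch below is an assumption-laden illustration, not part of any library: `read_text`, `read_text_async`, and `demo` are hypothetical names, and a temporary directory is used so the example does not touch real files.

```python
import asyncio
import os
import tempfile
from functools import partial


def read_text(path, encoding="utf-8"):
    """Plain blocking file read."""
    with open(path, encoding=encoding) as fp:
        return fp.read()


async def read_text_async(path):
    """Run the blocking read in the default executor so the event loop stays free."""
    loop = asyncio.get_running_loop()
    # run_in_executor accepts only positional arguments, hence functools.partial.
    return await loop.run_in_executor(None, partial(read_text, path, encoding="utf-8"))


async def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hello.txt")
        with open(path, "w", encoding="utf-8") as fp:
            fp.write("Hello world")
        return await read_text_async(path)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

This does not make the disk access itself asynchronous; it only moves the blocking call to a worker thread, which is the trade-off the thread-based approach accepts.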
Linux: io_submit(2)
BSD: aio(4)
Windows: no direct analogue
1. You should always use O_DIRECT or it defeats the purpose.
2. You need to minimize the number of file system metadata operations that can block and/or bypass the file system altogether.
3. You need to figure out the best way to schedule all of the disk I/O operations you are now responsible for. Background writing, prefetching, and the other techniques used to optimize disk performance are now part of your implementation.
http://man7.org/linux/man-pages/man2/io_submit.2.html
Finally, make a pure Python wrapper
Presentation link: https://slides.com/emulebest/async-fs-python
LinkedIn: https://www.linkedin.com/in/igor-korotach-806435154/