Sunday, May 29, 2005

Asynchronous I/O - Options

I've seen three options for asynchronous I/O that are at least halfway viable. The options differ not in how I/O requests are made, but in how the application is notified that an I/O request has completed (successfully or unsuccessfully).

The first is event-based notification. Event-based notification requires that the calling program create an event for each read or write request. The OS will remember which event goes with which request, and then set that event when the operation has completed. Afterwards, the application may call a function to get the status of the request (whether or not it has failed, and how much was read or written).

The usefulness of this isn't readily apparent. It would be far too cumbersome for most asynchronous I/O: the application would have to remember every request it made and then poll the events periodically. Allocating a large number of events for this purpose would also be a huge waste of resources.

Yet it is useful for one thing: simulating synchronous I/O. I don't know if this is the case with the POSIX APIs, but in Windows, a file is opened for either synchronous or asynchronous access, and you cannot mix the two kinds of calls on the same file. Consequently, if there is a place in the application where you need to do I/O synchronously on a file opened asynchronously (such as verifying that a file's header is valid), event-based notification lets you do so: issue the request, then wait on its event.
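Here is a minimal sketch of that idea in Python. It is a user-mode simulation, not the Windows API: a background thread stands in for the OS, and the names Request, start_read, and read_sync are hypothetical, made up for illustration. Each request carries its own event, and the "synchronous" read is just an asynchronous request followed by a wait on that event.

```python
import threading

class Request:
    """Hypothetical record pairing one I/O request with its event."""
    def __init__(self):
        self.done = threading.Event()  # set when the operation completes
        self.result = b""
        self.bytes_transferred = 0
        self.error = None

def start_read(data, size):
    """Begin a simulated asynchronous read and return immediately."""
    req = Request()

    def worker():
        # Stand-in for the OS performing the read in the background.
        try:
            req.result = data[:size]
            req.bytes_transferred = len(req.result)
        except Exception as exc:
            req.error = exc
        finally:
            req.done.set()  # notify whoever is waiting on this request

    threading.Thread(target=worker).start()
    return req

def read_sync(data, size):
    """Simulate synchronous I/O on top of the asynchronous call."""
    req = start_read(data, size)
    req.done.wait()  # block until the event is set
    if req.error is not None:
        raise req.error
    return req.result

# Example: check a file header synchronously.
header = read_sync(b"HDR1rest-of-file", 4)
```

The caller of read_sync never sees the event; it exists only for the duration of the one request, which sidesteps the bookkeeping problem described above.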

The second method of asynchronous I/O notification is completion callback routines. In this case, the method of notification is a callback function which the application passes to the read/write function, and the OS calls when the I/O completes, with the parameters and the completion status of the request. This is a very generic, very versatile method, but the way it's implemented in various OSes leaves something to be desired.
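A rough sketch of the callback style, again as a Python simulation rather than any real OS interface. The function read_async, its context parameter, and the callback signature (context, error, chunk) are assumptions chosen for illustration; the point is only that the caller hands over a function and the completion status arrives as its arguments.

```python
import threading

def read_async(data, size, on_complete, context=None):
    """Start a simulated read; call on_complete(context, error, chunk) when done."""
    def worker():
        error, chunk = None, b""
        try:
            chunk = data[:size]  # stand-in for the actual I/O
        except Exception as exc:
            error = exc
        # The callback runs on the worker thread, not the caller's thread.
        on_complete(context, error, chunk)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread

results = []

def on_complete(context, error, chunk):
    # Record which request finished, whether it failed, and how much was read.
    results.append((context, error, len(chunk)))

pending = read_async(b"hello world", 5, on_complete, context="req-1")
pending.join()  # only so the example finishes deterministically
```

Note that in this sketch the callback runs on whatever thread completed the I/O, so it has to be written with that in mind.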

The final method is something of a cross between event-based and completion callback notifications: the notification port. A notification port is essentially a message queue to which I/O completion notifications are posted. Threads can retrieve a notification from the port in a first-in-first-out manner, and can wait on the port if no notifications are remaining.

The really cool thing about notification ports (although it's common to other message queues, as well) is that more than one thread can wait on one port simultaneously. This makes it trivial to set up a thread pool whose threads do the processing that follows each completed I/O.
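To make that concrete, here is a small sketch that uses Python's queue.Queue as a stand-in for a notification port. It is a user-mode approximation, not an I/O completion port, and the notification format (a key plus a byte count) and the STOP sentinel are assumptions for the example. Several threads block on the same queue, and each notification is handed to exactly one of them.

```python
import queue
import threading

port = queue.Queue()        # stand-in for the notification port (FIFO)
processed = []              # what the pool has handled so far
processed_lock = threading.Lock()
STOP = object()             # hypothetical sentinel telling a worker to exit

def pool_worker():
    while True:
        note = port.get()   # blocks while the port is empty
        if note is STOP:
            break
        key, nbytes = note
        # Whatever processing follows the I/O would go here.
        with processed_lock:
            processed.append((key, nbytes))

# Four threads all waiting on the one port.
workers = [threading.Thread(target=pool_worker) for _ in range(4)]
for w in workers:
    w.start()

# Simulate ten I/O completions being posted to the port.
for i in range(10):
    port.put((f"request-{i}", 100 + i))

# One sentinel per worker; FIFO order puts them behind all notifications.
for _ in workers:
    port.put(STOP)
for w in workers:
    w.join()
```

Which worker handles which notification is up to the scheduler, so the order of processed can vary from run to run even though the port itself is first-in-first-out.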

Windows NT introduced kernel support for notification ports (called I/O completion ports) in version 3.5. While it is possible to implement a message queue (or to marshal I/O completion notifications to a message queue) in user mode, doing it in the kernel gives some very nice features, such as optimization of CPU use. It's usually considered a good idea to create more threads in a thread pool than there are CPUs in the computer. The reason is that a thread may block on something unrelated to the notification port, such as synchronous I/O performed while processing a notification. If there are more threads than CPUs, then even when some threads are waiting, others will be ready to keep all the CPUs busy.

The problem with that is that it is wasteful in a different way. It's moderately expensive for a CPU to stop working on one thread and start working on another - this happens every time a thread's time-slice expires. If there are more threads running than CPUs to run on, this will happen more often - every 60 to 120 ms, on Windows NT. I/O completion ports were designed to solve both problems, by tracking (in the kernel) exactly when threads go into wait. If one thread from the notification pool goes into wait while processing a notification message, another thread will be run from the pool; yet no more threads in the pool than CPUs will be allowed to run concurrently. This is possible because the kernel knows exactly how many threads are running, and how many are waiting; this is something that can't reasonably be done without kernel support.
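The policy itself (cap the number of running threads, but let another one start whenever one blocks) can be approximated in user mode if every thread cooperates. The sketch below is only that approximation: RunSlots and its methods are hypothetical names, and each thread has to announce by hand that it is about to block, which is exactly the bookkeeping the kernel does automatically for a completion port.

```python
import threading
import time
from contextlib import contextmanager

class RunSlots:
    """Allow at most `limit` threads to be running (not waiting) at once."""
    def __init__(self, limit):
        self._slots = threading.Semaphore(limit)
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0       # highest number of simultaneously running threads

    def acquire(self):
        self._slots.acquire()
        with self._lock:
            self.running += 1
            self.peak = max(self.peak, self.running)

    def release(self):
        with self._lock:
            self.running -= 1
        self._slots.release()

    @contextmanager
    def blocking(self):
        """Give up the slot around a blocking call so another thread may run."""
        self.release()
        try:
            yield
        finally:
            self.acquire()

slots = RunSlots(limit=2)   # pretend the machine has two CPUs

def handle_notification():
    slots.acquire()
    try:
        with slots.blocking():
            time.sleep(0.01)    # stand-in for synchronous I/O mid-processing
    finally:
        slots.release()

threads = [threading.Thread(target=handle_notification) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A thread that blocks without going through blocking() silently holds its slot, which is why this sort of thing can't reasonably be done well outside the kernel.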
