
Friday, May 27, 2005

Fast Semaphores - Faster Than Slow

Okay. We have our semaphore, we have our event, and we have our atomic functions. Now what do we do with them? Well, you can do lots of things. Just to give you a little demonstration, we'll start off with something odd: a fast semaphore.

So what's fast? Well, fast and slow are terms used to describe where synchronization objects do their work. Remember that there are two modes the processor can be in: user mode and kernel mode. Kernel mode is where the OS (specifically the kernel) and drivers live, and user mode is where ordinary programs run. To access kernel functions through the OS API, your program must switch into kernel mode, and the kernel switches back to your program when it's done.

These jumps are not cheap. In fact, even a simple call into the kernel and back can take 1000 cycles on Windows on x86 (I don't have benchmark data for other platforms). On its own, 1000 cycles isn't much, given current processor speeds of several billion cycles per second. But when you've got something like a mutex (or semaphore) that you lock just to do a few cycles of shared variable changing, and you need to lock it many times each second, 2000 cycles (one call to enter the mutex, and another to leave it) can be quite a hefty price.
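To put rough numbers on that, here's a back-of-the-envelope sketch. The function name is mine, and the 3 GHz clock and one million lock/unlock pairs per second used in the note below are made-up figures for illustration, not measurements; only the 1000 cycles per call comes from the paragraph above.

```cpp
#include <cassert>

// Fraction of one core's cycles spent purely on kernel transitions,
// assuming every lock and every unlock is one call into the kernel.
// (Hypothetical helper for illustration only.)
inline double kernel_overhead_fraction(double lock_pairs_per_second,
                                       double cycles_per_call,
                                       double clock_hz) {
    assert(clock_hz > 0.0);
    // Two kernel calls per pair: one to enter the mutex, one to leave it.
    return lock_pairs_per_second * 2.0 * cycles_per_call / clock_hz;
}
```

With those assumed figures, `kernel_overhead_fraction(1e6, 1000.0, 3e9)` comes out to about two thirds: most of a core spent just getting in and out of the kernel, before any real work is done.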

Both the event and semaphore classes we've created so far have been slow: every call goes into kernel mode. But with atomic functions, we can do better. We're going to make fast versions of the semaphore and event that take only some 20-30 cycles when no thread has to sleep or be woken. Sound better? Well, here's how.

A semaphore is just a counter that can control threads; likewise, an event is a boolean that can control threads. The variables themselves we can manage in user mode with atomic functions. But there's one thing we'll never be able to do in user mode, and that's alter the run status of a thread (make it go to sleep or wake up); that can only be done by the kernel. Thus, by definition, a fast synchronization object stays in user mode, except when it has to put the calling thread to sleep or wake up another thread. If a thread never needs to do either of those, it stays totally in user mode. The result, if you're locking something very frequently and for very short periods of time, is a significant performance gain.
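Here's a minimal sketch of that idea, not necessarily the class this series ends up with. It uses standard C++ `std::atomic` in place of our atomic functions, and `SlowSemaphore` is a stand-in I'm assuming for the kernel semaphore (built here from a mutex and condition variable so the example is portable). All class and function names are hypothetical. The atomic counter is allowed to go negative; a negative value means that many threads are waiting, or about to wait, on the slow semaphore.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for the slow (kernel) semaphore; only touched on contention.
class SlowSemaphore {
public:
    void post() {
        std::lock_guard<std::mutex> lock(m_);
        ++count_;
        cv_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;
};

class FastSemaphore {
public:
    explicit FastSemaphore(int initial) : count_(initial) {
        assert(initial >= 0);
    }

    void wait() {
        // fetch_sub returns the old value. If it was positive we took a
        // unit entirely in user mode; otherwise we have to go to sleep.
        if (count_.fetch_sub(1) <= 0) {
            slow_.wait();
        }
    }

    void post() {
        // An old value below zero means at least one thread is waiting
        // (or about to wait), so one of them must be woken up.
        if (count_.fetch_add(1) < 0) {
            slow_.post();
        }
    }

    // Snapshot of the counter; negative values count waiting threads.
    int value() const { return count_.load(); }

private:
    std::atomic<int> count_;
    SlowSemaphore slow_;
};

// Usage demo: with one unit, the semaphore acts as a lock around a
// plain (non-atomic) counter shared by four threads.
int run_demo() {
    FastSemaphore sem(1);
    int shared = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < 10000; ++i) {
                sem.wait();
                ++shared;
                sem.post();
            }
        });
    }
    for (auto& w : workers) w.join();
    return shared;
}
```

When there's no contention, `wait()` and `post()` are each a single atomic operation and the slow semaphore is never touched. When there is contention, every trip into `slow_.wait()` is matched by exactly one `slow_.post()`, so no wake-up is lost even if the post arrives before the waiter has actually gone to sleep.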
