Okay, so. One of the primary uses of LibQ will be to support threads and thread synchronization across the Windows and POSIX platforms (I'm particularly interested in supporting Linux and OS X, but I'd be happy if it worked elsewhere as well). Since you can't do fast thread synchronization without atomic functions, that was the obvious place to start work on LibQ.
As I was already well aware of the interlocked functions on Windows (InterlockedCompareExchange, etc.), I started out by looking at the POSIX standard. Well, unless I'm missing something, there's nothing there for this purpose. So that means we have to write them ourselves.
This actually isn't as redundant as it may sound at first. Despite the fact that Windows has the interlocked series of functions, we can't really use them. A look at the requirements for InterlockedCompareExchange reveals the following:
Client: Included in Windows XP, Windows 2000 Professional, Windows NT Workstation 4.0, Windows Me, and Windows 98.
Server: Included in Windows Server 2003, Windows 2000 Server, and Windows NT Server 4.0.
No Windows 95, no NT pre-4.0. While I'm not so concerned about the lack of NT pre-4.0, I am concerned about the lack of Windows 95 support. No matter; it's not difficult to write the functions ourselves. Unfortunately, since we're going to be using assembly, we'll need a version of the functions for each architecture we want to support. For now, I'll let this slide - I can write the x86 version myself.
A quick look through the Intel x86 instruction reference shows us our palette of instructions (we don't really need to worry about using instructions other processors might not have, because most other processors are RISC, and operate in a more generic way). We have xchg (exchange), xadd (exchange add), cmpxchg (compare exchange), bts (test bit and set bit), btr (test bit and clear bit), btc (test bit and complement bit), and lock (a prefix that locks the bus for the duration of the instruction). These look like plenty to build a robust atomic operations system (although in reality, all we absolutely need is the compare exchange and lock bus - everything else can be emulated using those; this will become important later).
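To make the "everything else can be emulated" point concrete, here's a minimal sketch of exchange and exchange-add built from nothing but a compare-exchange retry loop. This is not the LibQ implementation: the names CompareExchange, EmulatedExchange and EmulatedExchangeAdd are hypothetical, and C++11's std::atomic<long> is assumed as a portable stand-in for a locked cmpxchg on a plain long.

```cpp
#include <atomic>

// Hypothetical stand-in for "lock cmpxchg": stores nSource into *lpnDest only
// if it currently holds nComparand, and returns the value seen beforehand.
long CompareExchange(std::atomic<long> *lpnDest, long nSource, long nComparand)
{
    long nExpected = nComparand;
    // On failure the current value is written into nExpected; on success
    // nExpected is left equal to nComparand, which was the old value.
    lpnDest->compare_exchange_strong(nExpected, nSource);
    return nExpected;
}

// Exchange, emulated: keep retrying until our snapshot is still current.
long EmulatedExchange(std::atomic<long> *lpnDest, long nSource)
{
    long nOld = lpnDest->load();
    for (;;)
    {
        long nSeen = CompareExchange(lpnDest, nSource, nOld);
        if (nSeen == nOld)
            return nOld;  // the swap went through
        nOld = nSeen;     // another thread got there first; retry
    }
}

// Exchange-add, emulated the same way.
long EmulatedExchangeAdd(std::atomic<long> *lpnNumber, long nAddend)
{
    long nOld = lpnNumber->load();
    for (;;)
    {
        // Add in unsigned arithmetic so overflow wraps instead of being
        // undefined behaviour.
        long nNew = (long)((unsigned long)nOld + (unsigned long)nAddend);
        long nSeen = CompareExchange(lpnNumber, nNew, nOld);
        if (nSeen == nOld)
            return nOld;
        nOld = nSeen;
    }
}
```

The pattern is always the same: read the current value, compute the new one, and only commit if nobody changed the value in between. The dedicated instructions (xchg, xadd) do the same job without the loop.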
So, the list of prototypes I whipped up looks like this:
// Adds nAddend to lpnNumber and returns old value of lpnNumber
long AtomicExchangeAdd(long volatile *lpnNumber, long nAddend);
// Sets lpnDest to nSource, and returns old value of lpnDest
long AtomicExchange(long volatile *lpnDest, long nSource);
// Sets lpnDest to nSource if lpnDest equals nComparand, and returns old value of lpnDest
long AtomicCompareExchange(long volatile *lpnDest, long nSource, long nComparand);
// Sets bit nBit in lpnDest to bSet, and returns old value of the bit (nonzero if set)
long AtomicBitExchange(long volatile *lpnDest, long nBit, long bSet);
// Toggles bit nBit of lpnDest, and returns old value of the bit (nonzero if set)
long AtomicBitExchangeCompliment(long volatile *lpnDest, long nBit);
// Adds nAddend to lpnDest if lpnDest's sign (high bit) is bSigned, and returns old value of lpnDest
long AtomicSignedExchangeAdd(long volatile *lpnDest, long nAddend, bool bSigned);
(The last one is a special-purpose emulated function; we'll see what it's used for later.)
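As a rough illustration of what the comments on two of these prototypes describe, here's a sketch of AtomicBitExchange and AtomicSignedExchangeAdd written as compare-exchange loops. It is only a guess at the semantics based on those comments, not the actual LibQ code, and it assumes std::atomic<long> in place of the long volatile * parameters so that it stays portable.

```cpp
#include <atomic>
#include <cassert>
#include <climits>

// Sets bit nBit of *lpnDest to bSet; returns nonzero if the bit was set before.
// Assumption: std::atomic<long> replaces the prototype's long volatile *.
long AtomicBitExchange(std::atomic<long> *lpnDest, long nBit, long bSet)
{
    assert(nBit >= 0 && nBit < (long)(sizeof(long) * CHAR_BIT));
    const unsigned long nMask = 1UL << nBit;
    long nOld = lpnDest->load();
    for (;;)
    {
        unsigned long nBits = (unsigned long)nOld;
        long nNew = (long)(bSet ? (nBits | nMask) : (nBits & ~nMask));
        // On failure, compare_exchange_weak reloads nOld and we go around again.
        if (lpnDest->compare_exchange_weak(nOld, nNew))
            return (nBits & nMask) != 0;
    }
}

// Adds nAddend to *lpnDest only if its sign matches bSigned (true = negative);
// returns the old value either way.
long AtomicSignedExchangeAdd(std::atomic<long> *lpnDest, long nAddend, bool bSigned)
{
    long nOld = lpnDest->load();
    for (;;)
    {
        if ((nOld < 0) != bSigned)
            return nOld;  // sign doesn't match: leave the value untouched
        // Unsigned arithmetic so overflow wraps rather than being undefined.
        long nNew = (long)((unsigned long)nOld + (unsigned long)nAddend);
        if (lpnDest->compare_exchange_weak(nOld, nNew))
            return nOld;  // nOld is unchanged on success, so it is the old value
    }
}
```

Note that the sign check sits inside the loop: if another thread flips the sign between our read and our compare-exchange, the retry sees the new value and bails out without adding.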