Search This Blog

Thursday, May 04, 2006

Very Squishy

I mentioned a few weeks ago that I had an algorithm for a lock-free memory allocator. I actually didn't look much at the algorithm for it until today, but it's pretty straightforward; allow me to make you drool a bit.

As mentioned previously, it's lock free. As with all lock free structures, that means that wasted cycles due to competition for it will be very low, even when many threads are trying to use it at once.

But the big deal is that it's zero fragmentation. Fragmentation is what happens when a small block gets freed, when there is still data allocated on both sides of the block. The block can now be reused, but only for requests for blocks the same size or smaller. This can have two bad effects. First, it can wear down your memory over time, by progressively creating smaller and smaller holes. At some point these holes cease to be useful, and they wind up being unusable allocated space. Second, it means that you have to keep a sorted list of free blocks, and go searching through it every time you need a block, to see if you can find one large enough; this makes allocations O(N) with respect to the number of free blocks.

Having a no fragmentation heap means that you can kiss both problems goodbye. You can run this thing for all eternity and you won't have any fragmentation. It's possible (if not probable) that some memory will still go unused, but that's simply because demand for blocks of a given size varies; if demand really high at one point, a bunch of memory will be allocated, and when demand drops, some of that memory will go unused. This isn't the same thing as fragmentation; the memory is still just as useful as the first time it was allocated, it just isn't being used.

Not only that, but I've got three words for you: constant time allocation. No matter how much you use this thing, it'll always be just as fast to allocate memory. Even better, this constant time is very low; likely significantly faster than a single call to malloc (the default implementation, anyway). It doesn't get much better than this.

However, there's a caveat. The simplest and surest way of implementing a zero-fragmentation allocator (and indeed what this one uses) is with a simple trick: you have separate heaps for memory of different sizes, and round allocations up, so that all blocks in a single heap are of the same size. This makes it inherently wasteful (at least when the blocks getting allocated aren't exactly the size of each block of the heap). The amount of waste may be tweaked by how you slice up your heaps. I'll have to do some playing around with this. Here's what I'm thinking at the moment for heap block sizes (note that I haven't spent much time thinking about this yet; I'm basically coming up with these numbers as I type):
- one heap for each power of two and 1.5x each power of two from 32 bytes to 8 KB
- one heap for 1, 1.25, 1.5, 1.75x each power of two from 8 KB to 64 KB
- anything larger than 64 KB gets allocated directly by the OS (VirtualAlloc, etc.)

More or less that means that for blocks 32 bytes to 8 KB, no more than 1/2 of the requested size will be wasted. For blocks 8-64 KB no more than 1/4 of the requested size will be wasted.

Increasing the number of heaps would serve to reduce the waste, but be aware that all allocations will have to locate the correct heap. That means it will do a binary search on the table of heaps, which takes O(log N) time. Also, increasing the number of heaps wastes memory in another way: all memory must eventually be allocated from the OS in large (probably 64 KB) blocks. Each heap that has at least 1 item allocated at some point will have one of these blocks associated with it. If you have a 64 KB block of memory that only has a single 32 byte allocation, that's a rather catastrophic (percentage-wise, anyway) waste of memory. On the other hand, if you're writing something like World of Warcraft, which allocates hundreds of megs of memory of widely varying sizes, a few megs of wasted space for unused blocks (1 megs would be 16 heaps) is probably worth it. So, I'll probably allow you to tweak the heap organization for your own projects as needed.

Wow, that post turned out to be way longer than I expected. Story of my life, I guess...

No comments: