Search This Blog

Tuesday, November 04, 2008

& the Audio Driver Incident

Several months ago, I (finally) upgraded my computer. My old one was a 1.8 GHz Athlon XP (single 32-bit core) with 1.25 gigs RAM and a GeForce 3; in other words, it was 2002 or 2003 hardware. My new computer is a 2.4 GHz Core 2 (quad 64-bit cores) with 4 gigs RAM and a Radeon 4850; depending on the benchmarks, my new CPU is 10-18x as fast as my old one, if you count all 4 cores. After trying various voodoo to try to get my old XP installation to run on my new computer (despite the fact that it wouldn't have been able to use about a gig of my RAM), I ultimately gave up and installed Windows Server 2008 64-bit. After dealing with a whole bunch of problems getting various stuff working with 64-bit 2008, things ultimately ended up being acceptable, and I've used that ever since.

However, a couple relatively minor problems have been pretty long-standing, and continued until a few days ago. One was easy to diagnose: Firefox was leaking memory like heck. For every day I left my computer on, Firefox would grow in RAM usage by a couple hundred megs, getting up to a good 2 gigs on occasion (I usually kill it before it gets to that point). While this was certainly an annoyance, it wasn't much of a problem, as I have 4 gigs memory, and I can simply restart it to reclaim all the leaked memory whenever it gets so large it becomes a problem.

One was much harder to diagnose, however. Something else was leaking memory in addition to Firefox, and it was not clear what was causing this. Total system memory usage would increase over days, and if you ignored Firefox, would end up using up all of my 4 gigs memory by about 2 weeks since the last reboot. Unlike with Firefox, there was no apparent problem - no single process was showing a significant accumulation of memory, nor were excess processes being created, leaving 1-2 gigs of memory I couldn't account for. So, I went several months without knowing what the problem was, usually handling it by restarting my computer every week or so.

Then, one day my dad called me from work to ask me why his computer at work was sometimes performing poorly. So I had him look through the process list and system statistics and look for memory leaks, excessive CPU usage, etc. As I don't have the exact terminology used on those pages memorized, I also opened up the listing on my computer to be sure I told him to look for the right things.

This brought something very curious to my attention: the total handle count for my computer was over 4 million. This is a VERY large number of handles; normally computers don't have more than 20-50k handles at a time - 2 orders of magnitude less than what my computer was experiencing. This was an almost certain indication that something was leaking handles on a massive scale. After adding the handles column to the process list, I found that audiodg.exe was the process with some 98% of those handles. Some looking online revealed that that process is a host for audio driver components and DRM. Some further looking for audiodg.exe and handle leaks found some reverse-engineering by one person that showed that this was due to the Sonic Focus audio driver for my Asus motherboard leaking registry handles.

Fortunately, there was an updated driver available by this time that addresses the issue. As my computer was currently at 96% RAM usage (the worst it's ever been - usually I reboot it before it gets to this point), I immediately installed the driver and restarted the audio services (of which audiodg.exe is one). This resulted in a shocking instant 1.3 gig drop in kernel memory usage to less than 400 megs total. It's been one and a half days since then, and audiodg.exe currently is using 226 handles, suggesting that the problem is either dead or drastically reduced (it has increased by like 70 handles in those 1.5 days); and even if it is still leaking handles, 50 handles a day is a tolorable leakage, as that's only like 10 k/day.

So, this whole thing revealed that Windows is quote robust. Given that most computers never go above 50k handles, I was very surprised that Windows was able to handle 6.6 million handles (the highest I've ever seen it get to) without falling over and dying (although this wouldn't have been possible with a 32-bit version of Windows, as that 1.7 gigs of kernel memory wouldn't have fit in the 2 gig kernel address space after memory-mapped devices have memory space allocated). Traditionally, Unix has had a limit of 1024 file handles per process, though I don't know what's typical these days (I know at least some Unix OS have that as a configurable option).

After pursuing that problem to its conclusion, I decided to do some more looking for handle leaks in other processes. While the average process used only 200-500 handles, a number a processes (which are not abnormally high) get as high as 2k handles. However, one process - smc.exe, a part of Symantec Antivirus - has almost 50 k handles allocated, making it a good candidate for a handle leak. Looking at the process in Process Explorer shows that a good 95% of these handles are of the same type - specifically, unnamed event handles - providing further evidence in support of handle leakage. That's as far as I've gotten so far; I haven't spent much time investigating the problem, or looking for an analysis online (though the brief searches I did didn't find anything related to this). So, that's work for the future.

1 comment:

Barry said...

I'm experienced the smc.exe handle leak. Oddly, it's only started recently. Or seems to. Affecting a few of our SQL servers in the office. None of the others are having issues, just two so far. Seems to be spreading. Hope it's not an update of some sort that is causing it!