So, I was running IE to use the web-page-based version of our product (our product is actually pretty useful, although some of the horribly buggy internal testing tools we have to use give me nightmares, and because of such problems I can no longer run the standalone version of the program). Anyway, when I finished using IE and closed it, WinDbg popped up and notified me that IE just blew up.
However, the way in which it blew up was rather unusual: it ran out of stack space, before which it threw about 1200 exceptions. In fact, it seemed apparent that it was throwing exceptions from the exception handler, as the stack revealed an equal number of KiUserExceptionDispatcher stack frames, one on top of the next.
Consulting with Skywing revealed that this was the case, as the exception it was throwing repeatedly was STATUS_NONCONTINUABLE_EXCEPTION, indicating that the exception handler was trying to continue execution following a noncontinuable exception, resulting in another exception, which tries to continue execution, and so on, until the thread runs out of stack and goes boom.
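To make that loop concrete, here is a toy model of it in C. This is only a sketch and not the actual NTDLL dispatcher: `dispatch`, `faulty_handler`, and the `max_depth` limit (standing in for the end of the stack) are made-up names. The only real value used is the 0x1 bit of EXCEPTION_NONCONTINUABLE.

```c
#include <assert.h>

#define NONCONTINUABLE 0x1u  /* value of EXCEPTION_NONCONTINUABLE */

enum { ContinueExecution = 0, ContinueSearch = 1 };

/* Stand-in for the faulty handler: it always asks to resume. */
static int faulty_handler(unsigned flags)
{
    (void)flags;
    return ContinueExecution;
}

/* Toy dispatcher (hypothetical). Returns the number of nested dispatches
   reached before the simulated stack limit. */
static int dispatch(unsigned flags, int depth, int max_depth)
{
    assert(max_depth >= 0);
    if (depth >= max_depth)
        return depth;               /* simulated stack exhaustion */
    if (faulty_handler(flags) == ContinueExecution && (flags & NONCONTINUABLE)) {
        /* Resuming is not allowed, so a fresh noncontinuable exception is
           raised from inside the dispatcher, one frame deeper. */
        return dispatch(NONCONTINUABLE, depth + 1, max_depth);
    }
    return depth;                   /* continuable: execution simply resumes */
}
```

With a noncontinuable exception the model only stops at the depth limit; with a continuable one it returns immediately at depth 0.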
However, looking waaaaaay back in the log, we can see that the original exception thrown is some unusual (not built-in) exception 0x0eedfade. As well, it appears that the original exception was thrown by RaiseException, which was called from comcasttoolbar.dll, meaning it was probably intentionally thrown, as a C++-style exception.
So, what exactly went wrong? Well, let's start by looking at the immediate cause: the exception handler that caught the exception.
00f4179d xor eax,eax
00f4179f push ebp
00f417a0 push 0xf418b8
00f417a5 push dword ptr fs:[eax]
00f417a8 mov fs:[eax],esp
That's the one, right there - 0xf418b8. However, looking in this function, it always returns ExceptionContinueSearch before the program goes boom (and in fact it stops getting called well before that); that's pretty much a positive indication that this isn't the exception handler we're looking for.
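For anyone who hasn't stared at these prologues before: the two pushes before the write to fs:[0] build a two-field registration record (previous record, handler address) and link it at the head of the thread's handler chain. Below is a rough C model of how that chain gets walked. The names `registration`, `walk_chain`, `decline`, and `resume` are invented for illustration, and the real chain ends with a sentinel value rather than NULL.

```c
#include <assert.h>
#include <stddef.h>

enum { ContinueExecution = 0, ContinueSearch = 1 };

typedef int (*handler_fn)(unsigned code);

/* "push dword ptr fs:[eax]" supplies prev; "push 0xf418b8" supplies handler. */
struct registration {
    const struct registration *prev;
    handler_fn handler;
};

/* Sample handlers for the model. */
static int decline(unsigned code) { (void)code; return ContinueSearch; }
static int resume(unsigned code)  { (void)code; return ContinueExecution; }

/* Walk from the innermost record outward. Stops at the first handler that
   does not decline and reports how many declined before it. */
static int walk_chain(const struct registration *head, unsigned code, int *declined)
{
    assert(declined != NULL);
    *declined = 0;
    for (const struct registration *r = head; r != NULL; r = r->prev) {
        int disposition = r->handler(code);
        if (disposition != ContinueSearch)
            return disposition;
        ++*declined;
    }
    return ContinueSearch;          /* nobody claimed the exception */
}
```

A handler that keeps returning ExceptionContinueSearch, like 0xf418b8 here, just gets skipped over on the way to the next record.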
Okay, well, that wasn't quite what I was expecting (was expecting the immediate cause to be as idiotic and immediately obvious as the ultimate cause). Anyway, moving on up [the exception handler chain]... Breakpointing on the call to the handler in NTDLL reveals that the next handler is 0xf24515, which is... also not what we're looking for. Okay... fast forward a bit (9 exception handlers, to be precise)... and we come to 0xf24c64, which returns ExceptionContinueExecution; that's the one we're looking for. Let's have a look...
00f24c64 mov eax,[esp+0x4]
00f24c68 test dword ptr [eax+0x4],0x6
00f24c6f jne COMCAS_1+0x4cfe (00f24cfe)
00f24c75 cmp byte ptr [COMCAS_1!DllUnregisterServer+0xc76c8 (0108c028)],0x0
00f24c7c ja COMCAS_1+0x4c8d (00f24c8d)
00f24c7e lea eax,[esp+0x4]
00f24c82 push eax
00f24c83 call UnhandledExceptionFilter (00f21340)
00f24c88 cmp eax,0x0
00f24c8b jz COMCAS_1+0x4cfe (00f24cfe)
...do a bunch of stuff that checks for exception code 0x0eedfade, and results in the program being terminated with an error message...
00f24cfe xor eax,eax
00f24d00 ret
The branch at 00f24c8b gets taken, resulting in eax (the return value) getting set to 0 (ExceptionContinueExecution), causing Windows to attempt to continue execution. If I intercept that branch and cause it to not be taken, a message box that says "Runtime error 217 at 00F4182E" appears, and clicking on OK results in IE terminating relatively normally.
What this code is doing is checking what the default exception handler behavior is. UnhandledExceptionFilter returns EXCEPTION_EXECUTE_HANDLER if Windows has displayed an exception dialog. This is the default behavior most of the time; but when there is a just-in-time debugger installed (like on my computer), UnhandledExceptionFilter starts up the debugger and returns EXCEPTION_CONTINUE_SEARCH. In response to this, the exception handler is supposed to return ExceptionContinueSearch, so that the exception will be rethrown and caught by the debugger. But this function isn't returning ExceptionContinueSearch - it's returning ExceptionContinueExecution. Why? Well, let's take a look at what _except_handler returns:
typedef enum _EXCEPTION_DISPOSITION {
ExceptionContinueExecution = 0,
ExceptionContinueSearch = 1,
ExceptionNestedException = 2,
ExceptionCollidedUnwind = 3
} EXCEPTION_DISPOSITION;
And here's what UnhandledExceptionFilter returns:
#define EXCEPTION_EXECUTE_HANDLER 1
#define EXCEPTION_CONTINUE_SEARCH 0
#define EXCEPTION_CONTINUE_EXECUTION -1
It sure looks like the coder of this exception handler got the two mixed up, and returned ExceptionContinueExecution when they thought they were returning EXCEPTION_CONTINUE_SEARCH. However, that'd be a bit weird, as you normally do not code _except_handler functions at all - the compiler builds them for you, based on the Windows __try/__except/__finally syntax or C++ try/catch syntax.
Doing some looking at the DLL in a hex editor suggests that it was written in Delphi (either that or Borland C++ Builder). Looking at other exception handlers, I'm really unsure whether or not this exception handler is kosher. There are three other exception handlers that call UnhandledExceptionFilter. They all start out with the following basic logic:
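The mix-up is easier to see with the two sets of values side by side. The sketch below models only the one path seen in the trace, where the filter returns 0. The enum and function names are made up so it builds without windows.h, and `intended_disposition` is my guess at what was meant, not recovered code.

```c
#include <assert.h>

/* Values copied from the two listings above, renamed for the sketch. */
enum { DispContinueExecution = 0, DispContinueSearch = 1 };
enum { FilterExecuteHandler = 1, FilterContinueSearch = 0, FilterContinueExecution = -1 };

/* Models only the path seen in the trace: the filter's 0 is handed back
   unchanged as the handler's return value. */
int buggy_disposition(int filter_result)
{
    assert(filter_result == FilterContinueSearch);
    return filter_result;           /* 0 == DispContinueExecution */
}

/* What the handler presumably meant for that same path. */
int intended_disposition(int filter_result)
{
    assert(filter_result == FilterContinueSearch);
    return DispContinueSearch;      /* 1: let the next handler or the debugger see it */
}
```

Same number, two meanings: 0 says "keep searching" coming out of the filter, but "continue execution" coming out of the handler.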
.text:00404A3C mov eax, [esp+4]
.text:00404A40 test dword ptr [eax+4], 6
.text:00404A47 jnz loc_404AF8
.text:00404A4D cmp dword ptr [eax], 0EEDFADEh
.text:00404A53 cld
.text:00404A54 call @System@_16542 ; System::_16542
.text:00404A59 jz short loc_404A82
...one of the handlers has some extra logic here...
.text:00404A5B cmp byte_56C02C, 0
.text:00404A62 jbe short loc_404A82
.text:00404A64 cmp byte_56C028, 0
.text:00404A6B ja short loc_404A82
.text:00404A6D lea eax, [esp+4]
.text:00404A71 push eax
.text:00404A72 call UnhandledExceptionFilter
.text:00404A77 cmp eax, 0
.text:00404A7A jz short loc_404AF8
.text:00404A7C mov eax, [esp+4]
.text:00404A80 jmp short loc_404A94
...handle Delphi exceptions here...
.text:00404AF8 mov eax, 1
.text:00404AFD retn
Notice what it's doing. It first checks the exception flags. If either unwind flag is set, it returns ExceptionContinueSearch; otherwise, it prepares to execute the handler (that's what the CLD and call to System::_16542 are), tries to crack the exception (taking special care to look for the 0x0EEDFADE code), and calls UnhandledExceptionFilter only if it doesn't think it's one of the Delphi exceptions. Not only that, but they all return ExceptionContinueSearch as the default return value. Now compare all of that behavior with our misbehaving handler encountered earlier.
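Roughly, the shape shared by those three handlers looks like this in C. It's a reading of the listing rather than recovered source: `handler_shape`, `RunDelphiHandler`, and `filter_declines` are invented names, the filter is passed in as a function pointer so the sketch runs anywhere, and the two byte_56C0xx checks are left out (they're assumed to allow the filter call).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ContinueExecution = 0, ContinueSearch = 1,
       RunDelphiHandler = 100 /* hypothetical marker for the elided part */ };

#define UNWIND_FLAGS     0x6u          /* EXCEPTION_UNWINDING | EXCEPTION_EXIT_UNWIND */
#define DELPHI_EXCEPTION 0x0EEDFADEu

/* First two fields of EXCEPTION_RECORD, which is all the prologue reads. */
struct record { uint32_t code; uint32_t flags; };

typedef int (*filter_fn)(const struct record *);

int handler_shape(const struct record *rec, filter_fn filter)
{
    assert(rec != NULL && filter != NULL);
    if (rec->flags & UNWIND_FLAGS)
        return ContinueSearch;         /* jnz loc_404AF8 */
    if (rec->code != DELPHI_EXCEPTION) {
        /* byte_56C02C / byte_56C028 checks omitted in this sketch */
        if (filter(rec) == 0)
            return ContinueSearch;     /* jz loc_404AF8, then mov eax, 1 */
    }
    return RunDelphiHandler;           /* stands in for "handle Delphi exceptions here" */
}

/* Sample filter that behaves like the JIT-debugger case: it returns 0. */
int filter_declines(const struct record *rec) { (void)rec; return 0; }
```

The point to notice is that a filter result of 0 leads to a return value of 1 here, which is exactly where the misbehaving handler differs.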
So, in the end we end up not knowing much. Our misbehaving handler in some ways resembles the other handlers, but in other ways not. That's the evidence, but it's not enough to conclusively determine whether this was generated by the compiler or by a person. In either case, I'm fairly confident that it's wrong.
*ahem* Moving on... what do you suppose that exception was, and who threw it, anyway? Here's who (421829, to be precise):
.text:004217D1 mov ecx, eax
.text:004217D3 xor edx, edx
.text:004217D5 mov eax, ebx
.text:004217D7 call unknown_libname_85
.text:004217DC cmp dword ptr [ebx+4], 0
.text:004217E0 jge loc_421893
.text:004217E6 lea edx, [ebp+var_18]
.text:004217E9 mov eax, esi
.text:004217EB call @Sysutils@ExpandFileName$qqrx17System@AnsiString ; Sysutils::ExpandFileName(System::AnsiString)
.text:004217F0 mov eax, [ebp+var_18]
.text:004217F3 mov [ebp+var_14], eax
.text:004217F6 mov [ebp+var_10], 0Bh
.text:004217FA call GetLastError_0
.text:004217FF lea edx, [ebp+var_1C]
.text:00421802 call sub_40E28C
.text:00421807 mov eax, [ebp+var_1C]
.text:0042180A mov [ebp+var_C], eax
.text:0042180D mov [ebp+var_8], 0Bh
.text:00421811 lea eax, [ebp+var_14]
.text:00421814 push eax
.text:00421815 push 1
.text:00421817 mov ecx, off_56ED00
.text:0042181D mov dl, 1
.text:0042181F mov eax, ds:off_41B358
.text:00421824 call @Sysutils@Exception@$bctr$qqrp20System@TResStringRecpx14System@TVarRecxi ; Sysutils::Exception::Exception(System::TResStringRec *,System::TVarRec *,int)
.text:00421829 call sub_404B00
.text:0042182E jmp short loc_421893
The branch at 4217E0 seems not to be taken. It looks like unknown_libname_85 (among other things) stores its ecx parameter at [eax+4], which is the value checked by the compare at 4217DC. This value comes from just above the previous block:
.text:004217B2 push 0
.text:004217B4 push 80h
.text:004217B9 push 2
.text:004217BB push 0
.text:004217BD push 0
.text:004217BF push 0C0000000h
.text:004217C4 mov eax, esi
.text:004217C6 call @System@@LStrToPChar$qqrv ; System::__linkproc__ LStrToPChar(void)
.text:004217CB push eax
.text:004217CC call CreateFileA_0
So... we have an exception getting thrown when CreateFile fails, where CreateFile tries to open the file for both reading and writing. Are you pondering what I'm pondering? Let's take a peek at what file it's failing to open, and find out if I'm right. Step into ExpandFileName and... "C:\Program Files\COMCASTTOOLBAR\Cache\COMBOSEARCH.acs". The shock is enough to kill you - it's a directory that limited users can only read from, and I'm a limited user.
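For reference, here is how the pushed values line up with CreateFileA's parameters (the pushes run right to left, so the last push before the call is the file name). The constant values are the standard Win32 ones, restated under local names so the sketch builds without windows.h. The struct and helper names are made up.

```c
#include <stdint.h>

/* Win32 constant values, restated locally. */
#define ACCESS_GENERIC_READ   0x80000000u   /* GENERIC_READ */
#define ACCESS_GENERIC_WRITE  0x40000000u   /* GENERIC_WRITE */
#define DISP_CREATE_ALWAYS    2u            /* CREATE_ALWAYS */
#define ATTR_NORMAL           0x80u         /* FILE_ATTRIBUTE_NORMAL */

/* CreateFileA arguments after the file name, in declaration order. */
struct create_args {
    uint32_t desired_access;
    uint32_t share_mode;
    uint32_t security_attributes;   /* a pointer in the real call; 0 is NULL */
    uint32_t creation_disposition;
    uint32_t flags_and_attributes;
    uint32_t template_file;         /* a handle in the real call; 0 is NULL */
};

/* The values pushed at 004217B2 through 004217BF. */
const struct create_args observed = { 0xC0000000u, 0u, 0u, 2u, 0x80u, 0u };

int wants_write(const struct create_args *a)
{
    return (a->desired_access & ACCESS_GENERIC_WRITE) != 0;
}

int creates_or_truncates(const struct create_args *a)
{
    return a->creation_disposition == DISP_CREATE_ALWAYS;
}
```

So the call asks for read and write access and wants to create or overwrite the file, which is what a limited user can't do in that directory.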
So, in the end it turns out that we are indeed dealing with an idiot (or maybe more than one). Yeah, I know I never got around to the series on writing limited user-friendly programs (it's still on the todo list), but here's the gist of it: don't write outside the box, and Program Files sure isn't in your box (although I suppose that's not quite as bad as one guy at work I nailed for writing some temporary files in the Windows directory).
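As a minimal sketch of staying inside the box, the cache path could be built under a per-user directory instead. Everything here is an assumption for illustration: the function names, the choice of the APPDATA environment variable as the base, and reusing the COMCASTTOOLBAR\Cache layout. A real program would more likely ask the shell for the folder than read an environment variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Builds "<base>\COMCASTTOOLBAR\Cache\<file>" into out.
   Returns 0 on success, -1 if base is missing or the buffer is too small. */
int build_cache_path(char *out, size_t size, const char *base, const char *file)
{
    assert(out != NULL && size > 0 && file != NULL);
    if (base == NULL || base[0] == '\0')
        return -1;
    int n = snprintf(out, size, "%s\\COMCASTTOOLBAR\\Cache\\%s", base, file);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Per-user variant: the base comes from APPDATA, which the current user
   can write to (assumed to be set; fails cleanly otherwise). */
int user_cache_path(char *out, size_t size, const char *file)
{
    return build_cache_path(out, size, getenv("APPDATA"), file);
}
```

Calling user_cache_path(buf, sizeof buf, "COMBOSEARCH.acs") would then yield a path the limited user can actually open for writing, or -1 if no base directory is available.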
That leaves just one more question to answer (at least, one more to answer that we actually can answer): have they fixed this, yet? This Comcast toolbar was installed on this laptop when I got it from work. I have no idea how recent it may be. So let's see about downloading the latest version of the Comcast toolbar, and checking if it plays nicely, this time.
My version of the toolbar is 4.0.2.201. The most recent version is... 4.0.2.201. But just to make sure, let's run IE one last time, after installing the newest version. It looks like the toolbar failed to even start up, staying in some kind of "loading..." mode (though at least it didn't crash). I bet that's due to exactly the same problem, where it's trying to write to the Program Files folder the first time it's run after installation. Running it once as admin confirms this. And running it one more time as a limited user confirms that the obnoxious crash indeed remains in the latest version.
So, I give it a D+ for limited user compatibility.
Monday, July 10, 2006
1 comment:
Awesome write up of the problem. I can't follow the code that great but your text descriptions are just great. I have the same error involving combosearch.acs but do not have the comcast tool bar however I do think it is a toolbar issue. Thanks for pointing me in the right direction