Google
 

Saturday, September 19, 2009

The Taxonomy of Linking

There are generally at least two steps to building a program, regardless of language. Compiling takes a source code file (operating on one file at a time), parses it, and produces machine code (or some manner of other thing such as Java bytecode) in what's called an object file. Here, there's one object file produced from each source file.

Linking then occurs on the set of object files for a project, producing a single binary. 'Linking' refers to connecting the code between object files, so that functions in one file may access functions and data in other object files, which the compiler couldn't do because it works on one file at a time. The linker also adds the operating system-specific wrapper material needed to allow the OS to load and execute the binary.

That's the general idea, anyway. As with most things, reality is more complicated.

What was just described is known as static linking (or build-time linking). Static linking is when called code is included directly in the binary. It's loaded and unloaded with all the other code in that binary, and as such is always available. The code may either come from object files, as above, or from what are called static libraries - essentially archives containing multiple object files combined into one.

However, there are disadvantages to static linking. Because the code is tightly integrated into the binary, it's effectively impossible to update any functions without recompiling the entire binary. Furthermore, because the code is included in each binary, functions used by many binaries must be duplicated, wasting space.

The alternative to static linking is dynamic linking. In dynamic linking, common functions are built into stand-alone binaries called dynamic link libraries (DLLs). Functions in these DLLs, unlike in static libraries, are not integrated into other binaries by the linker, but instead are called directly from the other binaries. This requires the functions be exported - a process in which the linker writes information into the DLL about what functions the DLL contains and where those functions exist in the DLL; this data is then used by the operating system to allow other binaries to locate and call these exported functions (importing). As DLLs exist independently of binaries that call exported functions, only a single copy of exported functions need exist on disk or in memory, and DLLs may be updated independently of other binaries.

Dynamic linking may further be classified as load-time linking or run-time linking. In load-time linking the linker, when creating a binary that calls functions in DLLs, creates an import table, which lists all DLLs and functions therein that the binary requires. The operating system then, when loading the binary, automatically loads all of these DLLs and finds the addresses of all imported functions, writing the addresses to fixed places in the binary; the binary may then trivially call the functions through these pointers that are at known locations. This is entirely automatic; the DLLs are automatically loaded when the calling binary is, and unloaded when the calling binary is unloaded (assuming no other binaries still need them), ensuring they're always available when needed.

To accomplish this, the linker needs what are known as import libraries. These are generated by the linker when it builds the DLL, and contains a list of all functions exported by the DLL. The reason this is necessary is that DLLs themselves contain only a function name (or ordinal number) - only enough information to allow the operating system to locate the desired function; they may not include information such as what parameters a function takes or the calling convention to use to call the function, which are necessary to seamlessly link to the function. A consequence of this is that both the DLL name and all functions needed must be known when the calling binary is linked.

The other type of dynamic linking is run-time linking. Unlike load-time linking, in which the entire importing process is automatic, the run-time linking process is entirely manual. You must manually load the DLL when you need it, call OS functions to get pointers to the desired functions (which you must manually specify by name/ordinal), manually call the functions through the obtained pointers, making sure you use the right parameters and calling convention, and finally manually unloading the DLL when you no longer need it.

As this is much, much less convenient, you generally use load-time linking whenever possible, reserving run-time linking only for those times when you can't use load-time linking for one reason or another. Generally this is because you don't know the name of the DLL (or the function to import) at link-time. There may be multiple DLLs with different versions of a function (e.g. in Diablo II there are DLLs for the graphics system, with separate DLLs for DirectDraw, Direct3D, and Glide); the DLL may be loaded based on user action (e.g. loading plugins); the DLL may not exist until run-time (e.g. the calling binary must download the DLL first); etc.

(for more general information on DLLs, see The Old New Thing)

Sometimes, however, you get in the unenviable position of not being able to use either. This is the situation I found myself in when linking ThunderGraft to the FMOD Ex DLL. Because the DLL will be packed into a Self-Executing MPQ and extract as a temporary file with a random name, I couldn't use load-time linking (which requires a fixed DLL file name). However, the fact that the DLL was exporting full classes meant that I couldn't use run-time linking either, as the names were too complex to manually specify (with such memorable names as "?createSound@System@FMOD@@QAG?AW4FMOD_RESULT @@PBDIPAUFMOD_CREATESOUNDEXINFO@@PAPAVSound@2@@Z" [spaces inserted so the blog formatting doesn't get all screwed up by the huge string]). Static linking was also out, as the authors do not make a static library of FMOD available (which isn't too surprising as static libraries tend to be compiler- and version-specific).

Fortunately, there's one more option: delay-loading. Delay-loading is essentially a neat linker trick. It uses an import library for a DLL, but instead of generating import tables it generates code that loads the DLL and get the imported function addresses through the run-time linking functions; the DLL is loaded and the function addresses obtained when an imported function is first called (or you tell the linker code to do so at a time of your choosing). This allows the convenience of load-time linking with the flexibility of run-time linking, and allows you to get out of sticky situations like this one.

Of course, in this case it was all due to poor design in FMOD to begin with. It's a well-known principle that you should never export full classes; a much better way is to export pure virtual classes. This avoids the problem because functions in virtual classes are not exported individually, but rather accessed through the v-table; no exports to link to, no names to mangle, etc. It's to avoid this very problem that COM chose to use pure virtual classes as the basis of the COM object system, after all, and there really isn't any better alternative.

Tuesday, September 15, 2009

& Hobbit Birthday Parties

There probably aren't many that care that wouldn't see my post on Campaign Creations, but here it is just in case:
Well, it's that time of year again, and I've got some presents to give out. First, I released build 2009.09.13 of MPQDraft a couple days ago. While it only fixes bugs, it fixes some major ones - so major I'm amazed nobody told me about them in the last year and I had to find them myself.

The bugs fixed, in approximate order of severity:
-SEMPQs not activating for games specified with relative names
-plugin page not setting the plugin pointer after selecting with Browse, resulting in a crash
-plugin page stops listing plugins after one fails to load
-not saving custom executable names in the SEMPQ wizard
-plugins weren't forced to use proper plugin IDs for their modules, allowing some plugins to get away with bad behavior that didn't work with FireGraft (which forced plugins to use the correct ID)

As with the last year and a half, MPQDraft is open-source on SourceForge. You can get both the binaries and the source from the files page.

Second, ThunderGraft has been resurrected from the dead. For those who haven't heard about ThunderGraft before, it allows the use of MP3s, Ogg Vorbis, and other sound compression formats in Diablo, Diablo II, StarCraft, and WarCraft II: Battle.net Edition, which give higher audio quality and smaller file sizes than the WAV compression used in those games normally; ThunderGraft also integrates with StarEdit to allow maps to use non-WAV sounds in triggers. I'm calling it a beta version as ShadowFlare just yesterday mentioned a crash that only occurred on her computer, and I probably won't have time to investigate it until at least tomorrow.

And finally, as sort of "present #2.5", ThunderGraft has been open-sourced as well, and it's also on SourceForge. The binaries are available on the files page, but I haven't gotten around to putting a ZIP of the source up yet. Specifically, I really, really need to clean up the directory structure of MPQDraft and ThunderGraft to make it much easier for people other than me (without my directory organization) to compile it. You can still view the source through the source browser, or download it with Subversion or CVS, but you probably won't get it compiled without a bit more info from me; I'll get to that when I have time.

As always, there are two flavors: release and debug. I'd advise using debug for mod development, as it generates log files that are very helpful in fixing bugs, should you find any (but would probably annoy people who download your mod).

Enjoy!

Monday, August 31, 2009

& Very Old Things - Part 3

So. Thus far everything I did involved nothing more than the memory editor and cheat search. In other words, something just about anybody could do. To really say that I reverse-engineered Blaster Master, I needed something a lot bigger, and I knew just what to do: I was gonna find the cause of a 21-year-old bug. There's a well known glitch in Blaster Master that on some bosses, if you or the boss is being hit when you pause the game, you (or the boss) will continue to take damage while the game is paused. This only works on some bosses, however, leaving a big question mark as to whether it's a bug or a feature (it was especially strange that it affected both player and boss); I wanted to find out.



Unsurprisingly, it's quite a bit of work to reverse-engineer something you have absolutely no info on (e.g. in computer programs you at least know OS API calls the program makes, and can use that as a starting point), especially when you're learning both the hardware and the assembly language as you go. Ultimately, it took me an embarrassingly long time to get the job done (certainly more than it was worth), and I ended up disassembling a lot more of the game than was necessary (as is often the case when you don't have any idea what you are looking for). I'll just talk about some of the more important lines of investigation, findings, and complications.

As I didn't know anything about the game, naturally the first thing was to hit "step into" and see where I ended up. I always ended up in the same place: a busy loop that checks a single value. Given the nature of this and some idea about the hardware, I reasoned that this was a loop to wait for the next vertical sync, to start on the next frame; this was confirmed by observing that address being written to from the non-maskable interrupt (vsync) handler.

From there I stepped out of the function and took a look around, writing down the various functions called after vsync. I then went about refining the list and detailing more levels of the call tree, ultimately (after all of my disassembly) resulting in the call tree and other related notes below (note the comments have a focus on things relevant to finding this pause glitch). One thing of particular importance is that there's an object-oriented handler for each object type, and the appropriate handler is called for each non-empty object table entry.

$E936 waits for vertical blank to occur
$C971: for each object calls $C8DF, decreases hit timer if nonzero, calls $C9A4, and calls $C928
-$C8DF copies object data from $400+$56 to $46 ($C925 writes life)
-$C9A4 sets hit timer back to full and decreases life in the object buffer when hit with hit timer at 0 after decrement
--$EA3A
--$EB51 saves the object-specific handler to $7A (LE)
--jumps to the object-specific handler at $C9D3
-$C928 copies object data from $46 to $400+$56
$CA4B->$EA3A->$E61B->$E63C/$EB98 unmaps $8DF6 page [the page the object handlers are], which gets mapped by the NMI handler

$EB7E NMI handler (does not branch)
$EB97 IRQ/BRK handler (stub)
$F7: controller 1 state (bits in reverse order)
$F8: controller 2 state (bits in reverse order)


For a while I did some looking around with breakpoints on the location in the object buffer where the life is stored, looking for things that modify it. Some of the information gathered this way:

Copy of object struct for current object: 46
Buffer used to hold 16-bit pointers [LE] from various tables: 7A

$53 [life of the current object] written to directly by $8DF6 in $C9A4 when damage is taken
$8C6B-8C9F, $8DDF-8DFE (inside $8D76) only executed when hit timer is 0
During pause taking hit, $C1E6->$D7A0 returns A0/N+ to $8DD5 on boss 2, but 7F/N- to boss 5

$D7A0 seems to determine whether you get hit during pause. $7E indicates whether you're being hit or not: high bit for hit, lower bits for damage taken.
$7E is written to by $D71A in ($C141->$D711) when hit by boss, $D7B2 to clear hit per frame. $D71A is not reached when paused with boss 5 because $D6CD does not return
$D6CD returns if the current object is hitting the player, throws an "exception" returning to caller of caller if not hitting (returns from $D710)


Eventually I decided to take a more systematic look at the object handler for the catacombs (overhead view) player object:

$8C38->8D98->$8DBF catacombs player object handler paused
-$C15F
-$C0FF->$EF2B sets $3E and $3F to the X and Y coordinates (in pixels on screen) of the player for $D7A0
-$C1E6->$D7A0 returns A0/N+ to $8DD5 on boss 2, but 7F/N- to boss 5, when paused and being hit. A indicates the damage player should take. highest bit of A and the negative flag indicates whether player is hit or not.
-on N+, $C216 then handles hit (including decrease life)
-$93BD
-$EA3A
-$F029
-$E63C


From this we can see that $C1E6 is the switch we're looking for, determining whether the player takes damage; though a look through the function shows a clear lack of anything resembling hit detection - it simply reads from a struct in memory at $7C, generated who knows where. So, if that isn't determined in the player object handler, maybe it was handled in the objects you can be hit by. For that reason, and to satisfy my curiosity a bit about what makes bosses tick, I decided to take a look at a boss handler function; and as I hate the crab boss so much, naturally I went with that, first.



It was about this time that I discovered that NOPing function calls was a very effective way to quickly get an idea what a function is doing. This method works poorly on computers, because of the greater number of registers and drastically greater stack usage, making such NOPs result in crashes more often than not. On the NES (or at least in Blaster Master), however, the method works very well, with crashes very uncommon.

Crab (level 5) boss unpaused object handler ($A6D0):
-$A07B: (no discernible effect)
-$A6DE: handles movement of crab and state transitioning (but not animation)
-$A7A2: spawns bubbles when necessary
--$C1F5: spawns a bubble
-$C0FF: updates crab screen position based on crab 65 position (NOPout fixes crab position where it shouldn't be, despite crab 65 object movement). affects everything onward.
-$C153->$EB51
-for each body segment (right pincer, left pincer, back):
--$C141->$D711: detection of whether player has been hit. detection of player being hit by green sprite is elsewhere.
---$D6CD: checks for collision
--$C216->$DECC (only if hit)
-$C12C: (no discernible effect)
-$C189: draws (animates?) the green sprite
-$C144->$D697: sets player damage if player hits green sprite. also handles things damaging the boss.
--$D6CD: returns on collision (either player hit by something or boss hit by something). eats the stack frame if no collision (returns to caller of caller)
-$9EB3: animates the background layer


Okay, so $D6CD is another function that looks like a good candidate. So, let's take a look and see if it's what we're looking for or another tangentially related function (don't blame me for the formatting, Blogger doesn't like indentations).

$44 = A
$00 = $3E - ($40 / 2)
$01 = $3F - ($41 / 2)
X = 0xF
do
{
A = $7E[X]
if (A > 0)
{
$45 = A
A = $7C[X] - $00
if (A <= $40)
{
A = $7D[X] - $01
if (A <= $41)
return
}
}
X -= 3
} while (X >= 0)

throw


Now that looks like a collision detection function, more or less. We can see the X coordinate, Y coordinate, width, and height of whatever it is we're comparing against at $3E-$41, respectively (this is a case of a memory region being used like a stack frame). We can also see an array of 5 structs containing the X coordinate, Y coordinate, and a thingy, respectively, starting at $7C. Clearly only the size of the "current" object is considered, and the array of thingies are just points.

Well, let's think about it for a minute. We know that this function is called 4 times for the crab boss, for the 4 different parts of the body (and that one of those is not like the others). What might collide with the boss that would be of interest? Well, the player and the player's projectiles. A bit of playing around proved this to be true: the first entry in the array is the player, the rest are the player's projectiles (limited to 4, as you can see). That third byte in the struct array indicates the amount of damage the projectile does (or 7F for the player).

So, now we have everything we need to figure out how this works. The boss object checks for collisions with the player and the player's projectiles. If it hits the former, it marks the player for damage, which is then applied by the player object handler (presumably next frame, since the player handler always gets executed before enemy handlers); if it's the latter, the boss takes damage, though only if the part hit is the green sprite, which is the hit box.

Now that we know how the collision detection and damage system works, and why bosses that are susceptible to being hit while paused also themselves can hit you while paused (the collision detection for both is one in the same); that just leaves the question of why it only operates on some bosses. Well, it was about here that I discovered that that wasn't actually true. While only some bosses can be hit (and hit you) while paused, this is not the case for projectiles. Specifically, the projectiles fired by boss 3 and boss 8-1 can hit you while paused, even though the boss cannot be hit while paused.

Well, it turns out there are two sets of object-specific handlers, one for when the game is running, one for when it's paused. Now, bullets and other things are generally not prone to the problem of dealing damage while paused, as they disappear as soon as they hit something (e.g. the bubbles spewed by the crab boss). The things that can hit you while paused appear to be exactly those objects that have pause handlers and that don't disappear after hitting you; this is the case with all susceptible bosses, plus the projectiles of bosses 3 and 8-1.
Boss object handlers:
1 (63): base $A58A, paused stub
2 (5F): $9B64
3 (61): $A196, paused stub
4 (5D): $970C
5 (65): $A6CD, paused stub
6 (5F): $9B64
7 (5D): $970C
8-1 (67): $AA34, paused stub
8-2 (69): $AC84, paused stub


While I didn't go searching for every single object handler to compare the paused and non-paused versions, I did compile a list of the ones for the bosses. Note that there is only actually a single table of handlers in memory; the address in the table points to the paused handler, while the unpaused handler is always 3 bytes afterward (3 bytes is the size of a JMP instruction). For the bosses not prone to this glitch, you can see that I've indicated the pause handler is a stub that returns immediately. For the rest, the pause handler jumps midway into the unpaused handler (after things like movement).



To illustrate this, take a look at the handler for boss 2 ($9B67):
-$A07B: (no discernible effect)
-$C01E: handles movement of boss
-if time to spawn next projectile:
--$C1F5->$D851: fork projectile from boss, giving it a new proto-object type
--$C216
-paused handler jumps directly to here
-$C0FF
-$9D77: draw and do hit detection for arms
-$C090->$D770: do hit detection for back (not hit box)
--$D711: hit detection
-$C093: do hit detection for hit box
-$9EB3: moves and animates the background


So, that's how it is. Now we know why it happens and why it only affects some things. That just leaves one more question: is it a bug or a feature? Unfortunately, the evidence is much too ambiguous to answer that clearly. While it's within the bounds of imagination that damaging the boss during pause could have been intentionally put in as a cheat of sorts, it's very hard to imagine that the same thing against the player would be intended.

Yet it's also hard to imagine that it could be a bug; while the player being hit during pause could be eliminated by a single change to the player handler (a single mistake, in other words), doing the same for bosses would require that every boss's handler be changed. Furthermore, there's the fact that only 2 of the 5 unique boss object handlers (2 of them are used for 2 bosses each, bringing the total to 9 bosses) have pause handlers, in contrast to the player and most projectiles, and without pause handlers the bosses don't even get drawn completely (and what point is there in having a pause that lets you see the boss battle if you can't see the boss?). It seems as though a lot of stuff was left out or in arbitrarily; I have to wonder if there is a deadline lurking in here, somewhere.



Various resources used during reverse-engineering:
Nintendo Entertainment System Documentation
6502 Instruction Summary
How NES Graphics Work
Comprehensive NES Mapper Document
For more info, see this big list of documents

Saturday, August 29, 2009

& Very Old Things - Part 2

As for the hacking itself, this started out with simple stuff. Search memory, find special values (life, lives, gun power, boss items, etc.), and overwrite them. This was quite easy with an emulator that has a cheat search feature; you simply have it save the memory contents at the initial state, the repeatedly perform searches based on specified criteria until you've narrowed it down to 1 or a small number of possibilities. For example, finding player health was a trivial matter of searching for something that decreases each time you take damage and stays the same in between damage.



With the exception of the boss items, all of the others in this list were similarly easy to find. Boss items, however, were a bit more complicated. Because you only get the items after beating a boss, naturally this is a situation where there are likely to be a lot of changes all over RAM, as it's a major transition in the gameplay; this means that you can't narrow it down as easily. I had to try several locations, starting with the ones whose values made them the most probable to be correct (specifically ones that had a bit flipped on when the item was picked up).

However, there was one other complication figuring that one out: boss items are represented by two separate bit masks that are partially redundant. I first discovered the mask at 3FC, because that's the one that changes the bullet your tank fires after you get the first boss item; as it was a visual change, it was very easy to see that it was working. Once I found that, I checked to see if it was a bit flag, and confirmed it was, finding the flags for the second and third boss items by experimentation.

That's where things got weird. While this mask contained flags for boss items, the flags for the other boss items didn't seem to actually do anything. As well, while having the hover bit flag set caused hover power to be displayed (it's hidden prior to getting the third boss item), you couldn't actual hover. The bits here also did not make the upgrades show up on the pause screen. So, I went on, with my infinite life and gun power, and killed a few more bosses (somehow I still remember most of the maps of this game).

I found that there was a second bit mask with a completely different set of flags, one for each of the 7 items in the game. For the first 2 items, these flags did nothing more than make the item show up in the pause screen. For the third item (hover) this flag allowed you to hover, but did not show the hover power bar or allow you to accumulate hover power (so you still couldn't actually fly with this flag alone). I still can't imagine why it was done this way, but at least I figured out all the flags.

One thing that I couldn't figure out with cheat searches, however, was boss life. That is, I found it - repeatedly: it was in a different place every time (including even for the same boss). I correctly reasoned that this implied that the boss's life was not a special value, but that bosses were simply common entries in the list of objects in the game. As it turns out, so is the player; you just don't notice that because the player is always at index 0 in the object array and thus always at address 0x400, while the boss was stuck in whatever slot was available at the time. This led directly to the discovery of the object array itself. From there, I started looking at the struct that represented each object, beginning with offset 0xD: life.

As I've already pointed out that you tend to have to get clever when coding on a system with so little RAM, it shouldn't be surprising that maps are stored in a similarly clever way. As far as I can tell, there is only a single map for each area in the game (8 areas), which is 128x256 or less. However, the map is diced up and jam-packed in there such that they need to do some camera tricks when transitioning between areas (teleporting through some doors) to make sure you never see other sections that are contiguous in memory.

While I was working on it, I figured I should try to answer some long-standing questions about Blaster Master. To begin with, why does the power of your gun (what it actually shoots) not always match the gun power that's displayed? For example, if you have full gun (8 bars) and get hit, you take 1 bar of damage, but the effect of your gun falls beneath what 7 bars should give you. Many years ago I hypothesized that there were two variables for gun power; one was the one displayed, and one that was the actual effect: getting hit did a bit more damage to effect than the display.

It turned out to be simpler than that. It's a consequence of the fact that while most hits (I can't think of any counterexamples) do an integral number of bars of damage, life, gun, and hover are stored from 0-FF, in bytes. See where this is going? One bar = 20, but 8 bars = FF, not 100. In other words, the number of bars displayed is (x + 1) / 20. When you get full gun and then take a hit, your gun drops to DF; while this still displays as 7 bars, it doesn't give you the maximum gun effect (which requires E0). It's not clear whether this is a bug or a feature, though I'm kind of leaning toward a bug (specifically, an edge case the coders didn't think of, which could be easily solved).

All numbers are in hex.

Continues: 37E
Lives: DD
Gun: C3
Hover: 92
Bosses killed: 3FB. bit x = boss x+1
Boss items 1: 3FC. same values as 3FB.
Boss items 2: 99. bit 0: hover, 1: dive, 2: wall 1, 3: wall 2, 4: crusher, 5: key, 6: hyper
Homing Missiles: 6F0
Lightning: 6F1
Thunderhead Missiles: 6F2
Pause: 15

Gun Levels:
20: 2 long shots
40: 3 long shots
80: circling shots
C0: white wave shots
E0: full wave shots


Beginning of Object Table: 400. 12 slots. Slot 0 is reserved for the player. Slots 1-4 are reserved for projectiles. Enemies fill slots 5-11 as needed.

Object Struct (14 bytes)
byte @0: Object type. A few of the known object types:
3: Normal
4: Dying
5: On left wall
6: On right wall
7: On ceiling
8: Diving
9: Climbing H->V convex
A: Climbing V->H convex
B: Climbing horizontal->vertical concave
C: Climbing V->H concave
D/E/F 10/11/12: Entering/transitioning/leaving door right/left
1B: Outside tank
1D: Entering tank
84/85/86: Entering/transitioning/leaving door catacombs

byte @1: Orientation. The exact meaning of this seems to vary by object type.
Tank mode: Between 0 for left and F for right (the intermediates indicate that the tank is changing face), with the high bit set if the tank is pointing to the left
Catacombs mode: 0 for up, 1 for right, 2 for down, etc.
uint16 [LE] @2: Horizontal position. The high byte indicates the attribute-size tile, while the bottom byte is the position within that tile. The highest bit indicates that you're currently moving through a teleporting door.
uint16 [LE] @4: Vertical position
byte @6: X velocity
byte @7: Y velocity
byte @8: Attribute-size tile index on screen? This was always observed to be (Y * 0x11) + X where X and Y are relative to the tiles visible on screen.
byte @9: Hit timer: the number of ticks until the object can be hit again. Controls invulnerability, damage flashing, and also decreases gun by 1 for each tick above 1 (this causes gun power to drop smoothly rather than suddenly, though I'm not sure why they did it this way; life does not do this when damage is taken).
byte @A: Current AI state
byte @B: Frames till AI state change
byte @C: ?? (no observed effect)
byte @D: Life. For trivia value, normal tank shot does 4, hyper shot does 6, crusher shot does 8, normal catacomb shot does 1, catacomb shot with wave gun does 2, and grenades do 2.


Note that the AI state bytes in the struct are for simple enemies. Bosses have other state stored elsewhere. For example, in the above crab boss, the AI state and timer control movement but not spraying bubbles, which occurs entirely independently. The density of bubble spray, however, was directly controlled by the life of the boss; e.g. below 0x20 life (it starts with 50) it does the full sawtooth volley with dozens of bubbles.

Wednesday, August 26, 2009

The Story of ThunderGraft

For those who aren't familiar with it, ThunderGraft was possibly the coolest of the three modding tools I released (the other two being MoPaQ 2000 and MPQDraft). Diablo, Diablo II, Starcraft, and Warcraft II: BNE all use PCM WAV audio (22khz 16-bit if I recall correctly) for their sound effects and music. Diablo used raw PCM WAV audio, while Starcraft introduced (and the other two used) a special "WAV compression", which compressed the audio in lossy ADPCM form, resulting in a compression ratio between 3:1 and 4:1 (though at the cost of audio quality); WAV compression, however, was implemented transparently in the MPQ API - the games saw what they read and wrote to the MPQs as simply PCM, and that's all the audio streaming API was capable of playing (in other words, the audio streaming functions were identical in all four games).

ThunderGraft was a utility that added the ability for all four of these games to play MP3s and Ogg Vorbis, in addition to standard PCM WAVs; even better, it did this in a version-independent way* (the very same ThunderGraft binary worked for all versions of all of those games). Naturally this was huge, especially in the days of dial-up (which was when the modding community first became big). You could use modern audio compression formats that yielded significantly smaller file sizes and higher audio quality compared to WAV compression. Yet ThunderGraft is all but dead now. Why is that?

MPQDraft and ThunderGraft have a few things in common. They're both very simple, clean, version/program independent, and highly effective. The reason is also the same: they both rely on very clean exploits of design to do their thing, without getting into messy exploits of implementation that depend on the precise binary patched. MPQDraft exploited the priority system in the MPQ API; ThunderGraft exploited the fact that the audio streaming API was encapsulated in functions supplied in Storm.dll.

The fact that the functions were entirely contained in Storm meant that I could cleanly capture calls to them and redirect them. For cleanliness, I opted to simply replace the streaming API in its entirety. Anything less would have been version-dependent, as it would have required all sorts of messy code modifications inside Storm's internal functions.

This meant that I needed my own decoding and streaming code to replace Storm's. At the time I just coincidentally happened to have such a thing handy. Specifically, a friend of mine by the moniker Dark_Brood was writing a game engine called Aegis, and happened to have audio decoding/streaming code handy for me to plug into ThunderGraft. He sent me a static library of the code, and in it went. Easy.

Where things took a turn for the worst was when he wrote the next iteration of his game engine. This involved rewriting a whole bunch of stuff, and integrated the various systems in the engine much more tightly (e.g. added garbage collection and other global things). This meant that it was no longer possible to simply extract the audio portion of the engine.

While in theory I could have just continued to use the old version, there was a big problem: I never had the code for the original version, nor did he save a copy after rewriting the code. This is a big problem because static libraries are compiler- and version-dependent. Now the only way I could even compile ThunderGraft was on the very same version the original was compiled on: Visual C++ 6, which is some 11 years old, now, and I haven't even had it installed for many years.

Thus, the only way to resurrect ThunderGraft would be to replace the decoding and streaming system entirely, and thus far I simply haven't managed to muster the effort. After open-sourcing MPQDraft I wanted to do the same with ThunderGraft, but was unable to readily do so for the same reason.

*There is one thing that's version-dependent in ThunderGraft, as there's simply no theoretical way to do it in a version-independent way: importing of non-WAV audio files into maps with StarEdit (the Starcraft map editor). Special support for this was required for several reasons: 1. StarEdit verifies that things imported are WAVs and refuses anything else, 2. it needed to know how long the audio file was in order for triggers to work right. The audio decoding and streaming in ThunderGraft, in contrast, was truly game- and version-independent.

Sunday, August 23, 2009

& Very Old Things - Part 1


Bring back any painful memories?

So, a couple days I had one of those evil, perverted urges I'm so notorious for. Specifically, I felt like doing a bit of hacking of Blaster Master. For those not familiar with it, it was an NES game published 21 years ago. A good one, in fact. But before I talk a bit about anything I did resembling reverse-engineering, let me explain the NES hardware to the best of my knowledge (which consists of what I saw in my hacking, and a bit of stuff I looked up online).

The main CPU of the NES is a modified 6502, the CPU used in various Commodore and Apple computers (not that I ever used one of those). The 6502 is a primitive CISC CPU, and in fact somewhat resembles the x86 (though more primitive). It has 3 8-bit registers: a numeric accumulator (A) and 2 index registers (X and Y). Operations generally operate on the value of A and a memory location or literal, and write the result back to A.

It has a stack, though from what I saw in Blaster Master it seems to be relatively rarely used, used mainly to save registers (though even that was pretty rare). Specifically, I didn't see much in the way of stack frames holding local variables. Functions that require local variables tended to have (dedicated) memory ranges allocated to them to be used as private buffers. For example, one address I initially thought was the boss life location actually turned out to be in a buffer region used by the function while updating objects in the object list. This kind of surprised me, as the NES has only 2 KB of RAM; I imagine this must be forced by the architecture, e.g. a lack of instructions to access the stack at a specific offset.

As for the graphics system. If you ever messed around with text mode and text graphics in DOS, the NES graphics system is similar. At the lowest level, graphics are stored as blocks called patterns. Each pattern is 8x8 pixels, 2 bits each (total of 4 colors). The NES has enough video memory for 512 patterns at a time, stored in 2 pattern tables. What exactly it does with these patterns depends. The NES has 3 graphics layers: 1 for background and 2 for sprites (1 in front of and 1 behind the background layer).

deconstructulator provides an excellent illustration for most of the things discussed here. It shows the pattern tables (note that the top half is the sprite pattern table and the bottom half is the background pattern table) and sprites along with a playable version of Super Mario Bros.; although it does not show the name/attribute tables. Refer to that from here onward.

The background layer (illustration below) is very much like the screen buffer in DOS text mode; it's a buffer of screens composed of 32x30 patterns each (indices into the pattern table), called name tables (lost in translation, perhaps?); all patterns in the background must come from the same pattern table. The NES supports up to 4 screen buffers in this way, though it only contains enough memory for 2 of them (if the cartridge doesn't supply additional memory, the game can simply set the NES to mirror the 2 screens).

However, so far each pixel only has 2 bits of color (the bits in the pattern table). An additional table, called the attribute table, supplies an additional 2 bits to the name tables, increasing the total to 4 bits, or 16 colors (though each tile on screen still has only 4 colors within it). However, the attribute table does not contain a value for every tile. Rather, it only contains 1 two-bit entry for each block of 2x2 entries in the name table (so 16x16 blocks composed of 4 patterns). This is the reason 16x16 blocks appear everywhere in NES games - that's the smallest block that can be fully independently controlled.

The background position is specified independently of the sprites. A horizontal and vertical scroll position is specified by pixel for the background, which indicates where the graphics chip should start drawing the background on screen. This is how smooth scrolling works, although implementing this can be quite a trick, as you have to essentially use the name tables as a ring buffer (even more tricky if you support scrolling in both dimensions).




A shot of the side-scrolling portion of Blaster Master (top) and the corresponding, merged name/attribute tables (bottom). The white lines indicate where the top left of the screen currently is (where the two lines cross). The visual discontinuities are caused by the way the game streams the background into the name/attribute tables on demand, filling in only what's on screen in the frame; they're stale data. The color glitches result when the current color in the attribute table for an entry does not match some of the stale patterns still in that name table for that entry (e.g. on the far left the second column of patterns is discolored because they are stale and the attribute entry now corresponds to the first column of patterns, which is on screen).

The NES supports 64 sprites, each either 1 or 2 patterns in size (8x8 and 8x16, respectively), depending on whether a global flag is set. Analogous to the attribute table, each sprite contains the 2 high bits of color (though still that allows only 4 colors per sprite). Other info each sprite possesses includes position (obviously), mirroring parameters, and whether the sprite is in the foreground or background. However, even if you can have 64 sprites, the NES only supports 8 sprites per scan line; if there is more than that, only the 8 with the highest priority are drawn (smart games can cycle through their sprites each frame, but that results in flicker).

There are, however, some tricks that can be done to circumvent some of the innate limitations (e.g. 16 colors, 16x16 blocks, 64 sprites, etc.). These often involve clever but perverse tricks related to swapping out the various things such as scroll offset or sprite data while drawing is underway. One common use of this is to draw part of the screen (the gameplay) with scrolling, while a portion of the screen on top or bottom is fixed (the score area). This can also be used to draw more than 64 sprites per frame, though there's still a limitation of only 8 sprites per scan line.



Blaster Master and others frequently use the background as a grotesque way to draw very large sprites. In the above screen shot, the boss crab is actually the background, while the bubbles, player, and energy bars are sprites (the green areas of the boss are actually sprites, as well). The crab has a couple frames of animation; each of these is on a separate screen in the name table, and every frame the scrolling position is modified. This results in the crab moving around the screen and animating, as if it were a huge sprite.

Oh, and in case you were wondering, yes, this boss frequently exceeds the 8 sprites per scan line limit (that would be 4 bubbles), resulting in the game slowing down to draw them all.

Thursday, August 06, 2009

& Summer 2009 Anime Music

As with last season, I was again inspired to post theme songs that I like of current anime series. As with last season, one show's music in particular provided this inspiration - Umineko no Naku Koro ni [When the Seagulls Cry].

Umineko no Naku Koro ni really walked away with the title of best theme music of the season, between the music not being bad, and the lack of any high-caliber competition. The opening especially appeals to me because one of the musical themes I go for is modern compositions/melodies in symphonic/operatic style. The ending I suppose would be classified as fun, in rock opera style with a singer that has the perfect voice to perform the the creepy insane rambling lyrics.

Opening theme: Katayoku no Tori [One-Winged Bird]


Ending theme: La Divina Tragedia


Perhaps given the lack of serious competition I may have lowered the threshold for inclusion in this post, I dunno. Anyway, next up is Kara no Kyoukai [Boundary of Emptiness]: Garden of Sinners: Oblivion Recorder. Kara no Kyoukai is a series of movies that form a single plot line. While some of the themes are shared between all of the movies, some are unique to the movie (in this case the one that just came out a couple weeks ago).

As you can see, the video was replaced with the logo of the movie. Note that I set it to start at 0:42, which is where the theme for this movie starts. The music before 0:42 and after 4:18 is from the main theme of Kara no Kyoukai, which I think it pretty bland.

Ending theme: fairy tale


The rest all fall under the category of fun/cute. Bakemonogatari [translated by some as GhoStory, which preserves the pun] has a pretty fun ending theme with a bit of cuteness. The video of the credits seems to have been replaced with a still of the picture that's in the background during the credits.

Ending theme: Kimi no Shiranai Monogatari [Your Strange Story]


Yoku Wakaru Gendai Mahou [Comprehensible Modern Magic] has a somewhat peculiar ending theme. Like the animation during the credits (which you just know is going to get made into animated GIFs), the music is overly cute, yet strangely catchy (although I can imagine it would make some people want to kill things).

Ending theme: Made in Wonder


Finally, Hayate no Gotoku!! [Like the Wind; this is a pun on the main character's name being Hayate]; first of all, note that the sound on this video is LOUD, so you might want to turn your volume down. This one really crosses the line from cute into obnoxious. While the melody is nice and fun, the "cute" accompaniment makes me want to smash things.

I can only figure that the producer of this series is an absolute copyright Nazi, because despite quite a bit of looking I can't find a single video of this ending theme with the credits animation, which is unfortunate, because that would show you who the infernal peanut gallery is. Instead, I found this video, and I really don't know where it came from. It's called a preview, but as I only saw one copy of it, I'm thinking maybe it's a fan-created anime music video. In either case, it seemed better than the large number of others, which simply played the music to some picture (the album cover, a picture of the character singing, etc.).

Ending theme: Mankai Watashi Iro [something along the lines of My Full Bloom Form]