Well, after messing with GNUPlot for most of the day, I believe I've obtained one of the results I sought: the regression equations for the different allocators. I cut off data points above 100,000 cycles after looking at a couple of plots; nearly all of the points were below 100,000 cycles, but some were significantly above that, and I wanted to minimize skew from outliers. Nevertheless, there was a very large amount of variance in the data set (especially in the case of SMem), so you can expect a fair margin of imprecision. Also, I used the non-linear least-squares algorithm supplied in GNUPlot, which is somewhat susceptible to outliers; at some point I may try recomputing the regression values using a robust regression method (something less susceptible to outliers).
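For anyone who wants to try the same kind of fit outside GNUPlot, here is a minimal sketch in Python. It is not the algorithm GNUPlot uses (its fit command is a Marquardt-Levenberg solver); instead it scans candidate exponents and solves the remaining two parameters by ordinary linear least squares, which is enough for a three-parameter model of the form m*(N^xp) + b. The function name, the exponent grid, and the sample data are all made up for illustration; only the 100,000-cycle cutoff comes from the description above.

```python
def fit_power_law(points, cutoff=100_000, xp_grid=None):
    """Fit cycles ~ m * N**xp + b to (size, cycles) pairs.

    For each candidate exponent xp the model is linear in m and b,
    so those two are solved in closed form; the xp with the smallest
    sum of squared errors wins. Returns (m, xp, b, points_kept).
    """
    # Drop samples above the cycle cutoff before fitting.
    kept = [(n, c) for n, c in points if c <= cutoff]
    if xp_grid is None:
        xp_grid = [i / 100 for i in range(5, 201)]  # 0.05 .. 2.00 (assumed range)

    y = [c for _, c in kept]
    mean_y = sum(y) / len(y)
    best = None
    for xp in xp_grid:
        u = [n ** xp for n, _ in kept]
        mean_u = sum(u) / len(u)
        var_u = sum((ui - mean_u) ** 2 for ui in u)
        if var_u == 0:
            continue  # all sizes identical, slope is undefined
        m = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y)) / var_u
        b = mean_y - m * mean_u
        sse = sum((m * ui + b - yi) ** 2 for ui, yi in zip(u, y))
        if best is None or sse < best[0]:
            best = (sse, m, xp, b)

    _, m, xp, b = best
    return m, xp, b, len(kept)


# Synthetic samples (hypothetical): cycles = 40 * sqrt(N) + 300, plus one wild outlier.
sizes = [16, 64, 256, 1024, 4096, 16384, 65536]
samples = [(n, 40 * n ** 0.5 + 300) for n in sizes]
samples.append((1024, 5_000_000))  # removed by the cutoff

m, xp, b, kept = fit_power_law(samples)
print(f"m={m:.3f} xp={xp:.2f} b={b:.3f} kept={kept}")
```

Because the fit still minimizes squared error, any outlier that survives the cutoff pulls on it just as it does in GNUPlot; the cutoff is doing all of the outlier handling here.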
I suppose I should be getting used to the results of various experiments in this project surprising me, but I'm not; and this set of data came as one of the biggest surprises, given all the previously collected data. All regressions used the general form m*(N^xp) + b, where N is the block size and m, xp, and b were solved for by the regression algorithm. I plotted the regression lines on log-log axes to keep the differences visible over the widest range of points; you should not mistake these for linear graphs (for example, every single regression line was O(N^x), where x was less than 1, usually around 0.5; the log-log graphs, however, seem to suggest x is always greater than 1).
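One possible reason the log-log graphs look steeper than the exponents say is the constant term b. A pure power law m*(N^xp) is a straight line of slope xp on log-log axes, but adding b bends the curve: the apparent slope at a given N works out to m*xp*N^xp / (m*N^xp + b). The sketch below evaluates that expression in Python; the coefficients are invented for illustration and are not the fitted values.

```python
def model(n, m, xp, b):
    """Predicted cycles for block size n."""
    return m * n ** xp + b


def loglog_slope(n, m, xp, b):
    """Slope of the curve as drawn on log-log axes: d ln(cycles) / d ln(N)."""
    return m * xp * n ** xp / model(n, m, xp, b)


# Hypothetical coefficients, all with a true exponent of 0.5.
for label, b in (("b = 0", 0.0), ("b = +300", 300.0), ("b = -100", -100.0)):
    slopes = [loglog_slope(n, 40.0, 0.5, b) for n in (16, 1024, 1_000_000)]
    print(label, [round(s, 3) for s in slopes])
```

With a positive b the curve starts out nearly flat and bends upward toward slope xp, which reads as accelerating growth; with a negative b the apparent slope near the small end really is greater than 1, even though the exponent is 0.5. In both cases the slope settles toward xp as N grows.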
Most notably, SMem (Storm) takes a substantial lead at larger block sizes for allocations and frees (above 220 KB and 371 bytes, respectively); it's also not significantly worse than the other heaps at reallocations above about 5 KB. And no, the Windows heap doesn't really get down to negative cycles for allocations approaching 0 bytes; that's just an artifact of the regression calculation.
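Break-even sizes like these can be read off the plots, or computed from two fitted curves by looking for the N where they meet. Here is a minimal Python sketch using bisection, assuming both fits have the m*(N^xp) + b form and that the curves cross exactly once inside the search range. The coefficients below are made up, not the actual regression results. The second helper shows where a fit with a negative b dips below zero cycles, which is the kind of artifact mentioned above.

```python
def cycles(n, m, xp, b):
    """Predicted cycles for block size n under one fitted curve."""
    return m * n ** xp + b


def crossover(fit_a, fit_b, lo, hi, tol=1e-6):
    """Bisect for the size in [lo, hi] where two fitted curves are equal.

    Each fit is an (m, xp, b) tuple. Returns None if the curves do not
    change order between lo and hi.
    """
    def diff(n):
        return cycles(n, *fit_a) - cycles(n, *fit_b)

    if diff(lo) * diff(hi) > 0:
        return None  # no sign change, so nothing is bracketed
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if diff(lo) * diff(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def zero_crossing(m, xp, b):
    """Size below which a fit with negative b predicts negative cycles."""
    return (-b / m) ** (1 / xp)


# Hypothetical fits: A has high fixed cost but grows slowly, B is the reverse.
fit_a = (10.0, 0.5, 1000.0)
fit_b = (40.0, 0.5, 100.0)
print("A beats B above roughly", round(crossover(fit_a, fit_b, 1, 1_000_000)), "bytes")
print("Negative predictions below", zero_crossing(40.0, 0.5, -100.0), "bytes")
```

Given how much variance is in the underlying data, a crossover computed this way is only as trustworthy as the fits themselves, so it is best treated as a rough estimate.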
So, why did Storm perform so poorly in the bar charts? Well, that's relatively easy to explain: it's probably because most of the allocations were smaller than the sizes where Storm does best. This does make me think that I should do some real-world tests, though, like replacing all of SMem in Warcraft III with the other allocators and seeing how long things like starting up, loading a map, and completing a map take (in other words, looking at clusters of operations that naturally occur together, which my benchmark is unable to take into consideration).