One of the things that has always been nagging at me since starting development on Vespers is game performance. We haven’t really been developing with frame rate in mind, our thought being that we would leave optimization until we had most of the content plugged in. Most of that optimization would come from the graphics end — LOD, portals and zones, textures, things like that — but that’s a lot of work for the artist to do, and it’s not terribly exciting work at that.
Still, after trying out the game on a number of different systems, I was not very happy with performance even at this unoptimized stage. Frame rates on the better systems would rarely get to 30fps, even at lower screen resolutions. And in far too many areas, rates were commonly in the teens. In some places with a lot of objects in the field of view, rates would bottom out at 10 or less. Rates in the teens give a pretty choppy performance; rates around 10 are just unacceptable. And on one system, with a lower-end graphics card, the game was completely unplayable, with rates unable to get above 5.
So recently we started addressing optimization, first by adding some LOD to objects and buildings, and then tackling portals. Unfortunately, portals turned out to be an unwieldy beast that we just could not master. Setting up portals and getting them to work in the Torque Engine can be very tricky, especially for complex models like our monastery buildings — any tiny little misalignment, visible or not, and you’re out of luck. At this point, finding those misalignments would be like trying to find an unknown number of needles in a series of very large haystacks.
So I turned my attention elsewhere, and implemented code to replicate (for the most part) the function of portals and zones. It works well, producing nice frame rate boosts on most systems. Whereas before I was sludging along with rates in the teens and twenties, now I’m getting rates in the thirties and forties, and often more than that. I’m very happy with that.
But there was still one thing bugging me: these performance improvements were seen on my desktop (a PowerPC Mac G5) and on my laptop (an Intel Mac), but only on the laptop when running under Windows XP. If I ran the game on the laptop under Mac OS X, I was still getting the same crappy frame rates as before. How could that be? Same laptop, same hardware, but under one operating system it runs fine, while under the other it runs terribly.
The desktop also runs Mac OS X, and both the desktop and the laptop have graphics cards with 256MB of VRAM. So I didn’t think it could be the operating system itself, or a lack of VRAM on the laptop. I thought it had something to do with the Torque Game Engine code, something specific to the graphics rendering on different platforms and CPUs, probably related to the rendering of buildings like the monastery. I went through round after round of testing, using various tools to analyze the code to see where the slowdowns occurred. Long, boring, frustrating work.
The only conclusion I came to was that the laptop was probably tricking the Mac OS into thinking the video card had run out of VRAM, even though it probably hadn’t. And this was likely related to Apple’s ATI graphics driver on the laptop, which is different from the ATI driver I use on the desktop (and different, of course, from the one used by XP on the laptop). A driver issue like that is a bummer, since there’s not a lot I can do about it. I was not very happy with that.
The only other thing I thought of trying was to mess with some of the global preference variables in the Torque Engine, to see if I could somehow improve performance on the laptop without sacrificing too much on the graphics end. There are a huge number of preference variables available: stuff related to interiors, lighting and shadows, terrain rendering, and OpenGL performance. The last of these struck me as potentially useful, as it would probably be related to the driver. So I started looking at those more carefully, and I saw this one:
$pref::openGL::allowCompression = 0
That seemed interesting. Allow compression…likely related to texture compression. Textures can take up a lot of VRAM, especially if you use lots of them, and lots of large ones — like my artist enjoys doing. It makes for prettier graphics, but it rapidly eats up memory on the graphics card. And even though I didn’t seem to be running out of VRAM, on the laptop it seemed like the system was being tricked into thinking so, which would certainly cause these types of slowdowns. But what could this one little boolean variable do? I set it to 1, and gave it a try.
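For anyone who wants to try the same thing, here is a minimal sketch of the change in TorqueScript. The variable name is the one shown above; where the line belongs is an assumption, since it depends on how a given project loads its preferences (many Torque projects keep them in a client prefs.cs file), and the setting may not take effect until textures are reloaded or the game is restarted.

```
// Presumably lets the OpenGL layer use compressed textures (was 0).
// Assumption: this runs before textures are loaded, for example from
// the client preferences script; the exact file depends on the project.
$pref::openGL::allowCompression = 1;

// Print the current value to the console to confirm the setting.
echo("allowCompression = " @ $pref::openGL::allowCompression);
```

Comparing frame rates in the same scene with the value at 0 and at 1 is the simplest way to see whether a given system benefits.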
Amazingly, frame rates in general tripled, and often more than that. Whereas before I was getting annoying, choppy gameplay with rates in the low teens, now I was seeing smoothness with rates in the thirties and forties, sometimes more. And, as far as I can tell, no discernible change in the appearance of the graphics.
Even better, the game now runs on the older system with the lower-end card — and runs well, with rates in the twenties and thirties, even at larger screen resolutions. I never thought I’d see it running on that machine at all, much less running well.
It’s not a solution to the overall problem, which is likely related to Apple’s ATI drivers. But still, it’s an amazing improvement that appears to come at little to no cost.
And all due to one little boolean. A completely undocumented boolean, I might add.
Excellent post, Rubes.
There isn’t some minimum number of words I need to leave in comments, is there? 😉
You aren’t suggesting I’m far too verbose, are you? 😉
😀 – no, not at all. I’m never too sure how people feel about ‘good job!’ comments though.
I don’t know about other people, but I’m all in favor of them!
can’t wait to see a tech demo of this excellent IF!
Cool, thanks. I’m actually putting together a tech demo video as we speak.