I'm going to try and start posting WIP news in version release threads, instead of the never-ending "bsnes thread".
That said, new WIP posted.
This version adds the offset-per-tile map caching, which gives a ~25% speedup on games that use the effect (CT title, CT Black Omen, Contra 3 turtle boss, SMAS: SM2 title, etc.), but shouldn't affect anything else.
Testers, please play as many games as you can with offset-per-tile effects and look for errors in them. I'm fairly confident the caching works, but regression testing would be good. Don't want another major release with a serious flaw.
I've also added a new compile-time directive, USE_STATE_MACHINE. When defined (it is by default), libco is bypassed for SMP<>DSP sync. It saves 2.048 million co_switch calls a second, pushing the framerate up by ~5% (it will be more significant with slower libco versions). That's actually surprisingly small. Scaling that linearly, roughly ten times as many switches (20m co_switch calls a second for a PPU) would only be a ~50% slowdown, which is much lower than my previous calculations.
The good news about the state machine mode is that the actual S-DSP code is exactly the same for both versions -- it's all #define magic. I really don't like this change, because I like the ideal of having the whole system in separate threads, but since it doesn't affect accuracy at all, and speed has taken a beating lately, I'll let it pass. We can always remove the define when the median processor is a Core 2 or better.
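For anyone wondering what "#define magic" can look like, here is a minimal sketch of the general technique (a switch-based resumable function), not the actual bsnes code. The macro names `dsp_begin`, `dsp_tick`, `dsp_end` and the `ToyDSP` struct are made up for illustration; only `USE_STATE_MACHINE` comes from the post. In a threaded build the tick macro would presumably expand to a co_switch back to the SMP instead, which is left out here.

```cpp
#define USE_STATE_MACHINE  // directive named in the post; everything else below is hypothetical

#if defined(USE_STATE_MACHINE)
  // Resume at the last saved point, or at the top on first entry.
  #define dsp_begin() switch (state) { case 0:
  // Save the resume point, return to the caller, continue here on the next call.
  #define dsp_tick()  do { state = __LINE__; return; case __LINE__:; } while (0)
  #define dsp_end()   } state = 0
#else
  #error "the threaded (libco) variant is not part of this sketch"
#endif

struct ToyDSP {
  int state = 0;    // saved resume point (0 = start from the top)
  int step = 0;     // last step executed, for demonstration only
  int samples = 0;  // completed three-step cycles

  // Each call runs exactly one step, then yields back to the caller.
  // Locals must live in the struct, since the stack frame is lost on return.
  void enter() {
    dsp_begin();
    for (;;) {
      step = 1; dsp_tick();
      step = 2; dsp_tick();
      step = 3; samples++; dsp_tick();
    }
    dsp_end();
  }
};
```

The body between `dsp_begin()` and `dsp_end()` reads like straight-line threaded code, which is why the same source can serve both builds. Each `dsp_tick()` has to sit on its own line, because `__LINE__` is what makes the case labels unique.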
Legend:
left = v030, middle = v030 + OPT caching, right = v030 + OPT caching + S-DSP state machine
117.0 / 117.0 ( 0%) / 123.0 (+ 5%) zelda3
85.0 / 108.0 (+27%) / 112.0 (+31%) ct1
81.0 / 98.0 (+21%) / 103.0 (+27%) ct2
81.5 / ------------ / 90.0 (+20%) mk2
OR 256x256 or 512x512.......
I'm not going to create and destroy textures every time the user changes the video filter or the SNES game toggles hires. Direct3D shouldn't be uploading the entire 1024x1024 texture every single frame. I tell it to lock the video RAM, write the new data in, and unlock it. If D3D is stupid enough to try and cache the whole texture in system RAM to upload, then I don't know what to tell you.
What we need to do is track down why certain systems are getting this major speed hit, not rig in a fix of cutting down on the texture size.
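To make the lock / write / unlock point concrete, here is a rough sketch of the write step only, with no real Direct3D calls in it. `LockedSurface` is a hypothetical stand-in for whatever the lock hands back (a base pointer plus a row pitch), and the 16-bit pixel format and 256x224 frame size are assumptions for the example. It only shows how little of a 1024x1024 texture the application itself touches per frame; what the driver does with the rest is exactly the open question.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the result of locking a texture:
// a pointer to the first pixel and the size of one full texture row in bytes.
struct LockedSurface {
  uint8_t* bits;
  std::size_t pitch;
};

// Copy a width x height frame of 16-bit pixels into the top-left corner of
// the locked texture, one row at a time. Returns the number of bytes written.
std::size_t write_frame(const LockedSurface& dst, const uint16_t* src,
                        std::size_t width, std::size_t height) {
  const std::size_t row_bytes = width * sizeof(uint16_t);
  std::size_t written = 0;
  for (std::size_t y = 0; y < height; y++) {
    // Destination rows are pitch bytes apart; source rows are tightly packed.
    std::memcpy(dst.bits + y * dst.pitch, src + y * width, row_bytes);
    written += row_bytes;
  }
  return written;
}
```

With those assumed sizes, a 256x224 frame is 114,688 bytes written into a 2 MiB texture, so the app-side copy is about 5% of the full surface.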
I'm posting this now in case byuu makes another emergency release:
I'd like to get my latest changes out there, but I think I'll wait a while. Probably starting to annoy the news sites again with such quick releases.
The controller graphic looks great. I'll wait until you finish it (or are you done now?) before merging it in, though. Takes quite a while to encode it on my side.