Ultimately, I went with the same implementation snes9x and Super Sleuth use: in pseudo-hires mode, blend each even/odd pixel pair together and reduce the width back to 256. Obviously the SNES isn't doing this, so I'm wondering whether anyone has come up with a better formula for matching the TV output, or whether anyone wants to work with me on finding a better one than what is used now. I've already read all the old snes9x forum posts on this; the discussion seemed to just die with no real consensus on the best way to do it. Also, does anyone know what the problem with Kirby 3 is? I can't test the game myself to see if I have it right :/
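For anyone who wants to experiment, here is a minimal sketch of the even/odd blend described above. It is not the actual snes9x or Super Sleuth code. It assumes the line buffer holds 512 pixels in the SNES's 15-bit BGR format, and the function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Average two 15-bit BGR pixels channel by channel, rounding down.
   0x0421 selects the low bit of each 5-bit channel; subtracting the
   mismatched low bits first keeps a carry from leaking into the
   neighbouring channel when the sum is halved. */
static uint16_t blend_bgr555(uint16_t a, uint16_t b)
{
    return (uint16_t)(((a + b) - ((a ^ b) & 0x0421)) >> 1);
}

/* Collapse one 512-pixel line to 256 pixels by blending each
   even/odd pair (hypothetical buffer layout). */
static void blend_line_to_256(const uint16_t src[512], uint16_t dst[256])
{
    for (size_t x = 0; x < 256; x++)
        dst[x] = blend_bgr555(src[2 * x], src[2 * x + 1]);
}
```

This is a plain 50/50 average of the raw color values, which is exactly the part I'd like to find a better formula for.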
I'm also curious about true hires. I read somewhere that anomie mentioned hires is the same as pseudo-hires, and the renderer just "skips" every other pixel.
So a tile with pixels 0-7 would actually render only 0, 2, 4, and 6, to both the main and subscreens as appropriate.
However, I believe the SNES renders 0, 2, 4, and 6 to the mainscreen and 1, 3, 5, and 7 to the subscreen, so it really is rendering at 512 pixels, at least internally. Otherwise, blending the pixels back to 256 pixels would just make hires text in games look like ass.
Does anyone know for sure which method the SNES uses here?
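To make the two readings concrete, here is a rough sketch of both for a single 8-pixel tile row. The function names are hypothetical, and the main-first column order in weave() is just my guess from above, not confirmed hardware behaviour.

```c
#include <stddef.h>
#include <stdint.h>

/* Reading A ("skip"): both screens receive tile pixels 0, 2, 4, 6. */
static void split_skip(const uint8_t tile[8],
                       uint8_t main_px[4], uint8_t sub_px[4])
{
    for (size_t i = 0; i < 4; i++) {
        main_px[i] = tile[2 * i];
        sub_px[i]  = tile[2 * i];
    }
}

/* Reading B (my guess): mainscreen gets 0, 2, 4, 6 and the
   subscreen gets 1, 3, 5, 7. */
static void split_interleaved(const uint8_t tile[8],
                              uint8_t main_px[4], uint8_t sub_px[4])
{
    for (size_t i = 0; i < 4; i++) {
        main_px[i] = tile[2 * i];
        sub_px[i]  = tile[2 * i + 1];
    }
}

/* Rebuild the double-width row by alternating main and sub samples.
   Which screen lands on the even column is an assumption here. */
static void weave(const uint8_t main_px[4], const uint8_t sub_px[4],
                  uint8_t out[8])
{
    for (size_t i = 0; i < 4; i++) {
        out[2 * i]     = main_px[i];
        out[2 * i + 1] = sub_px[i];
    }
}
```

With reading B, weaving the two screens back together reproduces all eight tile pixels. With reading A, every odd pixel is lost and each even pixel is simply doubled, which is why I doubt it.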
Some screenshots for reference:

SD3 with pixels filtered back down to 256 width. Notice how much harder the text is to read. It appears cleaner on an NTSC TV, at least to me. Note that this is with the video card doing the resizing due to the window size; doing it in software, as with pseudo-hires, would be slightly blurrier.


With pseudo-hires enabled (left), with pseudo-hires off (right).
That's pretty close to the TV effect (the TV effect is a little brighter), but again, that emulation will screw up anything that actually relies on the increased resolution, such as text and high-resolution textures.

Another pseudo-hires example...