Try my Java SNES emulator? :)

spiller · Post by **spiller** » Mon Jul 13, 2009 7:32 am

A warm hello to everyone on the ZSNES boards!

A few months ago, I turned up here asking about the SNES CPU, saying I had begun a pet project to write a Java SNES emulator. I was pleasantly surprised to find the response encouraging.

When I started it I knew nothing about the SNES internals or Java, but had multi-packs of enthusiasm for my favourite games console. Some weeks later, I knew far too much about the SNES, hated Java, and my enthusiasm had run dry.

The project has been completely stagnant for seven weeks. I was defeated after a long and terrible battle against the PPU to get the transparency effects working. Not since the host of Valinor assailed Morgoth in Angband had so much blood been spilled on Middle-earth.

But I decided if Vampent can release a buggy emulator lacking in important features -- http://www.vampent.com/vsun.htm -- then why can't I?

So I've packaged the thing up with no modification. I'd really like people to try it. I'd appreciate all feedback and criticism. This is still a project I want to finish, if only I can get motivated again.

Here it is: jsnes-0.01-alpha.jar
*Edit*: Newer, improved version: jsnes-0.03.2-alpha.jar (478 kB)

Some things you should know:
- This particular release runs as a standalone Java application, not as a web applet.
- It's still quite primitive. E.g., no sound.
- The keys are hard-coded and not configurable (sorry!). Arrows control the directional pad. Number pad 2, 4, 6, 8 (cross shape), and also Z, A, X, S, control the correspondingly positioned buttons on the right-hand side of the SNES controller. Number pad / and * are L and R.
- The debugger's interface is buggy.
- Since only a couple of friends have tested it, it's not very polished yet. It's still set up mainly for development.
- Bicubic filtering seems to take an absurd 100 milliseconds per frame!(?) Dammit Java.
- Probably other stuff I can't remember right now.

Every little bit of feedback is extremely welcome.

Here are some quick screenies to tempt you:

gllt · Post by **gllt** » Mon Jul 13, 2009 7:43 am

this looks awesome, let me play with it in a bit and I will come and feedback

Post by **grinvader** » Mon Jul 13, 2009 9:56 am

That's pretty badass to have come this far in a few weeks already, you know.

Post by **Nach** » Mon Jul 13, 2009 11:07 am

spiller wrote:A warm hello to everyone on the ZSNES boards!

And an ice cold hello to you on this boiling hot day. (warm hello... like we need other hot things here)

And what the heck do you call this?

spiller wrote:

Why isn't Link bursting out of the screen? If you can't figure out how to do it, at least give us the tools to add our own textures to the game.

This emulator rates -15 out of 12 on the reverse metric scale of economics. The marketplace will determine whether they want an emulator which lacks cell shading, wireframe support, and double burst rate equalizer optimizations.

AamirM · Post by **AamirM** » Mon Jul 13, 2009 11:50 am

Well, gave it a try. Either Java is faster than I thought, or you coded that emulator quite excellently =P. Runs full speed for me.
"Al Unser Jr's Road to the Top" is quite bugged and unplayable.

spiller · Post by **spiller** » Mon Jul 13, 2009 12:30 pm

Thanks all.

Nach that was... entertaining? Do you want him anaglyphic?

Isn't he Zelda btw? The game's called Zelda? I suck at this.

Aamir that 'Road to The Top' game uses graphics mode 7 for the track. Top Gear suffers from the same problem. The mode 7 maths looked too daunting and I'm already stuck for choice between four 1/2 different versions of the plain renderer, so I haven't got around to mode 7 yet. It's one of many PPU features I've no idea how to implement.

creaothceann · Post by **creaothceann** » Mon Jul 13, 2009 1:00 pm

http://en.wikipedia.org/wiki/The_Legend ... o_the_Past

For some reason the emulator doesn't seem to accept any keys.

For transparency & BG modes see anomie's docs.

adventure_of_link · Post by **adventure_of_link** » Mon Jul 13, 2009 1:30 pm

spiller wrote:Isn't he Zelda btw? The game's called Zelda? I suck at this.

No, that would be Link, the guy you control throughout the whole Zelda game. The princess is Zelda. I know, it's retarded, but that's the way it is.

Either way, I'll have to give this jSNES a try sometime.

nintendo_nerd · Post by **nintendo_nerd** » Mon Jul 13, 2009 3:48 pm

Very impressive indeed, Super Mario World and Zelda 3 are running at 60 frames per second!

Post by **lordmissus** » Mon Jul 13, 2009 5:51 pm

I'm not surprised that this runs at full speed though, despite being a Java app (even if he did code it particularly well). He hasn't implemented sound, and since it's only a very "primitive" emulator, as spiller calls it, I'm guessing he hasn't implemented many of the more low-level communications between various hardware components in the SNES. Really impressive that you knew nothing about Java or SNES internals and then produced this weeks later though.

I'll be interested to see how this project turns out over time.

Post by **Nach** » Mon Jul 13, 2009 6:39 pm

spiller wrote: Nach that was... entertaining? Do you want him anaglyphic?

No, I just want the esteemed spiller joining our ranks to get used to such requests, since they'll come up again and again. At least now each time it does, he can remember his first one where it wasn't actually meant, came from a fellow developer trying to be funny, and laugh about it, instead of getting disheartened with stupid requests.

Also while we're at it, it seems you lack 4096MHz sound output, so I can't hear anything.

davideo7 · Post by **davideo7** » Mon Jul 13, 2009 6:49 pm

I'm very impressed. All 5 games I tested ran incredibly smoothly.

Games I tested were:
Diddy's Kong Quest
Goof Troop
Super Mario World
Virtual Bart
Chrono Trigger

Did notice a few little minor graphic issues and layer issues with CT, DKQ and VB but they weren't big enough to make the games unplayable.

This is a huge step though for a Java SNES Emulator.

Any plans on when Saving/Loading will work?

juef · Post by **juef** » Mon Jul 13, 2009 7:00 pm

Wow, I'm really impressed - even if you did this in a short period of time, you must have put quite some work into this. Any plan on making the source public?

creaothceann wrote:For some reason the emulator doesn't seem to accept any keys.

Same here.

I haven't tried many games due to above fact, but most of the non-special-chipped ones look pretty good. Keep up the good work!

snkcube · Post by **snkcube** » Mon Jul 13, 2009 8:28 pm

I tried out U.N. Squadron and the game runs at full speed. The only issue I see is weird looking lines appearing in the first stage of the game. Other than that, great job with the emulator so far.

AamirM · Post by **AamirM** » Mon Jul 13, 2009 8:29 pm

For some reason the emulator doesn't seem to accept any keys.

This happened to me after I loaded the second game.

spiller · Post by **spiller** » Mon Jul 13, 2009 8:42 pm

lordmissus wrote:Really impressive that you knew nothing about Java or SNES internals and then produced this weeks later though.

I owe it to the creators of the very fine documentation now available. They did the real work. (I'm not just sucking up to these boards -- I mean it! I couldn't have started this without the docs.)

Nach wrote:No, I just want the esteemed spiller joining our ranks to get used to such requests, since they'll come up again and again. At least now each time it does, he can remember his first one where it wasn't actually meant, came from a fellow developer trying to be funny, and laugh about it, instead of getting disheartened with stupid requests.

Also while we're at it, it seems you lack 4096MHz sound output, so I can't hear anything.

Thank you.

I've added that to my todo list.

creaothceann wrote:For some reason the emulator doesn't seem to accept any keys.

I'm so sorry!.. I wonder if this is a focus issue. Try clicking on the center of the window. I've added investigating this to my todo list.

creaothceann wrote:For transparency & BG modes see anomie's docs.

I've been lurking for a while and found Anomie's docs long ago. Almost my entire PPU is based on the recipes in them. They're much clearer than anything else.

I understand how most of it is supposed to work -- I just can't come up with a way to do it in Java. It's just too slow because array accesses are bounds-checked and it doesn't support structs. I remember looking at other emulator sources for inspiration (never to copy!) and found the following:
- ZSNES and Snes9x I couldn't understand and/or find the relevant code.
- bsnes's code was doing per-pixel z-order checks. Meaning no offense at all to byuu, your emulator source is wonderfully readable, but bsnes doesn't run full speed on my computer, so if I'm to make an emulator in Java, it has to have a much more efficient (or less accurate) PPU than yours.
- creaothceann's vSNES seems to do everything per-pixel. I don't think that would be fast enough for rendering in real time. I could be mistaken on how it works since I've never done Delphi.

davideo7 wrote:Games I tested were:
Diddy's Kong Quest
Goof Troop (*)
Super Mario World
Virtual Bart (*)
Chrono Trigger (*)

I've never tested three of those (indicated with *), so if they work at all, that's good news!

davideo7 wrote:Did notice a few little minor graphic issues and layer issues with CT, DKQ and VB

Yes, most issues in games that basically work are graphics related.

And, argh! The layering issue! This is something I battled with for ages. I still don't get it! Can anyone enlighten me a little?

This sounds stupid but I'm not even sure how many sprite layers there are supposed to be. Anomie's doc says there is just one but most other stuff I've read says there are four. I tried every combination of logic I could think of with the sprite priority rotation stuff and yet, by far, the method that worked the best and had the least stuff blinking out behind the backgrounds was to simply draw the sprites on the very top and ignore the sprite priorities entirely, so that's what it does.

It becomes obvious in Donkey Kong Country, e.g., when they go through the exit cave:

davideo7 wrote:Any plans on when Saving/Loading will work?

On my todo list.

juef wrote:Any plan on making the source public?

I hadn't given it much thought. I might make it public, though not yet. I'd like to keep the program under finer control at the moment while it's still in a state of flux and is lacking in documentation somewhat. I'm more than happy to share parts of code if anyone wants to see how things work.

creaothceann · Post by **creaothceann** » Mon Jul 13, 2009 10:16 pm

spiller wrote:
creaothceann wrote:For some reason the emulator doesn't seem to accept any keys.
I'm so sorry!.. I wonder if this is a focus issue. Try clicking on the center of the window. I've added investigating this to my todo list.

Now it works, thanks.

I was double-clicking the *.jar file, then minimizing the previously active window (Total Commander), then dragging jsnes at its title bar to the center of the screen.

spiller wrote:
creaothceann wrote:For transparency & BG modes see anomie's docs.
I've been lurking for a while and found Anomie's docs long ago. Almost my entire PPU is based on the recipes in them. They're much clearer than anything else.

I understand how most of it is supposed to work -- I just can't come up with a way to do it in Java. It's just too slow because array accesses are bounds-checked and it doesn't support structs.

Well, in vSNES the arrays are bounds-checked too, because I saw almost no speed difference between checking and not checking.

How are you doing it currently, if not via arrays? And you can probably still use classes instead of records.

spiller wrote:I remember looking at other emulator sources for inspiration (never to copy!) and found the following:
- ZSNES and Snes9x I couldn't understand and/or find the relevant code.
- bsnes's code was doing per-pixel z-order checks. Meaning no offense at all to byuu, your emulator source is wonderfully readable, but bsnes doesn't run full speed on my computer, so if I'm to make an emulator in Java, it has to have a much more efficient (or less accurate) PPU than yours.
- creaothceann's vSNES seems to do everything per-pixel. I don't think that would be fast enough for rendering in real time. I could be mistaken on how it works since I've never done Delphi.

Part of bsnes' slowdown comes from emulating the system at a fine resolution (cycle- instead of opcode-level).

You pretty much have no other choice than determining the layers pixel-by-pixel (because of transparent pixels and because SNES games can in some cases change the PPU while it draws a line). And the fastest way to do that is top-down, i.e. skipping the rest of the inner loop when you get a solid pixel.
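The top-down idea described here can be sketched roughly as follows. This is only an illustration and not vSNES code: the `Layer` interface, the `pixelAt` method, and the convention that 0 means "transparent" are assumptions made up for the example, and real layer order depends on the BG mode and tile priorities.

```java
// Hypothetical sketch of top-down per-pixel compositing (not taken from any emulator).
final class TopDownCompositor {
    /** Hypothetical layer: returns a color for a screen position, or TRANSPARENT. */
    interface Layer {
        int pixelAt(int x, int y);
    }

    static final int TRANSPARENT = 0;

    /**
     * Composes one scanline into out. layersFrontToBack[0] is the topmost layer.
     * The inner loop stops at the first solid pixel, so layers hidden behind it
     * are never sampled for that position.
     */
    static void renderLine(Layer[] layersFrontToBack, int y, int backdrop, int[] out) {
        for (int x = 0; x < out.length; x++) {
            int color = backdrop;
            for (Layer layer : layersFrontToBack) {
                int c = layer.pixelAt(x, y);
                if (c != TRANSPARENT) {
                    color = c;
                    break; // solid pixel found: skip everything behind it
                }
            }
            out[x] = color;
        }
    }
}
```

The early `break` is the whole point: the cost per pixel depends on how many layers are transparent at that position, not on how many layers exist.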

spiller wrote:
davideo7 wrote:Did notice a few little minor graphic issues and layer issues with CT, DKQ and VB
Yes, most issues in games that basically work are graphics related.

And, argh! The layering issue! This is something I battled with for ages. I still don't get it! Can anyone enlighten me a little?

This sounds stupid but I'm not even sure how many sprite layers there are supposed to be. Anomie's doc says there is just one but most other stuff I've read says there are four. I tried every combination of logic I could think of with the sprite priority rotation stuff and yet, by far, the method that worked the best and had the least stuff blinking out behind the backgrounds was to simply draw the sprites on the very top and ignore the sprite priorities entirely, so that's what it does.

See anomie's regs.txt, section "Sprite Priority".
Go through all pixels of the screen. For each pixel, the first non-transparent sprite with the lowest index (starting at [FirstSprite] and incrementing up to [(FirstSprite+127) AND 127]) at that position is used, even if sprites with a higher priority are behind it. The priority (0..3) of this sprite decides if it's behind the first visible background layer or not. The screen layer order is determined by BGMode, see section "2105" or section "Mode 0" and beyond in anomie's doc.
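A minimal sketch of that selection rule, for one position on one scanline. The `Sprite` class here is a hypothetical stand-in (a solid-color horizontal span) rather than a real OAM decoder, and packing the result as `(priority << 8) | color` is just a convenience for the example.

```java
// Hypothetical sketch of the "first non-transparent sprite wins" rule.
final class SpritePixelPicker {
    /** Simplified stand-in for a decoded sprite: a solid span on the current scanline. */
    static final class Sprite {
        final int x, width, color, priority;

        Sprite(int x, int width, int color, int priority) {
            this.x = x;
            this.width = width;
            this.color = color;
            this.priority = priority;
        }

        /** Returns the sprite's color at px, or 0 if it is transparent or absent there. */
        int colorAt(int px) {
            return (px >= x && px < x + width) ? color : 0;
        }
    }

    /**
     * Walks the 128 sprites starting at firstSprite, wrapping with AND 127.
     * The first sprite with a non-transparent pixel wins, regardless of the
     * priority of sprites later in the order. Returns (priority << 8) | color,
     * or -1 if no sprite covers the position.
     */
    static int pick(Sprite[] oam, int firstSprite, int px) {
        for (int i = 0; i < 128; i++) {
            Sprite s = oam[(firstSprite + i) & 127];
            int color = s.colorAt(px);
            if (color != 0) {
                return (s.priority << 8) | color;
            }
        }
        return -1;
    }
}
```

The priority of the winning sprite is then compared against the background layers for the current BG mode; sprites that lost the index race never get a chance, even if their priority is higher.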

For Mode7 you'll need to do things pixel-by-pixel too; there's a formula in the file for that.

byuu · Post by **byuu** » Mon Jul 13, 2009 11:09 pm

This looks really good, keep up the good work! And don't get too discouraged by the million things people will request. Just focus on what you want to do, and work at it slowly and steadily, and you'll get there before you know it.

Also note that no matter what you do add, it'll never be good enough in some peoples' eyes. Just remember that they aren't paying you anything, so don't let them get to you ;)

The super majority of us are very appreciative of your hard work, even if you don't hear it as often.

- bsnes's code was doing per-pixel z-order checks. Meaning no offense at all to byuu, your emulator source is wonderfully readable, but bsnes doesn't run full speed on my computer, so if I'm to make an emulator in Java, it has to have a much more efficient (or less accurate) PPU than yours.

There's nothing accurate about my PPU, either, except that it supports pretty much every effect (including hi-res math, pseudo hi-res mode, Mode7 EXTBG, mosaic Vcounters, and EXTBG mosaic+OAM size hardware glitches.)
It still renders an entire scanline at a time.

As for why I used a Z-depth check, it's because it's nearly required for many things.

For instance, color add/sub. The priorities come into play here. You could overwrite the previous value if you render in the exact order hardware does, but if you do that, you will have to split each BG and OAM priority pass apart, or interleave them all together into a gigantic function (which will no doubt just end up calling per-pixel variants for each BG). Right now I do all priority passes in one render_line_(bg/oam) call.
Remember: different tiles on the same line and the same BG can be different priorities, but different pixels on the same line and BG cannot be.

The Z-order check isn't the bottleneck, it only gets called when pixels overlap, which isn't all that often. 40% of my emulator time is spent inside bPPU::render_line_bg(), and that function has really heavy tile fetch+decode caching.

And thanks for saying that about the code. I try as hard as I can there, and was quite surprised when someone was complaining about it the other day. Figured it was probably just him. But if you have any critiques*, please do share.
(* and note that the PPU in particular is the oldest code left by many years. It's the worst thing in there now.)

spiller · Post by **spiller** » Tue Jul 14, 2009 1:01 am

byuu wrote:thanks for saying that about the code. I try as hard as I can there, and was quite surprised when someone was complaining about it the other day.

There are parts of it I couldn't follow, or maybe I just didn't try very hard. But it's by far the most readable SNES emu code I've seen. I've often had your source folder open as a reference. I always made a point not to copy anyone's methods or code, but it's been great for understanding how the SNES is supposed to work.

byuu wrote:But if you have any critiques*, please do share.

Well there was one thing that stood out. I hope I've not misunderstood, but bPPU::build_sprite_list(), which converts the sprites from their OAM representation, probably takes quite a bit of time. This is being called by render_line_oam_rto(), which is being called by render_scanline(). In other words, you're rebuilding all 128 sprites 224 times per frame. I just wondered why. I only rebuild them if OAM was modified (it tests if writes to $2104 actually changed anything) since the last time, which means it often rebuilds the sprites less than once per frame.
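The dirty-flag idea can be sketched like this. It is not code from jsnes or bsnes: the class and method names are invented, the sprite decode is reduced to a counter, and the real $2104 write latching is ignored (the sketch assumes the caller has already resolved the OAM address).

```java
// Hypothetical sketch: rebuild the sprite list only when OAM contents really changed.
final class OamCache {
    private final byte[] oam = new byte[544]; // 512-byte low table + 32-byte high table
    private boolean dirty = true;             // force one rebuild before first use
    private int rebuildCount = 0;

    /** Stores a byte into OAM and marks the cache dirty only if the value changed. */
    void write(int address, int value) {
        byte b = (byte) value;
        if (oam[address] != b) {
            oam[address] = b;
            dirty = true;
        }
    }

    /** Called once per scanline (or per frame); rebuilds only when needed. */
    void prepareForScanline() {
        if (dirty) {
            rebuildSpriteList();
            dirty = false;
        }
    }

    private void rebuildSpriteList() {
        rebuildCount++; // a real implementation would decode all 128 sprites here
    }

    int rebuildCount() {
        return rebuildCount;
    }
}
```

Comparing the old and new byte before setting the flag matters because many games rewrite OAM every frame with identical data.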

creaothceann wrote:You pretty much have no other choice than determining the layers pixel-by-pixel (because of transparent pixels and because SNES games can in some cases change the PPU while it draws a line). And the fastest way to do that is top-down, i.e. skipping the rest of the inner loop when you get a solid pixel.

Really?

It seems that we adopted radically different approaches to this. Maybe this is why I can't figure out anyone else's code.

For example, you have a GetPixel function that does a for loop over the background and sprite layers. Then GetPixel_BG offsets by the background position, tests the background mode, calculates the offset to and reads the tilemap word, and finally tests the tile priority before it writes the pixel. I can't figure out how this handles transparent pixels and subscreens and layering, but I can see there's a huge amount of work that is done per pixel.

This is almost the exact reverse of what I've done. Let me explain so people can see why I'm having such extraordinary trouble with this. I have a scanline buffer made of plain pixels (no depth information). It's magically filled with the backdrop. Then for the up to 2x4 background layers it renders them from back to front into the buffer. Offsets are calculated once in advance and held in variables. The tilemap word is read only twice per background. The tile scanline renderer tests the priority on and blits all eight pixels at once, which are packed into two 32-bit ints in the tile cache (to reduce array bounds checks in exchange for bit shifting and masking). (Does anybody follow what I mean, because I'm not sure I do.) Extremely little is tested per pixel, with as much as possible calculated in advance and done in large groups. This is how I managed to get screen generation down to 1 to 2 milliseconds per frame (to compensate for Java's very slow blitting of the frames).

+EDIT: Here's the overview of it: http://pastebin.com/m138422fe
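To make the "eight pixels packed into two 32-bit ints" idea concrete, here is a small sketch. It is not the pastebin code and not necessarily how jsnes lays things out: the byte order, the `pack` and `blit` names, and the use of one byte per palette index are assumptions for illustration.

```java
// Hypothetical sketch of a tile-row cache entry packed into two ints.
final class PackedTileRow {
    /** Packs eight 8-bit palette indices: pixels 0-3 into [0], pixels 4-7 into [1]. */
    static int[] pack(int[] indices) {
        int lo = 0;
        int hi = 0;
        for (int i = 0; i < 4; i++) {
            lo |= (indices[i] & 0xFF) << (8 * i);
            hi |= (indices[i + 4] & 0xFF) << (8 * i);
        }
        return new int[] { lo, hi };
    }

    /**
     * Writes the eight pixels to line[x..x+7], skipping index 0 (transparent).
     * Only two cached ints are read; the rest is shifting and masking.
     * The caller must guarantee that x..x+7 lies inside the line buffer.
     */
    static void blit(int lo, int hi, int[] palette, int[] line, int x) {
        for (int i = 0; i < 4; i++) {
            int a = (lo >>> (8 * i)) & 0xFF;
            if (a != 0) {
                line[x + i] = palette[a];
            }
            int b = (hi >>> (8 * i)) & 0xFF;
            if (b != 0) {
                line[x + 4 + i] = palette[b];
            }
        }
    }
}
```

Because layers are drawn back to front, a later blit simply overwrites earlier pixels, which is why no depth value is stored per pixel in this approach.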

But I don't know how I'm going to do windows, offset per tile mode, mosaic, sub-screens, etc. I almost wanted to delete the source because it made me so angry, discovering more and more and more impossibly complicated features on a 19 year-old chip. It's why I gave up before, and I still have no inspiration.

I also still don't get the sprites. I've tried that before. I've read Anomie's explanation a hundred times. It's impossible to get a clear screenshot, but what happens is the sprites are mostly stuck behind the backgrounds, but often only for some scanlines, so it makes a flickering liney effect.

creaothceann · Post by **creaothceann** » Tue Jul 14, 2009 1:51 am

spiller wrote:
creaothceann wrote:You pretty much have no other choice than determining the layers pixel-by-pixel (because of transparent pixels and because SNES games can in some cases change the PPU while it draws a line). And the fastest way to do that is top-down, i.e. skipping the rest of the inner loop when you get a solid pixel.
Really? It seems that we adopted radically different approaches to this. Maybe this is why I can't figure out anyone else's code.

For example, you have a GetPixel function that does a for loop over the background and sprite layers. Then GetPixel_BG offsets by the background position, tests the background mode, calculates the offset to and reads the tilemap word, and finally tests the tile priority before it writes the pixel. I can't figure out how this handles transparent pixels and subscreens and layering, but I can see there's a huge amount of work that is done per pixel.

GetPixel_BG returns false if the background is transparent at the current position. So when the background is clipped or the tile has the wrong priority or the pixel is transparent (see the tmp_PalIndex check), the function exits.
Yeah, it could've used a comment or two - but most of the functions require "inside knowledge" like that. It's even not very efficient - I go through the backgrounds twice, once for each priority.
Subscreens and color math are done by the large "Update_" functions (Update_HiRes, Update_LoRes512, Update_LoRes256). They're slow too, because they generate pictures for an 8-bit screen and a 16-bit screen at the same time (for the GUI), and both screens can use mainscreen and subscreen pixels.

spiller wrote:I have a scanline buffer made of plain pixels (no depth information). It's magically filled with the backdrop. Then for the up to 2x4 background layers it renders them from back to front into the buffer. Offsets are calculated once in advance and held in variables. The tilemap word is read only twice per background. The tile scanline renderer tests the priority on and blits all eight pixels at once, which are packed into two 32-bit ints in the tile cache (to reduce array bounds checks in exchange for bit shifting and masking). (Does anybody follow what I mean, because I'm not sure I do.) Extremely little is tested per pixel, with as much as possible calculated in advance and done in large groups. This is how I managed to get screen generation down to 1 to 2 milliseconds per frame (to compensate for Java's very slow blitting of the frames).

+EDIT: Here's the overview of it: http://pastebin.com/m138422fe

But I don't know how I'm going to do windows, offset per tile mode, mosaic, sub-screens, etc. I almost wanted to delete the source because it made me so angry, discovering more and more and more impossibly complicated features on a 19 year-old chip.

I also started out generating the backgrounds and sprite layers first - i.e. bottom-up. (You can still download the vSNES version that did this, because it enabled a cool feature - repeating smaller backgrounds to generate a really large screen. That was requested by a user for Gunman's Proof.)

In the end though I gave up on that, because that's not how the hardware works. Actually all these concepts of "backgrounds" and other high-level things are just theoretical stuff. The PPU just uses a bunch of registers to get the offsets of some bytes in some buffers according to some quirky rules, and generates a pixel from that. Rinse, repeat for the next pixel.

Take the Mode7 transformation, for example. It starts at a given screen position (x|y), and uses the Mode7 registers to "bend" this position onto a specific VRAM pixel. And that's how the other background modes work too, especially the Offset-Per-Tile transformation.
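As a rough sketch of that "bending", the usual affine form of the Mode 7 mapping looks like the code below. This is a simplification based on the commonly documented formula: the method name and parameter names are invented, the matrix values are assumed to be signed 8.8 fixed point, and the hardware's intermediate clipping and its non-wrapping screen-over modes are ignored.

```java
// Hypothetical, simplified sketch of the Mode 7 screen-to-playfield mapping.
final class Mode7Map {
    /**
     * Maps screen position (sx, sy) to a position on the 1024x1024 playfield.
     * a, b, c, d are the matrix entries in signed 8.8 fixed point (0x100 == 1.0);
     * hofs/vofs are the scroll offsets and (cx, cy) is the center of the transform.
     * Returns {vx, vy}, wrapped to the playfield size.
     */
    static int[] map(int sx, int sy, int a, int b, int c, int d,
                     int hofs, int vofs, int cx, int cy) {
        int dx = sx + hofs - cx;
        int dy = sy + vofs - cy;
        int vx = ((a * dx + b * dy) >> 8) + cx; // drop the 8 fractional bits
        int vy = ((c * dx + d * dy) >> 8) + cy;
        return new int[] { vx & 1023, vy & 1023 };
    }
}
```

With an identity matrix (`a = d = 0x100`, `b = c = 0`) and zero offsets the mapping returns the screen position unchanged; a renderer could also step `a` and `c` incrementally along a scanline instead of multiplying per pixel.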

Post by **badinsults** » Tue Jul 14, 2009 3:29 am

I just tried it out on some games I recently dumped. My PAL games had many graphical glitches. I assume that PAL mode is not supported yet?

byuu · Post by **byuu** » Tue Jul 14, 2009 7:14 am

I hope I've not misunderstood, but bPPU::build_sprite_list(), which converts the sprites from their OAM representation, probably takes quite a bit of time.

It takes about 8% of emulation time in total. I optimized it to insane degrees with the Mednafen author, and it didn't help performance any, but I was still computing it once every scanline.

You're right that a $2104 dirty flag would be a good idea in this case. And to that aim, probably for window mask tables as well. I'll give it a try, thank you.

Then for the up to 2x4 background layers it renders them from back to front into the buffer.

Isn't it expensive to run over the entire scanline twice for each background, and four times for each OAM priority? I figured that would be worse than an if(z > pixel[x].z) { pixel[x].pal = p; pixel[x].z = z; } only when writing.
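For readers following along in Java, the depth test quoted above might look like this. Since Java has no structs, the sketch uses two parallel `int` arrays instead of an array of pixel objects; the class name, the field names, and the choice of 0 as the backdrop depth are assumptions, not bsnes code.

```java
// Hypothetical sketch of a z-tested scanline buffer using parallel arrays.
final class ZLine {
    final int[] pal; // palette index per pixel
    final int[] z;   // depth of whatever currently owns the pixel

    ZLine(int width) {
        pal = new int[width];
        z = new int[width];
    }

    /** Resets the line to the backdrop at depth 0, the lowest possible depth. */
    void clear(int backdropPal) {
        java.util.Arrays.fill(pal, backdropPal);
        java.util.Arrays.fill(z, 0);
    }

    /** Writes the pixel only if it is in front of what is already there. */
    void plot(int x, int p, int depth) {
        if (depth > z[x]) {
            pal[x] = p;
            z[x] = depth;
        }
    }
}
```

With this layout each layer can be rendered in a single pass and in any order, at the cost of one extra array and one compare per non-transparent pixel.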

Extremely little is tested per pixel, with as much as possible calculated in advance and done in large groups.

Definitely not going for a cycle-renderer, I see ;)

This is how I managed to get screen generation down to 1 to 2 milliseconds per frame

That's really exceptional, speed wise. I hope you can find a solution with your approach.

But I don't know how I'm going to do windows, offset per tile mode, mosaic, sub-screens, etc.

Little by little. I'm not one to talk about speed, but for me it worked a hell of a lot better to not focus on speed at all on the initial implementation, and then go back and add things like the tiledata decode cache, tilemap cache fetching, RTO cache fetching, etc.

Trust me, those speed hacks based on faulty assumptions will bite you in the ass when you go back to add things like mosaic and RTO.

I almost wanted to delete the source because it made me so angry, discovering more and more and more impossibly complicated features on a 19 year-old chip. It's why I gave up before, and I still have no inspiration.

I know the feeling. About the only advice I have is that the more difficult things get, the more it makes figuring it out in the end that much more rewarding. This is probably the wrong thing to say, but I've found the S-CPU to be just as much of a nightmare when you're trying to get 100% compatibility. The closer you get to perfection, the more random games break. And not the good games you love, but the absolute shit games like Jumbo Ozaki no Hole in One, Super Power Rangers Battle Whateverthefuck, Toy Story, Bugs Bunny, etc. The PPU is extremely forgiving in regards to timing compared to the other chips.

Of course, if you're happy with 99% compatibility sans those special chips, and maybe a few small per-game tweaks for 100%, you're really most of the way there already :)

I do hope that you don't give up, but I won't be selfish enough to say to work on it no matter what. If you don't have the motivation, that's a real problem. It may be best to put it aside for a little while and see how you feel then. Just don't be rash and delete the source. And keep a backup somewhere just in case ;)

I also still don't get the sprites. I've tried that before. I've read Anomie's explanation a hundred times.

I still don't get his FirstSprite+Y notes, but I've never seen a game use it, and in fact even my tests to trigger it haven't worked. I always see the same results in my emulator.

To me the hardest part to understand was the way the final pixel is generated once you know what each BG and OAM pixel should be. The main+sub, window+color-window, priorities, color math both on fixed colors and other layers, special rules for the background, etc etc were so complicated that I literally only solved it by trial and error over several weeks.

It's really tough to explain this stuff in layman's terms. There are so many millions of caveats and edge cases that you can't really explain anything 100% without spanning several pages or relying on the readers' knowledge :/

Which leads us to the problem with source code comments:

Yeah, it could've used a comment or two - but most of the functions require "inside knowledge" like that.

I can never think of a comment that won't be either so terse as to be pointless, or so verbose that the comments will be 5-10x larger than the code, making the functions as a whole very difficult to read.

spiller · Post by **spiller** » Tue Jul 14, 2009 8:04 am

badinsults wrote:I just tried it out on some games I recently dumped. My PAL games had many graphical glitches. I assume that PAL mode is not supported yet?

PAL games are fully supported (as much as NTSC, anyway). Many games have graphical glitches, period, due to PPU things that aren't implemented and PPU things that are broken.

But if any game is broken and it's not graphics-related and it's not a special chip one, I'd like to know about it, no matter how minor the glitch.

creaothceann wrote:Actually all these concepts of "backgrounds" and other high-level things are just theoretical stuff. The PPU just uses a bunch of registers to get the offsets of some bytes in some buffers according to some quirky rules, and generates a pixel from that. Rinse, repeat for the next pixel.

Take the Mode7 transformation, for example. It starts at a given screen position (x|y), and uses the Mode7 registers to "bend" this position onto a specific VRAM pixel. And that's how the other background modes work too, especially the Offset-Per-Tile transformation.

I really liked this interpretation. I'm adding that to my list of notes.

byuu wrote:Isn't it expensive to run over the entire scanline twice for each background, and four times for each OAM priority? I figured that would be worse than an if(z > pixel[x].z) { pixel[x].pal = p; pixel[x].z = z; } only when writing.

Good point. I fretted about merging the scanline backdrop filling with the copying from the scanline buffer into the screen buffer to save a run over the buffer, yet didn't stop to think that I was running over it eight times for the backgrounds and once more for the sprites (drawn on top). At least it's only 1024 bytes and can be held in even the tiniest cache.

But this and creaothceann's success with a layering per pixel top-down approach instead of a layering per scanline bottom-up one have left me curious to try it. Of course, it means abandoning hundreds of lines of incredibly finely tuned blitting code that took days to optimize, but oh well. I just can't see a final solution with the current approach.

"Four times for each OAM priority"???! Does this mean that [my interpretation of] Anomie's explanation is wrong and there really are four sprite layers and not one like it says?

ARGH!

byuu wrote:I've found the S-CPU to be just as much of a nightmare when you're trying to get 100% compatibility.

Happily I haven't encountered this problem much yet. I've not tested many games but all the ones I have tested have been completely forgiving of my borky timings. Problems are mainly graphics related. The only MAJOR timing problem at the moment is that I can't always handle an interrupt scheduled for the first 60 or so clocks of a frame. Interrupts scheduled for clock 0 never fire. That messes up a couple of "absolute shit games".

byuu wrote:It may be best to put it aside for a little while and see how you feel then.

It wouldn't be! I have more than a hundred unfinished projects collected over years. I put so much work into my operating system. That had its crippling battle too -- trying to understand how to set up the virtual page tables when they're stored in memory that's being translated by the virtual page tables. It was much more complicated than that, but anyway. Point is it's been completely untouched for a year since then. Putting things aside won't help [me].

Thanks once again for a long and detailed reply byuu. I'm not looking forward to going to war against the PPU again, but I have no choice so I'll do it anyway.

Post by **grinvader** » Tue Jul 14, 2009 9:22 am

creaothceann wrote:Part of bsnes' slowdown comes from emulating the system at a fine resolution (cycle- instead of opcode-level).

The s-cpu part is WAY sub-cycle.

spiller · Post by **spiller** » Tue Jul 14, 2009 9:25 am

I've confirmed that a per pixel top-down style render is out of the question. It wasn't even a competition. Even not properly working and not supporting modes apart from 0, or background tile priorities, or 16-pixel tiles, or more than 2 layers, or sprites, or any other logic at all, it was just appallingly slow. Given the extraordinary amount of per pixel work that was required I'm not remotely surprised. But now I have no ideas left.
