bsnes v0.034 released
I'd love to see a larger sample size of people's lowest working latencies. So far there's 75 (byuu), 75 (me), and 80 (firebrand). If we don't get anything above 125, I think the range needs to be changed to 25-125, a tidy 100ms span. That also puts 25 at the low end, 75 in the middle, and 125 at the top, all multiples of 25. That makes more sense than 20 at the low end, 110 in the middle, and 200 at the top, with a 180ms span.
Diminish, your issues aren't related to latency, since you get crackling regardless. Perhaps the new CPU priority changes are conflicting with your laptop's power-saving features.
-
- Regular
- Posts: 347
- Joined: Tue Mar 07, 2006 10:32 am
- Location: The Netherlands
FitzRoy wrote:Diminish, your issues aren't related to latency, since you get crackling regardless. Perhaps the new CPU priority changes are conflicting with your laptop's power-saving features.
I get more crackling when latency is lower. As for power-saving features, I have PowerMizer set properly, and I verified with CPU-Z that the CPU runs at full speed with bsnes (2000-2200 MHz; there's a 'temporal o/c' function on mobile C2Ds with an additional 11x multiplier on my model, standard is ~2000 MHz). The various C2D sleep states configurable in RMClock shouldn't be responsible, because they really only affect idle time. I'm not aware of any other power-saving features on WinXP. My hypothesis is that Realtek simply sucks.

Last edited by diminish on Tue Aug 19, 2008 6:29 pm, edited 1 time in total.
-
- Trooper
- Posts: 376
- Joined: Tue Apr 19, 2005 11:08 pm
- Location: DFW area, TX USA
- Contact:
The best way to test is to find a game with smooth horizontal scrolling. A saved spot in Super Metroid works great, and I often also use the first stage in Super Turrican. I basically kill off any enemies in my way and then run back and forth across the level to watch for jumps.
For me, latency controls "crackling" while input frequency controls "jumps".
I wouldn't recommend testing with games like Mario Kart, because it's harder to tell if there was a jump. A side-scroller reveals jumps much more noticeably.
Also byuu, my latency (and my input frequency) values are identical to the previous WIP, so I did not gain any latency advantage from this version.
-
- Trooper
- Posts: 394
- Joined: Mon Feb 20, 2006 3:11 am
- Location: Space
byuu wrote:FitzRoy, what SNES input adjust did you need in the new WIP? I recall you needed a really different number than the rest of us.
I can do -70 in the new terms. Not much different from before, so I'm not sure what you're thinking of. I think it was the debate over choosing the max as the default, so that people wouldn't (a) get crackling and (b) then be faced with two directions and no idea which would fix their crackling. Whereas you were contending that the added dupe frames, which I could barely see, would hurt impressions more.
I will just update this post with test results as they come:
me: 75ms, -70 frequency
byuu: 70ms, -50 frequency
FirebrandX: 80ms, -130 frequency
Fes: 75ms, -170 frequency
Currently thinking that 125 is too conservative a max/default. Changing my recommendation to a 25-100ms range with 100ms as the default.
Last edited by FitzRoy on Wed Aug 20, 2008 5:45 am, edited 4 times in total.
Been away for a few days and I come back to find this awesome breakthrough. :)
Here's my settings using WIP 05 with mario allstars + world:
Output freq: 48000
Input skew: -168
Latency: 75
Works fantastically now, great job!
A quick hunch about lowering latency... This build still uses maximum drift, right? If so, I wonder if tightening that up a little might have an observable impact on minimum latency. Not necessarily cycle-for-cycle sync, but maybe having the sound core catch up at the end of every frame or so. As it stands now, latency under a tenth of a second is already pretty good and barely noticeable, but if there's an easy change that pushes it down slightly, it might help, especially for those who can't quite crack the 100ms barrier at present.
EDIT: Another idea to improve sync. It just struck me that now that there's an easily configurable input rate, and the ability to detect both underflows and duped frames, I think the makings for "perfect" sync are already in place. You could have an "adaptive sync" toggle that would wait for duped frames or underflows. Once either happens, the skew is nudged either left or right as appropriate.
The nudging could be fine-tuned to be proportional to the amount of time since the last "incident", so that as it converges toward the ideal value, the changes become less pronounced. The resulting input skew should then be saved as the user's current setting, so when they start again they retain the benefits of previous runs, but will still dynamically adapt if they get a new monitor or something.
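To make that concrete, here's a rough sketch of the controller I'm picturing. None of it is bsnes code and every name in it is made up; it just shows the skew being nudged in opposite directions for underflows versus duped frames, with the nudge shrinking the longer the emulator ran cleanly:
Code: Select all
// Rough sketch of the "adaptive sync" idea -- not bsnes code, all names here
// are hypothetical. Assumes the emulator reports audio underflows and duped
// video frames to this controller once per incident.
#include <algorithm>
#include <cstdint>

struct AdaptiveSkew {
  double skew;                 // current input frequency adjust (e.g. -200..0)
  double step     = 4.0;       // initial nudge size, in Hz
  double min_step = 0.25;      // don't shrink the correction below this
  int64_t last_incident = 0;   // frame counter at the last underflow/dupe

  // Audio underflow: samples are being consumed faster than they're produced,
  // so nudge the effective input rate upward.
  void on_underflow(int64_t frame) { nudge(frame, +1.0); }

  // Duplicated video frame: audio is holding video back, so nudge downward.
  void on_duped_frame(int64_t frame) { nudge(frame, -1.0); }

private:
  void nudge(int64_t frame, double direction) {
    // The longer we ran cleanly, the closer we already are to the ideal value,
    // so make the correction proportionally smaller.
    int64_t clean = frame - last_incident;
    double scale = 1.0 / (1.0 + clean / 600.0);   // ~600 frames = 10 seconds
    step = std::max(min_step, step * scale);
    skew += direction * step;
    last_incident = frame;
    // The caller would persist 'skew' to the config so the next run starts
    // from the converged value.
  }
};
The step sizes and the 600-frame constant are arbitrary; the real tuning would have to come from testing.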
Yeah, it's from the timing changes in base emulation. There was a flickering line before, now it's mutated into something solid. Read the FAQ concerning bugs in special chip games.
On a side note, when I went back to use v034, vsync was enabled since it shares the cfg, and of course it doesn't work in that version. Kind of annoying for backtesters. Damn cfg separation, I wanna kill it.
Rhapsody wrote:I actually still don't know what the input frequency setting does. It defaulted to +175, I set it to 0, spent 20 minutes going sideways as the Princess in SMK, and have no sound problems to report. What am I looking for?
Be sure to enable both syncs under Emulation Speed in the menu. After that, please give updated results.
On a side note, when I went back to use v034, vsync was enabled since it shares the cfg, and of course it doesn't work in that version. Kind of annoying for backtesters. Damn cfg separation, I wanna kill it.
Keep your retro versions of bsnes in separate folders. Run one, then copy docs&settings\yourname\appdata\.bsnes\bsnes.cfg into that separated bsnes folder. The emulator will then use that config file instead. Do it for both versions, and there will be no issues with different versions screwing with each other's config file settings.
Also byuu, my latency (and my input frequency) values are identical to the previous WIP, so I did not gain any latency advantage from this version.
Your skew was > 32040 in the last version, and my changes inverted the meaning of that value. I'm not sure how, because the old code was basically black magic, whereas I understand quite clearly how the new values are derived. But it shouldn't be possible to get perfection with the same exact settings :/
It just struck me that now that there's an easily configurable input rate, and the ability to detect both underflows and duped frames, I think the makings for "perfect" sync are already in place.
That is correct, but there are a lot of tough issues with such a setup.
One, the CPU<>SMP drifting would have to go: a 15% speed penalty. With it, the variances would keep making the skew adjust itself, thinking big things are happening. I'll try lowering it, by the way, to see how low I can get it without taking more than a 1-2% speed loss.
Two, maintaining the value would be hard. What if a virus scanner / background task suddenly spikes the CPU? Vista on its own seems to incur random 1-5 second slowdowns every few minutes. It would throw off the averager big-time. And getting it perfect requires bounces in both directions for a really long time. They may not see perfection for several more seconds / minutes after a CPU spike. Averaging over more time could create similar problems, too.
Three, even if imperceptible, it really does cause pitch changes. The higher sampling rate at a fixed ratio means no signal is lost. If we start adapting it in real-time, the signal will change.
All that said, this is what I initially envisioned, so I will work on this anyway. Just not now; I'm really burned out on all this again. If sinamas and blargg found this difficult, I don't have much hope for myself, when I could barely get standard sync working. I need to take a break, but I still have to get those mul / div logs first.
-
- Trooper
- Posts: 376
- Joined: Tue Apr 19, 2005 11:08 pm
- Location: DFW area, TX USA
- Contact:
byuu wrote:Your skew was > 32040 in the last version, and my changes inverted the meaning of that value. I'm not sure how, because the old code was basically black magic, whereas I understand quite clearly how the new values are derived. But it shouldn't be possible to get perfection with the same exact settings :/
I know my skew was >32040 on the previous version; I was talking about the value being the same amount, since all that has changed is the reversal of the calculation.
At any rate, the same values work best on my system. That being 80ms latency and 130 skew (positive on previous version, negative on current version). I don't know why it stayed the same values, but I've confirmed they are the "sweet spot" for my system.
Here's before and after data for dropping the maximum skew.
Code: Select all
119.5 to 118.5 / Zelda 3
93.5 to 93.0 / Star Ocean
250k * 20m : 5000000000000 : 203196 : 264
20k * 24m : 480000000000 : 19506 : 25
Or in English, the skew allowed the sample count to be off by up to 264 samples in either direction at any point in time. The worst-case scenario would be the CPU running the full gamut ahead, and then the SMP doing the same in return as it gains control, forcing the DSP to dump 264*2 samples out immediately during the same video frame (there are roughly 533 samples in an emulated frame). Now it can only vary by 25 per second in either direction, with a worst case of 50 samples in one frame. The precision increased by over an order of magnitude for a speed penalty of roughly half of one percent. I suppose that's a decent enough trade-off. The speed hit for lowering the skew grows exponentially, so we shouldn't push it further.
For those thinking I should just sync the CPU and SMP at the edge of a video frame, it's not that easy. They run until one is ahead of the other, and only the CPU's enslaved PPU can signal frame generation. To support this, I would have to add a new flag to the SMP to let it run until the skew was equal. And trust me, the fewer checks inside the raw timing control mechanisms, the better. You incur massive speed penalties even from simple boolean variables when it's code that's hit ~10-20 million times a second.
Again, note that this has absolutely nothing to do with accuracy. The chips sync up whenever they try to talk to each other. This is only about the time they spend off doing their own thing; if they don't look at the other chip, they can't possibly know they aren't running at the same point in time.
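If it helps to visualize, here's a stripped-down sketch of the general idea with made-up names. It isn't my scheduler code (the real thing runs each chip on a cooperative thread and scales clocks per chip); it only shows the drift bookkeeping in isolation:
Code: Select all
// Stripped-down illustration of bounded-drift scheduling between two chips.
// NOT actual bsnes code; every name here is hypothetical. Each chip advances
// a shared drift counter and yields once it gets too far ahead, or
// immediately when it needs to talk to the other chip.
#include <cstdint>

struct DriftScheduler {
  int64_t drift = 0;        // >0: CPU is ahead of the SMP, <0: SMP is ahead
  int64_t max_drift;        // smaller bound = tighter sync, more switches

  explicit DriftScheduler(int64_t bound) : max_drift(bound) {}

  // The CPU calls this after each step it emulates.
  void cpu_step(int64_t ticks) {
    drift += ticks;
    if (drift > max_drift) switch_to_smp();
  }

  // The SMP calls this after each step it emulates.
  void smp_step(int64_t ticks) {
    drift -= ticks;
    if (drift < -max_drift) switch_to_cpu();
  }

  // Any CPU<>SMP port access forces a hard sync first, regardless of the
  // bound -- which is why the bound affects speed, never accuracy.
  void cpu_talks_to_smp() {
    if (drift > 0) switch_to_smp();
  }

  void switch_to_smp() { /* e.g. co_switch(smp_thread) in a cothread design */ }
  void switch_to_cpu() { /* e.g. co_switch(cpu_thread) */ }
};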
This will probably be the last public WIP, so get it now if you want it.
I used the same "create a child window inside the output window" trick for Xv that I used for OpenGL, so Xv will now work even with a compositor enabled.
I also added Video::Synchronize support to OpenGL for Windows. My card seems to force it on regardless of my driver settings, but maybe you'll have better luck. That driver had the same issue with allocating 16MB of memory instead of 4MB (that was due to copy and pasting of code), so that's fixed too.
This version lowers the CPU<>SMP drifting by an order of magnitude. You shouldn't notice the speed hit. I can't really get any lower latency with that, though.
I also restricted the latency range to 25 - 175, with the default being in the center, 100ms. Quite conservative, given the average we see is 70-80ms. But you won't notice the difference, and this way we ensure no popping even in exceptional circumstances by default. 25ms is doable without video sync and with OSS4+cooked mode, but I seriously doubt any Windows user will get lower without something crazy going on with the sound card drivers.
Lastly, I've replaced the 2-tap linear resampler with a 4-tap hermite resampler. You won't be able to tell the difference, but it's quite pronounced if you use a waveform analyzer on much higher output frequencies:
Linear: [waveform analyzer screenshot]
Hermite: [waveform analyzer screenshot]
Hermite is essentially better than cubic (of which cubic spline is an optimized version), as it is better at staying close to the sample points, so you get a bit less clamping in the extreme cases. But the difference isn't audible to humans anyway. It's still clearly inferior to band-limited interpolation, as it will still alias noticeably on things like square waves, but it's orders of magnitude less complex to implement.
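For the curious, the kernel itself is tiny. This is the standard 4-point, third-order Hermite (Catmull-Rom) form written out as a stand-alone sketch, not lifted from my source, so the coefficients may not match mine exactly:
Code: Select all
// Standard 4-tap, 3rd-order Hermite (Catmull-Rom) interpolation between
// samples y1 and y2, with y0/y3 as the neighboring taps and mu in [0,1].
// Stand-alone sketch, not copied from the bsnes source.
double hermite_4tap(double mu, double y0, double y1, double y2, double y3) {
  double c0 = y1;
  double c1 = 0.5 * (y2 - y0);
  double c2 = y0 - 2.5 * y1 + 2.0 * y2 - 0.5 * y3;
  double c3 = 0.5 * (y3 - y0) + 1.5 * (y1 - y2);
  return ((c3 * mu + c2) * mu + c1) * mu + c0;
}

// Linear interpolation only ever looks at y1 and y2, which is why its output
// shows hard corners on a waveform analyzer at high output rates.
double linear_2tap(double mu, double y1, double y2) {
  return y1 + (y2 - y1) * mu;
}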
Keep in mind that nobody could tell the difference even with linear interpolation from the last few WIPs.
----------
Aside from that, I'm pretty much ready to release a new version. If anyone has any show stoppers, now is the time to say something. Otherwise I'll probably post something tomorrow or Friday.
Code: Select all
http://byuu.cinnamonpirate.com/temp/bsnes_v034_wip06.zip
-
- Seen it all
- Posts: 2302
- Joined: Mon Jan 03, 2005 5:04 pm
- Location: Germany
- Contact:
Not related to the next release or the near future, but how about these features...
1. AVI support? SNES9x' code seems to be ok, and it'd help with bug reports.
2. A filter that takes the current frame, combines it with the last frame (with adjustable ratio) and displays the result? Would help with flickering effects (player gets hit etc.)
EDIT:
3. Hiding the mouse cursor when it's in the window / the emulation area?
vSNES | Delphi 10 BPLs
bsnes launcher with recent files list
1. AVI support? SNES9x' code seems to be ok, and it'd help with bug reports.
Something akin to ZSNES and mencoder hooking would be doable. I wouldn't implement anything platform-specific, eg DirectShow or somesuch, to record AVIs. And writing my own AVI compressor, yeah, not happening.
2. A filter that takes the current frame, combines it with the last frame (with adjustable ratio) and displays the result? Would help with flickering effects (player gets hit etc.)
Dynamic frame selection added to the new version takes care of flickering already. Funny story though: while adding vsync to D3D, I tried to make it perform the Begin+EndScene calls at refresh time, and only call Present when drawing. This created a really neat effect where it'd show one frame, then a frame two before it, then two ahead, then back to normal.
Too much hassle to make it an emulator option, but if I did, I'd call it the Michael J Fox filter.
3. Hiding the mouse cursor when it's in the window / the emulation area?
In the future, sure.
-
- Regular
- Posts: 347
- Joined: Tue Mar 07, 2006 10:32 am
- Location: The Netherlands
creaothceann wrote:2. A filter that takes the current frame, combines it with the last frame (with adjustable ratio) and displays the result? Would help with flickering effects (player gets hit etc.)
This can be done using shaders. If I get the chance I'll see if I can whip up an example of this (if I remember how it works)... but damn, I have a lot of stuff piling up (holidays make me lazy as crap).
-
- Seen it all
- Posts: 2302
- Joined: Mon Jan 03, 2005 5:04 pm
- Location: Germany
- Contact:
byuu wrote:Dynamic frame selection added to the new version takes care of flickering already.
I had something in mind that emulates the afterglow of the TV's phosphor layer.
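Roughly something like this, just to pin down what I mean by blending with an adjustable ratio. Hypothetical names, CPU-side for clarity; a shader would do the same math per pixel:
Code: Select all
// Conceptual sketch of the frame-blend / afterglow idea. All names and the
// buffer layout (plain 8-bit channel bytes) are made up for illustration.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

void blend_with_previous(std::vector<uint8_t>& persist,      // kept across frames
                         const std::vector<uint8_t>& cur,    // current frame
                         std::vector<uint8_t>& out,
                         double ratio)                        // 1.0 = no afterglow
{
  persist.resize(cur.size(), 0);
  out.resize(cur.size());
  for (std::size_t i = 0; i < cur.size(); i++) {
    // Weighted mix of the new frame with whatever is still "glowing" from the
    // previous output; store the result back so it keeps decaying over time.
    double mixed = ratio * cur[i] + (1.0 - ratio) * persist[i];
    out[i] = static_cast<uint8_t>(std::min(255.0, mixed));
    persist[i] = out[i];
  }
}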
vSNES | Delphi 10 BPLs
bsnes launcher with recent files list
byuu wrote:Something akin to ZSNES and mencoder hooking would be doable. I wouldn't implement anything platform-specific, eg DirectShow or somesuch, to record AVIs. And writing my own AVI compressor, yeah, not happening.
Why not let the ZSNES team do it for you? They've done a good job putting one in ZSNES.
Window Vista Home Premium 32-bit / Intel Core 2 Quad Q6600 2.40Ghz / 3.00 GB RAM / Nvidia GeForce 8500 GT
Inconsistency:
-Video Driver, Audio Driver, Input Driver = Video driver, Audio driver, Input driver
Audio section:
-add 192000. Might as well if 96000 gets in there, and you will never get any frequency requests again
-"PC" and "SNES" are not necessary additions, they're both technically happening on the PC.
-results show that nobody has come close to needing a positive Input frequency adjust. Why not make the range -200 to 0?
-put a note below the last slider that says this:
Code: Select all
Note: When emulation speed is synced to both audio and video, a lower input frequency decreases the chance of audio crackle, but increases the chance of duplicate frames.
Lastly, let's create a hypothetical scenario using the current defaults. A user turns on sync to video and gets audio crackle. We know the 100ms latency won't be the cause of that; it will most likely be the -50 input setting. He doesn't know that. Since latency can be set higher and is the more familiar term, any rational person, even myself, would first assume that 100ms latency is too aggressive and test it higher. And higher. And higher. This is all a waste of time, and not something likely to get reported, even though we know it's probably going to happen. Meanwhile, present indications are that allowing 100-175ms helps perhaps no one. Of the costs and benefits incurred, the costs win out. The note helps, but it doesn't eliminate the problem. And no, this doesn't mean bringing latency into the note is a better solution; this one unfamiliar variable, responsible for all default crackling, should be the only thing users are overtly directed towards.
This will be my last explanation of the matter.
EDIT: clarified note