Code: Select all
<grinvader> Nach: do we have a model (picture/text, whatever as long as it describes properly) of the 5a22's 2-stage pipeline ?
<grinvader> or will i have to torture byuu for it ?
<grinvader> >:3
Code: Select all
<jmr> bsnes has the most accurate wiki page but it takes forever to load (or something)
Code: Select all
lda #$1234; cli
----------
1-stage approach w/ 2-stage simulation, by testing IRQ trigger one cycle early:
cycle X+0: fetch opcode $a9
cycle X+1: fetch operand lo $34
cycle X+2: perform IRQ trigger test; then fetch operand hi $12; then A = #$1234
cycle Y+0: if IRQ test passed, perform IRQ; otherwise fetch opcode $58
cycle Y+1: perform IRQ trigger test; then I = 0
cycle Z+0: if IRQ test passed, perform IRQ; otherwise fetch next opcode
----------
2-stage pipeline:
Note: work/bus cycle ordering does not matter; they happen at the exact same time
work cycle W+?: complete last cycle of previous opcode
bus cycle X+0: if(IRQ) fetch irq vector lo; else fetch next opcode $a9
work cycle X+0: idle
bus cycle X+1: fetch operand lo $34 to MDR
work cycle X+1: A.lo = MDR ($34)
bus cycle X+2: fetch operand hi $12 to MDR + test for IRQ
work cycle X+2: A.hi = MDR ($12)
bus cycle Y+0: if(IRQ) fetch irq vector lo; else fetch next opcode $58
work cycle Y+0: idle
bus cycle Y+1: test for IRQ
work cycle Y+1: I = 0
bus cycle Z+0: if(IRQ) fetch irq vector lo; else fetch next opcode
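The difference between the two models above can be checked with a toy simulation. This is only a sketch under stated assumptions: `Op`, `irq_boundary`, and the idea of holding the IRQ line asserted for the whole run are made up for illustration and are not taken from bsnes or ZSNES. It models nothing but the I flag and the point at which the IRQ test is sampled.

```cpp
#include <vector>

// Hypothetical opcode set; only the effect on the I flag is modelled.
enum class Op { LdaImm, Cli, Nop };

// Runs `program` with the IRQ line asserted the whole time. Returns k when the
// IRQ is serviced after program[k-1] (in place of the fetch of program[k]),
// or -1 if it is never serviced.
//   early_test = true  : the IRQ test is sampled before the last work cycle
//   early_test = false : naive model, the test is sampled after the opcode ends
int irq_boundary(const std::vector<Op>& program, bool i_flag, bool early_test) {
  for (int k = 0; k < static_cast<int>(program.size()); ++k) {
    // Early model: the test sees the I flag as it was before the last cycle.
    bool take_irq = !i_flag;
    // Last work cycle of the opcode: cli clears I here.
    if (program[k] == Op::Cli) i_flag = false;
    // Naive model: the test sees the flag after the opcode has completed.
    if (!early_test) take_irq = !i_flag;
    if (take_irq) return k + 1;
  }
  return -1;
}
```

With I set on entry and the program `lda #$1234; cli; nop; nop`, the early test services the IRQ after the first nop, while the naive test services it directly after cli. That one-opcode difference is the delay the traces describe.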
byuu wrote: I haven't written up any documentation on it, sadly.
Your descriptions are good documentation. Thanks muchly.
Quote: Your descriptions are good documentation. Thanks muchly.
I thought pagefault had already written the new cycle-level S-CPU core ... are you improving it, or starting on a different one? Just curious.
Quote: Yea... Sure, it's not clear, but that's not my goal. I might have a decent overall solution.
Not like all code has to be clear. I already break it down in more of a mark-up language. If people want a reference, they can use that.
byuu wrote: I'm very interested in your ideas. I've never come up with a good solution, other than sticking last_cycle() calls all over my CPU core.
That sounds like a good candidate for the good ol'
Code: Select all
#define } last_cycle(); }
Quote: That sounds like a good candidate for the good ol' #define } last_cycle(); }
Seems I wasn't clear. last_cycle() has to go before the last work cycle. Otherwise I'd just stick the call immediately after the opcode invocation. Take the worst case example where the last cycle changes based on the register size setting:
Code: Select all
case 0xa9: {
  if(regs.p.m) last_cycle();
  rd.l = op_readpc();
  if(regs.p.m) { op_lda_b(); break; } //flag calculation, end opcode
  last_cycle();
  rd.h = op_readpc();
  op_lda_w(); //flag calculation
} break;
Code: Select all
case 0x58: {
  last_cycle(); //this looks at the state of regs.p.i ...
  op_io_irq(); //I/O cycle that becomes a read cycle if IRQ triggers*
  regs.p.i = 0; //... which is cleared *after* the check
} break;
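To make the call ordering in the two cases above easier to see, here is a small mock that records events in the order they happen. It is a sketch only: `MockCpu`, its string log, and the `read()`/`io()` stubs are hypothetical scaffolding, and only the order of calls mirrors the quoted snippets.

```cpp
#include <string>
#include <vector>

struct MockCpu {
  bool m = false;  // accumulator width flag: true = 8-bit
  bool i = true;   // IRQ disable flag
  std::vector<std::string> log;

  // Records what the IRQ test would see at the moment it is sampled.
  void last_cycle() { log.push_back(i ? "test(I=1)" : "test(I=0)"); }
  void read() { log.push_back("read"); }
  void io() { log.push_back("io"); }

  // Same shape as case 0xa9: the test precedes whichever read is the last one.
  void lda_imm() {
    if (m) last_cycle();
    read();
    if (m) return;
    last_cycle();
    read();
  }

  // Same shape as case 0x58: the test is sampled before I is cleared.
  void cli() {
    last_cycle();
    io();
    i = false;
    log.push_back("I=0");
  }
};
```

In 8-bit mode the log for `lda_imm()` is test, read; in 16-bit mode it is read, test, read. For `cli()` the test is logged with I still set, which is why the flag change cannot affect the IRQ decision for that opcode.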
Code: Select all
//immediate, 2-cycle opcodes with I/O cycle will become bus read
//when an IRQ is to be triggered immediately after opcode completion
//this affects the following opcodes:
// clc, cld, cli, clv, sec, sed, sei,
// tax, tay, txa, txy, tya, tyx,
// tcd, tcs, tdc, tsc, tsx, tcs,
// inc, inx, iny, dec, dex, dey,
// asl, lsr, rol, ror, nop, xce.
alwaysinline void sCPU::op_io_irq() {
  if(event.irq) {
    //IRQ pending, modify I/O cycle to bus read cycle, do not increment PC
    op_read(regs.pc.d);
  } else {
    op_io();
  }
}
byuu wrote: It gets even more fun if you want to support the bus hold delays that can be observed from the S-PPU and S-SMP. Need to split every read access into two state table entries each ;)
You wouldn't happen to have measured these in ppu ticks or some other workable time unit, would you ? ^^
Quote: I thought pagefault had already written the new cycle-level S-CPU core ... are you improving it, or starting on a different one? Just curious.
Different. We all have our little favourite quirks and goals, after all. Oh, I did omit to mention this is not for ZSNES at all, which could lead to confusion.
Quote: I'm very interested in your ideas. I've never come up with a good solution, other than sticking last_cycle() calls all over my CPU core. Please let me know once you have something written up.
Well actually doing the 2-stage pipeline helps tremendously there, hehe.
Nach wrote: That sounds like a good candidate for the good ol'
Code: Select all
#define } last_cycle(); }
Of course probably have to make that work with other logic nests being used...
blargh, i hate that sort of stuff.
grinvader wrote: [Nach wrote: That sounds like a good candidate for the good ol' #define } last_cycle(); } Of course probably have to make that work with other logic nests being used...] blargh, i hate that sort of stuff. Don't use a language's power to maim its own syntax (i'm looking at you, operator overloaders) or hide code, please.
No, but the point is, you can create a different little language in between which can simplify things.
Code: Select all
#define START_OPCODE { opcode_init();
#define END_OPCODE opcode_cleanup(); }
#define IF(x) if (x) { reinit_flags();
#define ELSE } else { save_flags();
#define ENDIF }
Code: Select all
START_OPCODE
  IF (ready())
    begin_countdown();
  ELSE
    delay();
  ENDIF
  process_current_state();
END_OPCODE
Nach wrote: If you have a situation where you need the same operations done at the exact same state in each function, you can clean up the base logic with an intermediary language, and focus on where the functions differ, instead of littering the main code with whatever overhead you need to emulate something else.
Or you can spend time doing awesome instead of writing new languages
Code: Select all
returnvaltype look_ma_one_func_only(void (*f1)(), void (*f2)(), void (*f3)())
{
  opcode_init();
  if (x)
  {
    reinit_flags();
    f1(); // hahahahahaha
  }
  else
  {
    save_flags();
    f2();
  }
  f3();
  opcode_cleanup();
}
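For anyone who wants to try the function-pointer variant, below is one way it could be made to compile. It is a sketch under assumptions: the condition `x` from the snippet becomes an explicit `ready` parameter, the return type is dropped, and every hook is a hypothetical stub that appends one letter to a trace string so the call order can be inspected.

```cpp
#include <string>

// Hypothetical stand-ins for the hooks named in the posts above.
static std::string trace;
static void opcode_init()    { trace += 'I'; }
static void reinit_flags()   { trace += 'R'; }
static void save_flags()     { trace += 'S'; }
static void opcode_cleanup() { trace += 'C'; }

// The parts that differ between opcodes, also stubs.
static void begin_countdown()       { trace += 'b'; }
static void delay()                 { trace += 'd'; }
static void process_current_state() { trace += 'p'; }

using Step = void (*)();

// Shared skeleton: the fixed overhead lives here, the varying parts are passed in.
void run_opcode(bool ready, Step on_ready, Step on_wait, Step always) {
  opcode_init();
  if (ready) {
    reinit_flags();
    on_ready();
  } else {
    save_flags();
    on_wait();
  }
  always();
  opcode_cleanup();
}
```

Calling `run_opcode(true, begin_countdown, delay, process_current_state)` leaves `IRbpC` in the trace, and the same call with `false` leaves `ISdpC`. The macro version produces the same sequence of calls; the difference is that here the shared overhead is ordinary code rather than hidden in the preprocessor.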
Quote: You wouldn't happen to have measured these in ppu ticks or some other workable time unit, would you ? ^^
I based them upon observations of the delay to latch counters through $2137 reads (2 clocks of 6) and writes to $4201 (6 clocks of 6).
Quote: Oh, I did omit to mention this is not for ZSNES at all, which could lead to confusion.
Share plz. PM is fine. kthxbai.
Quote: (i'm looking at you, operator overloaders)
Operator overloads are great, so long as you use them responsibly.
Code: Select all
if(!strcmp(config::driver.video.cstr(), "direct3d"))
if(config::driver.video == "direct3d")
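The second line only works if the string type provides a comparison operator. A minimal sketch of what such a wrapper might look like follows; `Str` is a hypothetical class written for this example and is not the string class bsnes uses.

```cpp
#include <cstring>

// Hypothetical non-owning string wrapper; the caller keeps the storage alive.
class Str {
public:
  explicit Str(const char* s) : s_(s) {}
  const char* cstr() const { return s_; }

  // Compares contents rather than pointers, so == reads like the strcmp form.
  bool operator==(const char* rhs) const { return std::strcmp(s_, rhs) == 0; }
  bool operator!=(const char* rhs) const { return !(*this == rhs); }

private:
  const char* s_;
};
```

Given `Str video("direct3d");`, both `!std::strcmp(video.cstr(), "direct3d")` and `video == "direct3d"` are true. The overload does not change what is compared, only how the comparison is spelled.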
Quote: If you have a situation where you need the same operations done at the exact same state in each function, you can clean up the base logic with an intermediary language, and focus on where the functions differ, instead of littering the main code with whatever overhead you need to emulate something else.
If the C preprocessor wasn't such a horribly useless piece of shit, perhaps.
Quote: Or you can spend time doing awesome instead of writing new languages
Damn C people :P
Quote: But now I got the pincers and spoons ready for nothing, geh.
On the bright side, you have them ready for some ailurophagy.
Quote: You're not supposed to mention function pointers in the forum :evil: That's a programming tool reserved to us experts of ZSNES.
Oh? I didn't realize you accepted the Christ Public License V2 to use those ... if you didn't, you'll be hearing from Jesus Stallman soon.
byuu wrote: If the C preprocessor wasn't such a horribly useless piece of shit, perhaps. As it stands, I'd rather write in a DSL and chain a translator at compile-time.
That sounds like something I would do...
byuu wrote: thus:
FastROM: hold 2, wait 4 = 6
SlowROM: hold 4, wait 4 = 8
XSlowROM: hold 8, wait 4 = 12
FastROM write: hold 6
SlowROM write: hold 8
XSlowROM write: hold 12
Thanks.
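The hold/wait figures quoted above fit in a small lookup table. The sketch below only transcribes those numbers; the names `Speed`, `Access`, `read_access` and `write_access` are invented here, and treating a write as "hold N, wait 0" is an assumption about how to read the list, not something the posts state.

```cpp
// Hypothetical names; the values are the ones listed in the quoted post.
enum class Speed { Fast, Slow, XSlow };

struct Access {
  int hold;  // clocks before the access is observed on the bus
  int wait;  // remaining clocks of the cycle
  int total() const { return hold + wait; }
};

inline Access read_access(Speed s) {
  switch (s) {
    case Speed::Fast: return {2, 4};
    case Speed::Slow: return {4, 4};
    default:          return {8, 4};  // Speed::XSlow
  }
}

// Assumption: writes are modelled as pure hold with no trailing wait.
inline Access write_access(Speed s) {
  switch (s) {
    case Speed::Fast: return {6, 0};
    case Speed::Slow: return {8, 0};
    default:          return {12, 0};  // Speed::XSlow
  }
}
```

Under that reading, the total for a write equals the total for a read at the same speed (6, 8 and 12 clocks); only the point inside the cycle where the access lands differs.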
Quote: Share plz. PM is fine. kthxbai.
Nothing to share. Cannot comply.
Code: Select all
//immediate, 2-cycle opcodes with I/O cycle will become bus read
//when an IRQ is to be triggered immediately after opcode completion
//this affects the following opcodes:
// clc, cld, cli, clv, sec, sed, sei,
// tax, tay, txa, txy, tya, tyx,
// tcd, tcs, tdc, tsc, tsx, tcs,
// inc, inx, iny, dec, dex, dey,
// asl, lsr, rol, ror, nop, xce.
Issue: logic dictates that txs (0x9a) belongs in this list. Is it an exception ?
Quote: Nothing to share. Cannot comply.
Seriously, what is it with ZSNES devs and secrecy? It's very rude :P
grinvader wrote: Issue: logic dictates that txs (0x9a) belongs in this list. Is it an exception ? Also, the behaviour of XBA would be interesting, given it's another 'implied' opcode yet 3 cycles.
XBA works as expected, this doesn't apply.
Nach wrote: That's a programming tool reserved to us experts of ZSNES.
and neither byuu or grinvader are ZSNES experts?
adventure_of_link wrote: [Nach wrote: That's a programming tool reserved to us experts of ZSNES.] and neither byuu or grinvader are ZSNES experts?
grinvader is indeed a ZSNES expert, that doesn't mean he should give away our trade secrets on the forum though.
Nach wrote: members who have altered a significant % of their brain tissue.
fixed
Nach wrote: byuu sadly is not a "ZSNES Expert", he may be an expert in using ZSNES, but he is not a member of the ZSNES Expert Stronghold Alliance, members who have altered a significant % of ZSNES' codebase.
That's gratitude for you. I've submitted hundreds of kilobytes worth of code for the massive debugger upgrade, which you can see here (my contribution in red):
grinvader wrote: [Nach wrote: members who have altered a significant % of their brain tissue.] fixed
That sounded like it were reserved for some only (well, actually from other point of view - true).
byuu wrote: That's gratitude for you. I've submitted hundreds of kilobytes worth of code
Tiny drop in the sea?
Quote: It seems more sugar can give me access to an opcode-based solution. Will look into it.
Here is what I came up with in my old opcode based CPU emulator in TE to simulate pipeline and I haven't seen it fail anywhere yet:
Code: Select all
void cli(void)
{
  if( status & s_i )
  {
    status &= ~s_i;
    if(!irq_pending)
      irq_pending = 2;
  }
}
.......
void cpu_emulate()
{
  while(cycles > 0)
  {
    // execute instruction
    jump_table[read_op()]();
    if ( irq_pending )
    {
      if ( irq_pending == 1 )
      {
        if ( !(status & s_i) )
        {
          irq_pending--;
          take_irq();
        }
      } else
      {
        irq_pending--;
      }
    }
  }
}
byuu wrote: [Nach wrote: byuu sadly is not a "ZSNES Expert", he may be an expert in using ZSNES, but he is not a member of the ZSNES Expert Stronghold Alliance, members who have altered a significant % of ZSNES' codebase.] That's gratitude for you. I've submitted hundreds of kilobytes worth of code for the massive debugger upgrade, which you can see here (my contribution in red):
Then there was the several months I spent laboring -- researching and rewriting the InitSNES2 function from ASM to C ...
And that's to say nothing of my complete rewrite of the ZSNES font interface to support variable-width letters that was rejected. As seen below:
Hate to necrobump, but that picture is one of the most awesome things I've seen on this forum.
byuu wrote: I haven't written up any documentation on it, sadly.
I posted something about this some time ago... (Ah. Here it is.)
The basic idea is that there are bus cycles and work cycles. Bus cycles read from and write to memory; and work cycles do things like modify registers, set flags, etc.
Basically, the SNES performs bus cycle N while performing work cycle N-1. The latter is always one behind. Makes sense ... do stuff with data you have read while you wait on future data to be read.
(Pretend there's a cheesy picture of a conveyor belt here with the bus section handing things to the work section)
It's only noticeable due to the race conditions exposed through opcodes like cli that modify I on the very last cycle. If the processor were one-stage, the I flag clear would have an effect on the IRQ trigger test. But since it's two-stage, it appears to be "delayed" by one opcode (really one cycle, but as it's the last cycle, it effectively pushes it forward one whole opcode before another IRQ can possibly trigger.)
...
Code: Select all
What really happens...
InsnA-Mem1
InsnA-Mem2   InsnA-Exec1
InsnA-Mem3   InsnA-Exec2
InsnB-Mem1   InsnA-Exec3  <-- the overlap cycle
InsnB-Mem2   InsnB-Exec1
...          InsnB-Exec2
Code: Select all
What actually happens on    |  However, we find it convenient
hardware looks like this:   |  to emulate it like this:
                            |
...                         |  ...
InsnA-Mem1                  |  --------------
--------------              |  InsnA-Mem1
InsnA-Exec1  <-- (same      |  InsnA-Exec1
InsnA-Mem2   <--  time)     |  --------------
--------------              |  InsnA-Mem2
InsnA-Exec2                 |  InsnA-Exec2
InsnA-Mem3                  |  --------------
--------------              |  InsnA-Mem3
InsnA-Exec3                 |  InsnA-Exec3
InsnB-Mem1                  |  --------------
--------------              |  InsnB-Mem1
InsnB-Exec1                 |  InsnB-Exec1
InsnB-Mem2                  |  --------------
--------------              |  InsnB-Mem2
InsnB-Exec2                 |  InsnB-Exec2
...                         |  --------------
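The overlap shown in the diagrams above can be generated mechanically: memory step t shares a cycle with exec step t-1. The sketch below only builds the row labels and performs no emulation; `Insn`, `labels` and `hardware_rows` are names invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of an instruction: a name and its number of steps.
struct Insn {
  std::string name;
  int steps;
};

// Flattens a program into labels such as "InsnA-Mem1", "InsnA-Mem2", ...
static std::vector<std::string> labels(const std::vector<Insn>& prog,
                                       const std::string& stage) {
  std::vector<std::string> out;
  for (const Insn& in : prog)
    for (int n = 1; n <= in.steps; ++n)
      out.push_back(in.name + "-" + stage + std::to_string(n));
  return out;
}

// Row t pairs memory step t with exec step t-1; an empty string means idle.
std::vector<std::pair<std::string, std::string>> hardware_rows(
    const std::vector<Insn>& prog) {
  const std::vector<std::string> mem = labels(prog, "Mem");
  const std::vector<std::string> exec = labels(prog, "Exec");
  std::vector<std::pair<std::string, std::string>> rows;
  for (std::size_t t = 0; t <= mem.size(); ++t) {
    std::string m = t < mem.size() ? mem[t] : "";
    std::string e = t > 0 ? exec[t - 1] : "";
    rows.emplace_back(m, e);
  }
  return rows;
}
```

For a three-step InsnA followed by a two-step InsnB, the fourth row pairs InsnB-Mem1 with InsnA-Exec3, which is the overlap cycle marked in the first diagram.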