Hi, Finalfighter and I just discovered what happens in Rockman 2 when it scrolls the screen the wrong way: scrolling down when the screen is supposed to scroll only horizontally, or scrolling to the right when the screen is supposed to scroll only vertically.
As far as we knew, it happens when lag occurs at the exact moment when Megaman is at the bottom edge of the screen, about to plunge to his death.
When there is enough lag, the screen scrolls down instead of Megaman simply dying.
Until now, we had not been able to figure out why it does not always happen when lag occurs: you could have dozens of enemies and Item1s on the screen, the game would be running at half of its normal speed, and yet Megaman would die at the bottom of the screen.
Here's what happens normally when Megaman is at the bottom of the screen.
1) The game loads CurrentOrderNum (memory address $38), which indicates the index into the level data which determines the layout of the rooms in the stage.
2) The game reprograms the mapper in order to access the memory bank in which the level data is located.
3) The game reads the value of level data indicated by CurrentOrderNum.
4) If the value loaded equals 3, then the screen may scroll down. Otherwise Megaman dies.
Here's what happens when the screen scrolls unexpectedly.
1) The same as above. The value is placed in register Y.
2) The same as above. However, during the mapper reprogramming, the NMI occurs.
3-4) The same as above.
Why does this change the circumstances?
Let us observe more closely what happens:
2a) The mapper reprogramming routine sets up a flag indicating that it is reprogramming the mapper.
2b) The routine starts reprogramming the mapper.
2c) The NMI occurs. Because the mapper was just in the process of being reprogrammed, the NMI does not change the mapper (for it's not re-entrant), and returns very quickly. The NMI sets up a flag indicating that the sound code was NOT run. (For the sound code is located in another bank and to run it, the mapper would need to be reprogrammed.)
2d) The mapper reprogramming routine notices that the sound code was not run, and therefore chooses bank 12 and runs the sound code. Then it chooses the requested bank again.
2e) The mapper reprogrammer returns, and code continues from step 3 as above.
Somewhere here, the value loaded in Y (indicating CurrentOrderNum) changed.
The thing is, when the mapper reprogrammer runs the sound code (that is normally run by the NMI), it does not restore the register contents. The sound code clobbers the value of Y, and it does not restore it.
So when the code continues from step 3, it has the wrong value in register Y. When the level data is accessed, it reads different room data than it normally would.
This is the core of the error.
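To make the sequence concrete, here is a rough Python model of steps 1-3 and 2a-2e. It is only a sketch, not the game's code: the names (Machine, nmi, switch_bank, read_room_value) are made up for illustration, the sound code is reduced to "overwrites Y", the bank number passed in is a placeholder, and I am assuming that the normal NMI path saves and restores registers. The preserve_y switch is hypothetical and only shows what restoring Y would change.

```python
SOUND_BANK = 0x0C  # bank 12, as selected at $C022 in the disassembly


class Machine:
    """Toy state: only the pieces the explanation mentions."""

    def __init__(self):
        self.y = 0                    # Y register
        self.bank = None              # currently selected bank
        self.mapper_programming = 0   # flag at $68
        self.sound_code_skipped = 0   # flag at $67


def sound_code(m):
    # Stand-in for the real sound engine: all we model is that it
    # overwrites Y. 0x1C is the most commonly observed leftover value.
    m.y = 0x1C


def nmi(m):
    if m.mapper_programming:
        # Mapper write in progress: cannot switch banks, so skip the
        # sound code and leave a note for SwitchBank.
        m.sound_code_skipped = 1
        return
    saved_y = m.y      # assumption: the normal NMI path preserves registers
    sound_code(m)
    m.y = saved_y


def switch_bank(m, bank, nmi_inside=False, preserve_y=False):
    m.mapper_programming += 1         # step 2a
    if nmi_inside:
        nmi(m)                        # step 2c: NMI lands mid-write
    m.bank = bank                     # step 2b, collapsed into one write
    m.mapper_programming = 0
    if m.sound_code_skipped:          # step 2d: catch up on the sound code
        saved_y = m.y
        m.bank = SOUND_BANK
        sound_code(m)
        if preserve_y:                # hypothetical fix, not in the game
            m.y = saved_y
        m.sound_code_skipped = 0
        m.bank = bank                 # re-select the requested bank


def read_room_value(level_data, order_num, nmi_inside, preserve_y=False):
    m = Machine()
    m.y = order_num                             # step 1
    switch_bank(m, 0, nmi_inside, preserve_y)   # step 2 (bank 0 is a placeholder)
    return level_data[m.y]                      # step 3


# Room table of map 0, copied from the hex dump in this post.
MAP0 = bytes.fromhex("47 40 4a 40 20 20 00 88 80 80 80 80 80 04 00 00" + " ff" * 16)

normal = read_room_value(MAP0, 3, nmi_inside=False)
glitched = read_room_value(MAP0, 3, nmi_inside=True)
patched = read_room_value(MAP0, 3, nmi_inside=True, preserve_y=True)
```

In this model, normal and patched read the entry for room 3, while glitched reads the entry at offset 1C instead.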
Why does it happen? Well, during QA, the game testers probably noticed that in some circumstances, when the game slows down, the music also slows down. The music slows down because the music code, which is normally run by the NMI, cannot be run while the mapper is being reprogrammed. Since music slowing down is a sign of a Badly Programmed Game, they added extra code in the mapper reprogrammer to run the sound code if the NMI skipped it. However, they failed to restore the register contents when the sound code is run in this exceptional way.
So that is how the trick works.
So how do we achieve it?
Turns out, it is exactly as difficult as we have observed.
Here is a disassembly of the mapper reprogramming routine:
SwitchBank:
[1E]C000: 85 29 sta DesiredMapperPage [$0029] A=0E X=2A Y=00 S=F9 P=24
[1E]C002: 85 69 sta CurrentMapperPage [$0069] A=0E X=2A Y=00 S=F9 P=24
[1E]C004: E6 68 inc MapperProgramming [$0068] A=0E X=2A Y=00 S=F9 P=24
[1E]C006: 8D F0 FF sta $FFF0 [$FFF0] A=0E X=2A Y=00 S=F9 P=24
[1E]C009: 4A lsr A=0E X=2A Y=00 S=F9 P=24
[1E]C00A: 8D F0 FF sta $FFF0 [$FFF0] A=07 X=2A Y=00 S=F9 P=24
[1E]C00D: 4A lsr A=07 X=2A Y=00 S=F9 P=24
[1E]C00E: 8D F0 FF sta $FFF0 [$FFF0] A=03 X=2A Y=00 S=F9 P=25
[1E]C011: 4A lsr A=03 X=2A Y=00 S=F9 P=25
[1E]C012: 8D F0 FF sta $FFF0 [$FFF0] A=01 X=2A Y=00 S=F9 P=25
[1E]C015: 4A lsr A=01 X=2A Y=00 S=F9 P=25
[1E]C016: 8D F0 FF sta $FFF0 [$FFF0] A=00 X=2A Y=00 S=F9 P=27
[1E]C019: A9 00 lda #$00 A=00 X=2A Y=00 S=F9 P=27
[1E]C01B: 85 68 sta MapperProgramming [$0068] A=00 X=2A Y=00 S=F9 P=27
[1E]C01D: A5 67 lda SoundCodeSkipped [$0067] A=00 X=2A Y=00 S=F9 P=27
[1E]C01F: D0 01 bne $C022 A=00 X=2A Y=00 S=F9 P=27
[1E]C021: 60 rts <-- this branch is normally taken
[1E]C022: A9 0C lda #$0C <-- this branch is taken when the glitch happens
[1E]C024: 8D F0 FF sta $FFF0 [$FFF0] A=0C X=2A Y=0E S=F9 P=24
[1E]C027: 4A lsr A=0C X=2A Y=0E S=F9 P=24
[1E]C028: 8D F0 FF sta $FFF0 [$FFF0] A=06 X=2A Y=0E S=F9 P=24
[1E]C02B: 4A lsr A=06 X=2A Y=0E S=F9 P=24
[1E]C02C: 8D F0 FF sta $FFF0 [$FFF0] A=03 X=2A Y=0E S=F9 P=24
[1E]C02F: 4A lsr A=03 X=2A Y=0E S=F9 P=24
[1E]C030: 8D F0 FF sta $FFF0 [$FFF0] A=01 X=2A Y=0E S=F9 P=25
[1E]C033: 4A lsr A=01 X=2A Y=0E S=F9 P=25
[1E]C034: 8D F0 FF sta $FFF0 [$FFF0] A=00 X=2A Y=0E S=F9 P=27
[1E]C037: 20 00 80 jsr SoundCode <-- clobbers the value of Y
[1E]C03A: A6 66 ldx $66 [$0066] A=00 X=0E Y=1C S=F9 P=26
[1E]C03C: F0 0A beq $C048 A=00 X=00 Y=1C S=F9 P=26
[1E]C048: A9 00 lda #$00 A=00 X=00 Y=1C S=F9 P=26
[1E]C04A: 85 67 sta $67 [$0067] A=00 X=00 Y=1C S=F9 P=26
[1E]C04C: A5 69 lda CurrentMapperPage [$0069] A=00 X=00 Y=1C S=F9 P=26
[1E]C04E: 4C 00 C0 jmp SwitchBank [$C000] A=00 X=00 Y=1C S=F9 P=26
The window of opportunity is narrow: the NMI must occur within a 10-instruction window, addresses C006 .. C019.
If the NMI occurs when PC ≤ C004, or PC ≥ C01B, the glitch won't occur.
Incidentally, this is exactly 25 instructions before the main loop returns to waiting for the next NMI.
So it requires a very small lag: 25..34 instructions too many for the CPU to execute during the current frame.
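The size of the window can be double-checked against the listing. The following Python sketch simply copies the instruction addresses of the first part of SwitchBank from the disassembly above and counts how many fall inside C006 .. C019; the helper name nmi_triggers_glitch is my own, and "PC" is taken the same way as in the text above.

```python
# Instruction start addresses of SwitchBank up to the rts, copied from
# the disassembly in this post.
SWITCH_BANK = [
    (0xC000, "sta DesiredMapperPage"),
    (0xC002, "sta CurrentMapperPage"),
    (0xC004, "inc MapperProgramming"),
    (0xC006, "sta $FFF0"),
    (0xC009, "lsr"),
    (0xC00A, "sta $FFF0"),
    (0xC00D, "lsr"),
    (0xC00E, "sta $FFF0"),
    (0xC011, "lsr"),
    (0xC012, "sta $FFF0"),
    (0xC015, "lsr"),
    (0xC016, "sta $FFF0"),
    (0xC019, "lda #$00"),
    (0xC01B, "sta MapperProgramming"),
    (0xC01D, "lda SoundCodeSkipped"),
    (0xC01F, "bne $C022"),
    (0xC021, "rts"),
]


def nmi_triggers_glitch(pc):
    # Hypothetical helper: True when the NMI arrives inside the window
    # in which the MapperProgramming flag is still set.
    return 0xC006 <= pc <= 0xC019


window = [addr for addr, _ in SWITCH_BANK if nmi_triggers_glitch(addr)]
print(len(window), "instructions:", " ".join(f"{a:04X}" for a in window))
```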
The values that the sound code may leave in the Y register seem to include:
* Predominantly 1C.
* Value 01 every 10th frame or so.
* Other values such as 09, 0B, 0F at different moments (sound effects seem to play a role in this)
1C being the most common value is good news. Let's look at the level data:
Map 0
00003400 47 40 4a 40 20 20 00 88 80 80 80 80 80 04 00 00 |G@J@  ..........|
00003410 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
Map 1
00007400 49 40 28 20 00 45 40 40 40 45 40 40 00 00 00 ff |I@( .E@@@E@@....|
00007410 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
Map 2
0000b400 44 40 20 20 20 80 80 44 40 40 40 22 20 00 40 40 |D@   ..D@@@" .@@|
0000b410 40 44 40 40 40 40 22 00 00 00 ff ff ff ff ff ff |@D@@@@".........|
Map 3
0000f400 44 40 40 40 26 24 20 00 80 80 80 80 80 80 41 40 |D@@@&$ .......A@|
0000f410 40 40 40 23 00 00 00 ff ff ff ff ff ff ff ff ff |@@@#............|
Map 4
00013400 40 40 40 40 40 40 40 44 40 40 40 40 40 40 40 22 |@@@@@@@D@@@@@@@"|
00013410 20 00 00 00 00 00 00 ff ff ff ff ff ff ff ff ff | ...............|
Map 5
00017400 46 40 40 40 40 40 40 24 20 00 40 40 40 25 00 00 |F@@@@@@$ .@@@%..|
00017410 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
Map 6
0001b400 49 40 28 20 00 00 00 ff ff ff ff ff ff ff ff ff |I@( ............|
0001b410 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
Map 7
0001f400 80 80 81 80 80 80 81 80 80 80 80 80 80 80 21 20 |..............! |
0001f410 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
These are the tables indexed with the Y register value at step 3. There are 8 tables, for 8 stages. (The Wily stages reuse the tables for the 8 basic stages.)
As you can see, in each of these tables, the value at offset 1C equals FF. Why is FF good?
Because it is a bitmask indicating that the room can scroll in any direction: left, up, right (shutter), and down.
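The claim about offset 1C is easy to verify mechanically. Here is a small Python sketch that rebuilds the eight tables from the hex dumps above and looks up a given offset in each of them; the names TABLES and values_at are mine, and the data is only the 32 bytes per map shown in this post.

```python
FF_ROW = "ff " * 16  # a row consisting only of FF bytes

# First 32 bytes of each room table, copied from the hex dumps above.
TABLES = [
    bytes.fromhex("47 40 4a 40 20 20 00 88 80 80 80 80 80 04 00 00 " + FF_ROW),
    bytes.fromhex("49 40 28 20 00 45 40 40 40 45 40 40 00 00 00 ff " + FF_ROW),
    bytes.fromhex("44 40 20 20 20 80 80 44 40 40 40 22 20 00 40 40 "
                  "40 44 40 40 40 40 22 00 00 00 ff ff ff ff ff ff"),
    bytes.fromhex("44 40 40 40 26 24 20 00 80 80 80 80 80 80 41 40 "
                  "40 40 40 23 00 00 00 ff ff ff ff ff ff ff ff ff"),
    bytes.fromhex("40 40 40 40 40 40 40 44 40 40 40 40 40 40 40 22 "
                  "20 00 00 00 00 00 00 ff ff ff ff ff ff ff ff ff"),
    bytes.fromhex("46 40 40 40 40 40 40 24 20 00 40 40 40 25 00 00 "
                  "00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff"),
    bytes.fromhex("49 40 28 20 00 00 00 ff ff ff ff ff ff ff ff ff " + FF_ROW),
    bytes.fromhex("80 80 81 80 80 80 81 80 80 80 80 80 80 80 21 20 "
                  "00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff"),
]


def values_at(offset):
    """Return the byte each map's table holds at the given Y offset."""
    return [table[offset] for table in TABLES]


for y in (0x1C, 0x01):
    print(f"Y={y:02X}:", " ".join(f"{v:02X}" for v in values_at(y)))
```

The same helper can be pointed at the other Y values the sound code leaves behind, to see which room entry each of them would select.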
Okay, so how to reliably make the scrolling happen? I have no idea. Everything can affect it -- including the playing position in the music!
We know that
1) Although Mega Man 1 uses the same sound engine as Mega Man 2, this bug does not happen in Mega Man 1, because Mega Man 1 uses a mapper that can be reprogrammed in an atomic (uninterruptible) manner, and therefore there's no need to run sound code in non-NMI context. (It is unknown whether MM3-6 are vulnerable, but since they use a radically different codebase, probably not.)
2) This glitch is not limited to scrolling. In fact, any time the SwitchBank routine is invoked by code that does not expect the Y register value to change, something weird can happen, if only the NMI occurs at the right moment!