This movie uses the same exploit used for the game end glitch to achieve total control and do some fun stuff. As I submit, dwangoAC is showing a slightly modified version on console for Awesome Games Done Quick 2016 [dead link removed]. Watch the movie before continuing if you don't want spoilers!

Goals

  • Colors a dinosaur
  • 1st TAS to get 99 lives for both Mario and Luigi
  • There were no other goals
I discovered how to achieve total control of SMB3 shortly after Tompa and I published the game end glitch version in 2014. I had it in mind to do a playaround movie integrating new custom powers with regular SMB3 gameplay.
A major challenge was the fact that almost all of my programming experience was in higher level languages like Java -- I knew next to nothing about assembly language or the NES architecture: terms like PPU, nametable and bank switching meant nothing to me.
Thanks to the many excellent web resources available for NES programmers, I worked my way up the learning curve and implemented increasingly sophisticated pieces of new functionality. Along the way, true encouraged me to tweak things so they'd verify on console, a process which required some rewrites to use the PRG1 ROM instead of PRG0.
I had a lot of code written, but no real plan as of Oct 2015 when dwangoAC contacted me about including SMB3 total control in Awesome Games Done Quick 2016. Having a deadline to work against forced me to focus and finally get this done.

Powers

When I first thought about doing this run, I was intrigued by the idea of pairing custom/augmented gameplay features with ordinary gameplay, so that was the starting point for much of what I ended up implementing.

Back Door

I thought it'd be fun to pretend that SMB3's legendary lead developer, Shigeru Miyamoto, had left an intentional "back door" hidden in the game which we had only just now discovered. Presented as an 80's style shell interface, the back door pretends to allow commands (with options) to be entered to enable the otherwise impossible happenings that follow.

Color a Dinosaur

Tasvideos.org has a bit of a running joke about the coloring of dinosaurs, ever since a couple of April Fool's submissions for an actual NES game called 'Color a Dinosaur' a few years back. SMB3 is widely felt to be the greatest NES game of all time, so I thought its first task under total control should be an homage to a game of more modest fame.
All of the coloring actions – change color, move the cursor, draw, and start over can be done by a human in realtime if anyone wants to try to top my masterpiece.

Speed Boost/Shine Spark

I'm a big fan of SNES Super Metroid and both its real time speedruns and TASs. I though it would be cool to create a speed mechanic for SMB3 as much as possible like how the speed boots/shine spark work in Super Metroid. This effect required several hundred bytes of code, hence the prolonged loading time, and was technically the most difficult to implement. It is quite easy to play around with in realtime, especially on levels like 2-4 that are big both horizontally and vertically.

Suit-Swap

The suit swap is a very simple but versatile bit of code that gives Mario a new suit based on the game's timer. As the player has no way of knowing the current timer value, the effect in realtime play is to choose a random suit. Under TAS conditions, of course, I can choose whichever suit I want based on which frame I activate the effect.
Of note, there are only 7 legal suits, but the Suit Swap rolls over every 8th frame. Assigning the 8th suit gives the surprising result of Mario's sprites being a complete jumble when standing, invisible when ducking, and the game physics treating him as if he's swimming! I use it a couple of times in this movie just to show the bizarre behavior.

Shell Shield

The shell shield wasn't based on any particular inspiration; I just thought it'd be cool. It was pretty easy to write, and involves using one of the 5 enemy slots for an invincible koopa shell that follows Mario around, killing enemies and laying waste to destructible blocks. The shell shield can be activated/deactivated with the select button. When deactivated, the shell becomes an ordinary, destructible koopa shell that can be stomped and grabbed if desired…though it can also hit you so you need to carefully time when you release it! It's my favorite power to play around with in realtime.
Mario will stomp the shell shield when falling, allowing him to essentially fly if he wishes by holding the a/jump button while in the air to maximize the boost achieved with each stomp. This was an actual case where a bug turned out to secretly be a feature – I did not intend this behavior, but liked the flexibility it provides so much that I left it in.

PwnScroll

Auto-scrolling levels are a dominant feature of most every SMB3 realtime speedrun and TAS that doesn't use the game end glitch. Autoscrollers can be frustrating to TAS because they're so long, there's little if anything you can do to save time, and it's difficult to think of entertaining ways to spend all that time.
The World 8 autoscrollers have a special place in my heart due to being difficult (for me, at least) in realtime play, and necessary for all non-glitched SMB3 TASs (since even with warps you still have to do world 8).
That made it a priority to imagine what the game might look like if Mario could control when and where the screen scrolled, or if the autoscrollers functioned as regular levels. Both are demonstrated here.
Watching Mario jump past the tank at full running speed may not look all that interesting to those less familiar with the game, but for someone who's spent hours watching those tanks scroll slooooowly by, there's something magical in it! (It might be worth querying Mitch or another realtime player on their thoughts on the autoscrollers and their pwnage).
The autoscroller functions can be used in realtime, but it's easy to crash the game or accidentally die. Disabling autoscrolling at the beginning of a stage seems pretty safe, at least for 8-Tank!

Anti-Gravity

This power was somewhat inspired by something you can do in NES Battle of Olympus to flip upside down and walk on the ceiling. I debated for a long time which level would best showcase this, and ended up on Bowser's Castle because it's prominently vertical and I found a surprising shortcut it enables – clipping into the wall as small Mario running upside down just above the lava. Because of the size and complexity of real collision detection in SMB3, this power is a total hack and not usable in realtime, though I have to say I like how it looks in the TASs I've done with it.

Bowser

I've played around with Bowser during previous TASs, doing things like fighting him with star invincibility active (he dies, but the game hangs). I was pleasantly surprised to see that the shell shield kills him with a single blow and actually proceeds to the ending (though in the interest of time I manually jump to the ending several seconds early).

Ending

Pretty much every TAS I've watched on YouTube has commenters denouncing videos clearly labeled as tool-assisted as fakes, cheats, etc, so I thought it'd be funny if Mario had to face a similarly skeptical Princess.
As a side note, the Princess's normal text here is hard-coded in the ROM. Because of how total control is achieved in SMB3, it can take a lot of effort to change something like that to read from customized RAM instead – in this case a few hundred bytes of code/data are required.

Bootstrapping to Total Control

As explained in the game end glitch run, touching the glitch tile, an invisible note block, makes the processor try to update memory outside of the normal tile data, at an address ($9C70) that reprograms how the processor interprets addresses. This causes execution to jump to an unintended area of the ROM and execute incorrect instructions due to byte misalignment. Eventually, the stack overflows and it starts executing RAM instructions starting at address $0081, which is just before the location of the player x value at $0090. The 10 bytes manipulated to boostrap total control are koopa x values at $0091-5, and koopa walking counters at $9A-9E.
For the game-end glitch, only a single assembly instruction was required: JSR $8FE1, a subroutine jump to the ending. This code was "written" by careful gameplay to manipulate enemies into x positions (20, 143, 225) that would generate the above instruction.
Total control starts with this loop at frame 11287, with execution starting at RAM address $0091:
$0091: JSR $96E5 ; subroutine that waits until frame ends; NMI updates controllers & $15 counter
$0094: LDA $F7   ; load controller 1 state
$0096: BRK       ; do nothing 
$0098: BRK       ; do nothing
$009a: INX       ; increment X register
$009b: STA,X $7d ; store controller data to next spot in RAM
$009d: BNE $0091 ; if X isn't zero, loop to $0091
The code is written around what the initial value of the X register is expected to be, and starts writing code at $0098: within the loop itself! This was necessary because $98 and $99 can't be manipulated by gameplay in 7-1. The modified loop will exit after the next phase of the total control program, written immediately following the initial loop, is ready.
$0091: JSR $96E5 ; subroutine that waits until frame ends; NMI updates controllers & $15 counter
$0094: LDA $F7   ; load controller 1 state
$0096: BRK       ; do nothing 
$0098: BEQ $009f ; if nothing pressed on controller 1, jump to $009f
$009a: INX       ; increment X register
$009b: STA,X $7d ; store controller data to next spot in RAM
$009d: BNE $0091 ; if X isn't zero, loop to $0091
Ultimately, the controllers are used to create general total control program that checks each frame for start+select on controller 2, and if detected uses the coming input to load a new payload into memory. If there's no new payload, it calls $0147 as a subroutine; payloads deposited here are the entry point for custom behavior.

My "IDE"

This project required thousands of instructions be introduced into memory using keypresses on 2 controllers. (Technical note: I could have cut loading times in half using 4 controllers via the NES 4-play device. I didn't pursue this because current replay devices don't support the 4-play, so console verification would not have been possible).
Obviously, there's no IDE that exists for the purpose of writing 6502 assembly code for output as an FCEUX movie file, so I did the best I could using Excel spreadsheets.
Each payload follows a standard format: a 4-byte header (2 frames of 2-controller input) indicates how many bytes (2-128) to write and to which RAM address. The controller presses that follow are read as the specified payload, 2 bytes per frame, and are written into RAM. Interestingly, because everything is happening in RAM, exactly the same process is used to inject code for execution as to, say, poke a memory value (e.g. set number of lives to 99).
My Excel setup used columns, formulas and the like to make it as easy as possible to lay out a payload in a standard way. For each payload, I'd hand-write the assembly instructions/data in human-readable form, then "compile" them myself into the correct bytes for each controller under columns C1/C2. The spreadsheet then automatically generates the correct FCEUX input which can be copied/pasted into the movie file.
Anyone who's curious can see my Excel IDE as of today, which includes most of the code used in the movie, and some (Zelda music code) that wasn't.
The above process worked pretty well. The major sources of error were typo's in converting instructions into bytes and forgetting to correctly update the header when a payload's size or location changed. Debugging was generally pretty straightforward using the FCEUX trace logger.

Need...Input...

One unexpected hurdle in making this was the need to make a custom input poller. The NMI is nice enough to load the controller data each frame, but SMB3 actually has logic to detect illegal button presses like L+R and U+D and zeroes the d-pad input out in those cases! That doesn't sound like a big deal, but when you're looking to write code, it's nice for each controller to be able to have 256 possible states (1 byte/controller/frame) rather than only 144. So one of the earliest tasks once I start setting up Total Control is to write my own input routine and use that to free up all possible controller states.

Limitations RAM Hacking the NES

It turns out that "Total Control" doesn't equate to omnipotence: NES architectural features place limits on what can be accomplished via RAM hacking.
Chief among these is that the vast majority of the game's code and data resides in ROM - which cannot be modified. As a result, many tasks that would be trivial for a ROM hacker -- e.g. minor changes to a subroutine or tweaking music/sound effects -- present much greater challenges. For example, modifying a subroutine generally requires replicating much of the code leading up to the section of interest - which can be hundreds of instructions!
The way the PPU works also limits graphical changes. The PPU assembles the game's backgrounds and sprites from a gallery 8x8 pixel tiles called pattern tables. These exist in ROM, so while a ROM hacker could easily make substitutions to make bricks look exactly as they do in SMB1, or create a sprite from original graphics, that just isn't possible here.
So -- true total control requires ROM hacking, but one thing no ROM hacker can honestly say is that their hack runs on the original, unmodified hardware. :)

Missing the Cut

One payload that I omitted from this submission was code that would substitute whatever music the game was trying to play with a custom music track, namely the level 9 dungeon music from NES The Legend of Zelda.
This task is complicated greatly by the inability to modify the ROM, and while I was able to get the music playing, some sound effects degraded the sound in unexpected ways and in the available time for AGDQ I couldn't get it polished enough to include.
As music and sound effects have to play nice to share the same sound channels, I think I can get this working perfectly with more research into how the sound/music code interacts, but it's out for now.

Running SMB3 Gameplay Under Total Control

The ideal model for running total control functionality alongside the regular game engine would be if an infinite loop like this executed every frame:
while (true)
  doNormalSMB3()
  doTotalControlStuff()
end
Unfortunately, the game developers weren't so thoughtful: the game's main loop looks more like this:
A:
  (dozens of instructions)
B:
  (dozens of instructions)
  goto D (i.e. the JMP instruction)
C:
  (dozens of instructions)
D:
  (dozens of instructions)
goto A
So the bottom of the call stack has several hundred instructions spanning the title screen, world map, mini-games, and regular gameplay. Even worse, under many conditions (dying, completing a level, entering a pipe, etc) dozens of frames could elapse in some deeply buried subroutine, and under some conditions the code will manually jump to another section of the main loop, rather than returning normally. Without careful management, such behaviors can easily allow the game to "escape" total control, reverting to normal or (more likely) glitched/unstable behavior.
The structure of the main loop necessitated a lot of work to pare it down to something that could reasonably be replicated in RAM while preserving the ability to play the game. The world map and mini-games were among the pieces of functionality sacrificed to achieve this; hence the movie jumps from level to level via a "back door" rather than other approaches that might have been used. I have the code written to enable use of doors/tubes but didn't load it for this run to save space and because I found enough antics to pull in the levels proper.

Thanks

  • My family for understanding that hours of holiday time just *had* to be spent hacking a 27 year old game
  • Tompa - my SMB3 partner in crime
  • true - for nudging me to work on console verification
  • dwangoAC - for the AGDQ idea and making it happen
  • Southbird - for the incomprehensibly massive job of disassembling and fully commenting the game
  • 6502.org and Nesdev Wiki for reference material I used constantly

Nach: I was really excited to see this movie, it incorporates most of what I threw out there on SMB3 over a year ago (what no Kuribo's Shoe???). I even suggested to Masterjun again just last Thursday that he should make a run where Mario has VVVVVV's gravity powers, based on some earlier work. I am thrilled that someone finally made this. There's a ton of potential with this kind of run, and this is a terrific start to show off what's possible and really get the ball rolling.
There is the question as to whether this should obsolete the existing run which uses the same exploit to just simply jump to the end of the game, as this is way more entertaining. With the precedent set, even though I personally disagree with it, I will consider this as a branch onto itself.
This movie did a good job in catering to in-house jokes, being super entertaining for viewers everywhere, and had excellent choices made in the abilities it showed off. The new abilities added are the kind of thing you'd expect from a movie of this genre, and were spot on. The levels chosen to show off the new abilities are among the most iconic in the game and well recognizable and quite fitting for the abilities portrayed in each one. Throughout the custom parts added, a lot of attention to details for entertainment can be seen.
Aside from not providing an opportunity to show off Kuribo's Shoe, I can't think of anything negative about this run. This is by far one of the most entertaining movies ever submitted, with practically every bit of it carefully selected to provide maximum entertainment value. Accepting to Stars.
fsvgm777:
-> encode-this-movie
Unable to encode this movie
-> sudo encode-this-movie


WE ENCODING, GRAB THE FILES!