You're exactly right. If the timer that controls the phases of animation didn't elapse exactly when it did, then no softlock would have occurred. If it didn't happen at that exact frame, then dragon_arm_orb_routine_04 would have run the next frame to keep the enemy state correct.
This bug is somewhat similar to the bug that allows the level 4 boss to remain vulnerable while uncombined (Level 4 Boss Gemini Vulnerability). When a delay timer is decremented from 1 to 0, the code should make the boss invincible by calling disable_enemy_collision. However, if the bullet collision happens at just the right time, the regular game code decrements the timer to 1, and on the next frame a different routine sets the timer to 0. That routine doesn't call disable_enemy_collision, so the boss is left in a vulnerable state.
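As a rough illustration of that pattern, here is a toy Python model (not the game's 6502 code). Only `disable_enemy_collision` is a name from the disassembly; the `Boss` class, `regular_tick`, and `other_routine_tick` are made up for this sketch, and the real routines certainly do more than this.

```python
class Boss:
    """Hypothetical stand-in for the level 4 boss state."""
    def __init__(self, delay_timer):
        self.delay_timer = delay_timer
        self.collision_enabled = True

def disable_enemy_collision(boss):
    # Stand-in for the disassembly routine of the same name.
    boss.collision_enabled = False

def regular_tick(boss):
    """Normal path: the 1 -> 0 transition also disables collision."""
    boss.delay_timer -= 1
    if boss.delay_timer == 0:
        disable_enemy_collision(boss)

def other_routine_tick(boss):
    """Assumed shape of the other routine: zeroes the timer, no side effect."""
    boss.delay_timer = 0

# Expected case: the regular code performs the final 1 -> 0 step.
normal = Boss(delay_timer=2)
regular_tick(normal)   # 2 -> 1
regular_tick(normal)   # 1 -> 0, collision disabled

# Glitched case: a different routine performs the final step.
glitched = Boss(delay_timer=2)
regular_tick(glitched)        # 2 -> 1
other_routine_tick(glitched)  # timer reaches 0, collision stays enabled
```

Both bosses end with a timer of 0, but only `normal` has collision disabled. The side effect is tied to one particular code path rather than to the timer value itself, which is the kind of spot worth searching for.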
So, I think we should be looking at places where timers are involved along with enemy state changes. With any luck, there is somewhere that can transition an enemy to a death state routine earlier than expected.
I didn't forget about this, but only just now got some time to spend on it, and I found the root cause. The reason this freeze happens is due to a race condition where the left 'hand' (red dragon arm orb) is destroyed, but before the next frame happens where the 'orb destroyed' routine is executed, another orb on the left arm changes the routine of the left 'hand' to be a different routine. Since the expected 'orb destroyed' routine wasn't run, the rest of the arm didn't get the notice to self-destruct. Then, a few frames later, the left shoulder creates a projectile, which takes over the same slot where the left 'hand' was. Finally, one frame later, when the left shoulder tries to animate the arm, the left 'hand' no longer has correct data (because it is now a projectile), which causes the game to get stuck in an infinite loop.
Here is my best attempt to provide a clear detailed explanation. Below is a diagram of the dragon boss and its arm orbs. Each number below is the enemy slot, i.e. the enemy number. #$06 and #$05 are the left and right 'hands' respectively, and are red. #$0d and #$0a are the left and right 'shoulders' respectively. () represents the dragon's mouth and is uninvolved in this bug. In fact, only the left arm is involved in this bug.
06 08 0c 0f 0d () 0a 0e 0b 07 05
1. Frame #$aa - Enemy #$06 (the left 'hand') is destroyed, and the memory address specifying which routine to execute is updated to point to `dragon_arm_orb_routine_04`.
2. Frame #$ab - Enemy #$0f has a timer elapse in `dragon_arm_orb_routine_02`. Enemy #$0d updates the enemy routine for all orbs on the left arm. It does this by incrementing a pointer. Usually, this updates the routine from `dragon_arm_orb_routine_02` to `dragon_arm_orb_routine_03`. However, since arm orb #$06 (the left 'hand') was no longer pointing to `dragon_arm_orb_routine_02`, but instead to `dragon_arm_orb_routine_04`, incrementing this pointer set #$06's routine to `enemy_routine_init_explosion`.
3. Frames #$ac-#$d1 - The animation for the left 'hand' explosion completes and the 'hand' is removed from memory (`enemy_routine_remove_enemy`).
4. Frame #$d2 - The #$0d (left shoulder) decides that it should create a projectile. The game logic finds an empty enemy slot where the left 'hand' originally was (slot #$06). A bullet is created and initialized. This initialization clears the data that linked the hand to the rest of the arm, in particular `ENEMY_VAR_3` and `ENEMY_VAR_4`.
5. Frame #$d3 - When #$0d (left shoulder) executes, it animates the rest of the orbs to make an attack pattern. It loops down to the hand by following the links among the orbs. When it gets to the hand, it expects that the hand will have its `ENEMY_VAR_3` set to `#$ff`, indicating there aren't any more orbs to process. However, since the enemy at slot #$06 is no longer a hand, but instead a projectile, the value at `ENEMY_VAR_3` has been cleared and is `#$00`. This causes the logic to get stuck in `@arm_orb_loop` as an infinite loop.
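The walk in step 5 can be modeled with a small Python sketch. This is a simplification under stated assumptions, not a translation of `@arm_orb_loop`: I'm assuming `ENEMY_VAR_3` holds the slot of the next orb toward the hand, that the link order follows the diagram (shoulder #$0d down to hand #$06), and that slot #$00 also reads back `#$00`. The `walk_arm` helper and the `MAX_STEPS` guard are hypothetical; the game has no such guard, which is why it hangs.

```python
END_OF_ARM = 0xFF  # terminator value the shoulder expects on the hand
MAX_STEPS = 32     # model-only guard so this sketch terminates

def walk_arm(enemy_var_3, shoulder_slot):
    """Follow ENEMY_VAR_3 links from the shoulder.

    Returns (visited_slots, terminated). Slots missing from the dict are
    assumed to read back 0x00.
    """
    visited = []
    slot = shoulder_slot
    for _ in range(MAX_STEPS):
        visited.append(slot)
        next_slot = enemy_var_3.get(slot, 0x00)
        if next_slot == END_OF_ARM:
            return visited, True
        slot = next_slot
    return visited, False  # never saw the terminator

# Healthy left arm: shoulder 0d -> 0f -> 0c -> 08 -> hand 06 (terminator).
healthy = {0x0D: 0x0F, 0x0F: 0x0C, 0x0C: 0x08, 0x08: 0x06, 0x06: END_OF_ARM}
visited_ok, done_ok = walk_arm(healthy, 0x0D)

# After frame d2: bullet initialization cleared the link stored in slot 06.
corrupted = dict(healthy)
corrupted[0x06] = 0x00
visited_bad, done_bad = walk_arm(corrupted, 0x0D)
```

In the healthy case the walk visits the five arm slots and stops. In the corrupted case it falls off the arm at slot #$06, never sees `#$ff`, and only stops because of the model's step limit.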
Step (2) caused `dragon_arm_orb_routine_04` to be skipped. Since this routine was not executed as expected, the rest of the arm didn't get updated to know that the 'hand' was destroyed. `dragon_arm_orb_routine_04` is responsible for updating each orb on the arm to begin its self-destruct routine. However, that never happens. So, the shoulder doesn't know to destroy itself.
Instead, the shoulder operates as if the hand hadn't been destroyed, and when it decides that a projectile should be created, the new projectile overwrites the hand with a different enemy type and clears all the links between the hand and the arm.
Regarding the usefulness of this bug for exploitation, the underlying issue was that one enemy updated the routine of another enemy without checking which routine that enemy was on. In this instance, and like almost all of Contra, it wasn't a pointer to a memory address that was updated, but instead an offset in a table of pointers. It'd be hard to find a place in code where incrementing/decrementing the index into a table causes you to be able to control code execution, but it is something I will look into.
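To show what "updating the routine without checking" looks like with a table offset, here is a minimal Python sketch. The four routine names and their relative order come from the frame-by-frame description above; the real table has more entries, and `advance_blind` / `advance_checked` are hypothetical helpers. The checked version is just one way such a guard could look, not something the game or the disassembly contains.

```python
# Partial routine table: only the adjacency described in the post is modeled.
ROUTINES = [
    "dragon_arm_orb_routine_02",
    "dragon_arm_orb_routine_03",
    "dragon_arm_orb_routine_04",
    "enemy_routine_init_explosion",
]

def advance_blind(index):
    """Model of the shoulder's update: bump the table offset unconditionally."""
    return index + 1

def advance_checked(index, expected):
    """Hypothetical guard: advance only if the orb is on the expected routine."""
    return index + 1 if ROUTINES[index] == expected else index

# An ordinary orb sitting on routine_02 advances to routine_03 as intended.
ordinary = ROUTINES[advance_blind(ROUTINES.index("dragon_arm_orb_routine_02"))]

# The hand was destroyed on the previous frame, so it is already on routine_04.
hand = ROUTINES.index("dragon_arm_orb_routine_04")
blind = ROUTINES[advance_blind(hand)]
checked = ROUTINES[advance_checked(hand, "dragon_arm_orb_routine_02")]
```

With the blind increment the hand lands on `enemy_routine_init_explosion` and `dragon_arm_orb_routine_04` never runs; with the guard it stays on `dragon_arm_orb_routine_04`. Since the offset only ever moves by one within a fixed table, the reachable targets are other valid routines, which is why turning this into controlled code execution looks hard.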
vermiceli, this is absolutely amazing! Great work.
Back when I saw Post #499433 I tried doing some reverse engineering, starting from Trax's disassembly. I gave it up because I was not making much progress. I got as far as identifying, as you did, that ENEMY_VAR_3 and ENEMY_VAR_4 form a doubly linked list of the dragon tentacle arm orbs. As I recall, the freeze in Post #499433/Post #499462 was caused by one of the lists somehow developing a cycle (one of the elements pointing to itself). I had hoped it was some wild PC branch, which might offer possibilities of code execution, but it was just an infinite loop over a linked list that had failed to maintain one of its invariants, probably in @enemy_orb_loop or nearby.
Anyway, this disassembly probably opens a lot of doors, to better romhacks, mods, randomizers… pretty exciting.
Thanks for the compliments. I hope it does open the door to some cool romhacks. I never looked into the freeze on the dragon tentacle orb, but the infinite loop on that list makes sense to me. I wonder what would cause the misconfigured linked list. Is it reproducible from the fm2 file?
After the Summoning Salt video showing the level skip bug, I decided to start with Trax's disassembly and take it further. I've made a complete disassembly of Contra with proper labels which allows for modification without breaking jumps and branching. It includes supplemental documentation, diagrams, lua scripts, and tooling to build the rom. This code is on github at https://github.com/vermiceli/nes-contra-us/
I didn't see anything obvious that would cause the level skip bug and I agree with Alyosha that it probably was an NMI before some appropriate memory could be cleared. I think that this repo would help anyone who's interested in the bug as it's much easier to read and parse.