Ok. After several weeks of research and testing, I've finally finished making my improvements to the TAS currently on the workbench. My new TAS of The Lion King for the NES can be found here:
http://tasvideos.org/userfiles/info/57552889267614864
FCEUX can't record AVIs on my computer, so I can't provide an encode here. Instead, I'll summarize the improvements in this TAS for anybody who doesn't want to watch it in an emulator:
Improvements for Each Level:
1. Level 1 is the same speed as it was in my original TAS. In all the new submissions I've made of this game, I still haven't found any timesaves in this level.
2. Level 2 saves about 20-30 frames using strategies that I discovered from TASing the Game Boy version of the game. Namely, it jumps over the first rhino's tail in the section with three rhinos in a row, which saves some time. This added timesave allows me to use the next rhino's tail as a faster way of getting boosted up to the monkey, which saves even more time.
3. In this level, I use the bones that Simba can swing from to save time over rolling on the ground. Additionally, I manipulate the vultures even faster than I did before, allowing me to do a quicker double vulture boost. This all saves about 15-30 frames overall.
4. This level is an autoscroller. No timesave to be found here...
5. This level had a lot of small timesaves scattered throughout it. Most of them involved rolling in the last few sections of the level instead of jumping, since rolls are faster than jumps in this game. Additionally, I jumped at a certain drop instead of falling, since this lets you redirect your speed, which allowed me to move to the right while falling and save about 10 frames. All in all, I saved somewhere between 25 and 35 frames in this level.
6. The first 5 levels took me about a day or two to TAS. This level is what I've been stuck on for the last month. I saved 6 frames in the beginning of the level by improving my movement slightly, and everything seemed to be going well. After that, I reached the waterfall...
The main issue with the waterfall is that the logs that fall down spawn based on a global timer. In my old TAS, I got essentially the fastest log pattern possible, with it only taking 380 frames to climb the waterfall (for comparison, in all of my other TASes of the game, it took me around 440 frames to climb the waterfall). No matter what I did, I kept finishing the waterfall section in about 480 frames, which was slow enough that I was tied with my original TAS when I finished the waterfall section. I had to find a better way to manipulate the logs in my favor. And so, I set out to figure out the basics of how the logs work.
I'm not the best at deciphering assembly language functions, but I did make some basic discoveries based on my analysis:
1. The number of frames since power on is stored in addresses 0x560 - 0x561. If you use a hex editor to alter either of these values, it changes the pattern of the logs that you get.
2. The actual y-coordinates of the logs themselves are stored in 5 addresses from 0x665 - 0x669.
3. There are only 5 possible x-coordinates that the logs can have.
4. All new logs load from the top of the screen, and move downwards to the bottom of the screen. When all 5 logs have loaded, if you go up high enough, you can see an earlier log higher up (since the logs only store the least significant byte of their y-position, which wraps around when you reach the next byte of the y-position). However, if all 5 logs are in the center of the screen, one of them must reach the bottom of the screen before another log can load.
5. When a log is loaded, its y-position is set equal to the value stored in address 0x536 minus 16 (base 10 number).
6. The value stored in 0x536 increases while you are jumping up and decreases while you are falling down. Additionally, if you jump to a different platform with a different y-position, it can change the value stored in 0x536.
7. The moment when a log is loaded is determined by 2 things: it must first be possible to load a log (all 5 can't already be on screen), and if that is the case, then the value in address 0x57D must count down to 0 before the log can load.
0x57D is usually set to 0, but at certain points (probably based on the global frame number) it gets set to 5 or 15. From there, it counts down by 1 every 3 frames. When it hits 0, the next time you move upwards, a new log will load.
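For anyone who wants to experiment with these rules outside the emulator, here is a rough Python sketch of the behavior described in points 4 to 7. It is only a toy model built from the observations above, not a disassembly of the game: the class name, the method names, the "armed" flag, and the exact phase of the 3-frame countdown are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

MAX_LOGS = 5          # at most 5 logs are tracked (0x665 - 0x669)
SPAWN_OFFSET = 16     # a new log's y is the value in 0x536 minus 16
FRAMES_PER_TICK = 3   # 0x57D counts down by 1 every 3 frames


def spawn_y(value_0x536: int) -> int:
    """Low byte of the spawn position; only one byte is stored, so it wraps."""
    return (value_0x536 - SPAWN_OFFSET) & 0xFF


@dataclass
class WaterfallModel:
    """Toy model of the log spawner (hypothetical, for illustration only)."""
    value_0x536: int = 0
    timer_0x57d: int = 0
    logs_y: List[int] = field(default_factory=list)
    _subframes: int = 0
    _armed: bool = False  # assumption: a spawn is pending once the timer hits 0

    def start_countdown(self, value: int) -> None:
        """The game sets 0x57D to 5 or 15; what triggers this is unknown."""
        self.timer_0x57d = value
        self._subframes = 0
        self._armed = False

    def step(self, moving_up: bool) -> bool:
        """Advance one frame. Returns True if a log loaded on this frame."""
        if self.timer_0x57d > 0:
            self._subframes += 1
            if self._subframes == FRAMES_PER_TICK:
                self._subframes = 0
                self.timer_0x57d -= 1
                if self.timer_0x57d == 0:
                    self._armed = True
        # A log loads only if one is pending, a slot is free,
        # and the player is moving upwards.
        if self._armed and moving_up and len(self.logs_y) < MAX_LOGS:
            self.logs_y.append(spawn_y(self.value_0x536))
            self._armed = False
            return True
        return False


model = WaterfallModel(value_0x536=10)
model.start_countdown(5)
spawned = [model.step(moving_up=True) for _ in range(15)]
print(spawned.index(True) + 1, model.logs_y)
```

In this model a countdown of 5 delays the next log by 15 frames and a countdown of 15 delays it by 45 frames, which is why controlling when 0x57D gets set would matter so much.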
My main goal in manipulating the log pattern was to find a way to alter the value stored in address 0x57D. However, I couldn't find a consistent way to do that, so I was left with using trial and error to do the best that I could. Eventually I was able to get up the waterfall in 445 frames, which is the final value it has in this new TAS.
Adding up all the timesaves and timelosses, this run finished about 0.4 seconds faster than my original submission.
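As a sanity check on that total, the per-level figures quoted above can be added up in a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the 380-frame waterfall belongs to the original submission, and it assumes the NTSC NES frame rate of roughly 60.0988 frames per second, neither of which is stated explicitly in this post.

```python
NTSC_NES_FPS = 60.0988  # assumed frame rate; not stated in the post

# (low, high) estimates of frames saved; negative means frames lost.
frames_saved = {
    "level 2": (20, 30),
    "level 3": (15, 30),
    "level 5": (25, 35),
    "level 6, before the waterfall": (6, 6),
    "waterfall (380 -> 445 frames)": (380 - 445, 380 - 445),
}

low = sum(lo for lo, _ in frames_saved.values())
high = sum(hi for _, hi in frames_saved.values())

print(f"net frames saved: {low} to {high}")
print(f"net seconds saved: {low / NTSC_NES_FPS:.2f} to {high / NTSC_NES_FPS:.2f}")
print(f"0.4 seconds is about {0.4 * NTSC_NES_FPS:.0f} frames")
```

The quoted 0.4 seconds works out to about 24 frames, which falls inside the 1 to 36 frame range that the rough per-level estimates give.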
If anybody wants to try to improve this time, here are 3 places that could potentially be optimized to save time:
1. The waterfall.
2. The waterfall.
3. The waterfall.
The waterfall probably has about 1 more second of possible timesave available for somebody who figures out how to precisely manipulate where the logs spawn. Other than that, I'm fairly content with this new TAS, and am requesting that this submission be set back to judging underway, and that my new TAS replace my original submission.