Dimon12321's comment: "When every known game can be played back, only then Duke Nukem can be played back too" got me thinking, what are the most challenging runs to console verify for each system? What would be the run that signifies pretty much any other run could be verified as well?
I guess there are a lot of factors to consider for such a question. Some games interface with special hardware so could be their own challenge, but aren't necessarily relevant to the rest of the game library. Some consoles / games have various levels of non-determinism, so could be difficult just in requiring extreme luck. In general you have to know a lot about the system and games library to really pick out the most challenging ones. Some games certainly require perfect emulation, but you have to find them case by case.
Anyway I thought the topic was interesting enough to write down my thoughts, might be interesting to have some goals to aim for.
NES: We might already be there with Bad Apple. It would hard to push both the system and the console verification tools harder than that. For a more standard TAS, there are many runs that require cycle accurate emulation of the entire system, so it's hard to pick just one, though Time Lord is particularly challenging.
GB/C: Here we have an interesting case of linked play. Pokemon Coop Diploma would probably the most complex case possible. For single system runs, the Pokemon yellow ACE might be it. For normal TASes, probably we are already there with Pokemon Crystal and Men in Black.
GBA: Currently the game that I know requires cycle accuracy for the entire run is Banko Kazooie: Grunty's Revenge. That game also needs quite a few difficult edge cases to be emulated to get that cycle accuracy. The GBA library is huge and I have tested very little of it, so there could be more challenging cases out there. Linked play is also something to explore.
Any other examples I missed?
These are the systems I have the most knowledge about. I'd be interested to hear from other people knowledgeable in other systems to learn about some of the challenging 'capstone' like examples there as well.
Ambassador, Moderator, Site Developer, Player
(155)
Joined: 3/17/2018
Posts: 358
Location: Holland, MI
It doesn't do as much generally for the N64 but rcombs' 15 hour Watch for Rolling Rocks .5 A Press verification is the longest I know of which is an achievement itself for the stability of the tooling.
Link to video
On GB I would also highlight the DK94 verification as an important one for the validation of procedures to stabilize uninitialized RAM.
I would also argue the whole PC TASing scene is quite interesting as to how it questions the very idea of what a verification is - usually it applies only to a single SKU of a system, eg the GBA/GBP variant of the Game Boy Color, but my Backyard Baseball TAS for example is replicable across systems of wildly different hardware capabilities. There's not really any interesting accuracy insight that could be added by eg sending the inputs over mock mouse/keyboard over usb instead of through libTAS since you're then validating lag behavior of the OS layer much more than the hardware. So it kinda shows how our concept of verification is tightly tied to these earlier bare metal systems without an OS layer or ones where the OS is very meager and without any/many revisions.
There's a 39 day SM64 TAS that's been verified, so that one is even longer:
Link to video
SM64 is actually one of the easiest N64 games to console verify, because its physics don't change if there are more or less lag frames, and it doesn't poll on lag frames, so if you just give it input on non-lag frames you're good to go.
There are much harder N64 games to console verify, because we can't even emulate them accurately, because to emulate their lag requires cycle accuracy and their lag changes physics. Goldeneye being a good example of this (at least the last time I checked)
SNES is also very hard to console verify, IIRC the reason is because two different components have their own timers and in any actual SNES their timers are slightly desynced due to physical inaccuracies.
Even if you have some perfect accuracy, you still end up screwed due to non-deterministic behavior. You have the bootup time of the game being randomized with a hardware RNG (thanks Nintendo), and then you have the clock relationships between the RCP (i.e. what controls many different components of the N64) and CPU being dictated by a PLL clock multiplier (which ends up having a similar in concept issue with SNES, although it's still one clock, just that clock being multiplied cannot have the multiplication done deterministically, so you get a issue similar with having 2 separate clocks).
N64:
Patashu beat me to it, but yeah, that was definitely not the longest. rcombs also successfully replayed a TAS of the pendulum crash bug in SM64, which was a little more than 39days long. Which is certainly impressive, and I recall there being some failed attempts beforehand due to interrupted power or something along those lines. However, SM64 isn't a challenging game to replay on hardware (timing inaccuracies in emulators don't affect replayability.)
Technical Explanation:
The game was programmed to poll for inputs one time per game loop, regardless of how long it took to complete each iteration. N64 games typically used a double or triple buffering scheme. The video output hardware is more or less automatic, with some level of configuration. It's either on, and outputting a framebuffer given its position in memory, or it's off completely. Games will render (rasterize) into one framebuffer, while the video hardware is outputting the other framebuffer. When the rendering is complete, games will typically wait for VSYNC, and then swap the buffers. "Lag frames" are really just the frames in which the game wasn't ready to swap the buffers yet, and so the video hardware displayed the previous framebuffer again.
The difference between SM64 and many other games, is that most games will poll controllers automatically every video frame (e.g. typically around VSYNC), and store the data in memory for the game to use whenever it wants. Replay devices aren't capable of knowing which poll actually gets used, nor when it should increment to the next input. SM64 polls once per game loop. So regardless of how long it takes to complete one iteration of logic and rendering, the game will poll only once, making TAS verifications extremely easy. Emulators don't have to be highly accurate.
Ultimately, not enough experimentation has been performed on the N64 to confidently say what would be the "capstone" of the system in regards to TAS verification. The most immediate issue continues to be inaccurate emulation. Sure, it's getting better, particularly with Ares, but there are still issues with emulating accurate timing which is critical for verifying many games on this system. However, even if Ares has reached a sufficiently accurate point for some games, we wouldn't know because extremely little testing has been done to understand what issues may exist or what games may be viable for verification.
NES:
I agree that Bad Apple really pushes the replay hardware in a variety of ways. However, I feel there are still some edge cases that don't come up in Bad Apple:
At least one of the Dragon Warrior games (maybe III or IV?) will start polling in one frame, be interrupted by VSYNC, and then resume clocking out inputs in the next frame. As I recall, Bad Apple by design avoids polling during VSYNC.
There's [2137] NES Spy vs. Spy by Sir VG in 08:46.52 which continuously polls for input. The current publication may not even be verifiable, idk. ViGrey and I both struggled to make sense of how to dump, let alone replay, inputs for that game.
It would be cool if more modern PC games offered built in TAS / replay tools, but its probably too niche for devs to bother.
My opinion on this is that either a hardware mod is needed, or some other new form of TAS is necessary for these systems. Maybe something like a scripting language + machine vision can account for non-determinism in real time? At least for SNES I have personally seen that verification is just generally not possible with the same pipeline as NES.
Oh this is a very interesting case that I was not aware of (or maybe just forgot.) You really never know which games will be troublesome until you look.
Of course accounting for everything that a console / game could do becomes a bottomless pit of time and effort, so maybe picking 'capstones' from amongst games that do 'reasonable' things only could tame the problem a bit.
This discussion reminds me how young a technology console verification still is.
SNES might become a lot more possible in the near future. rasteri has created a hardware mod (open source) that essentially synchronizes the APU (audio) clock to the CPU clock, which should improve how deterministic the console is. But more testing is needed.
Alyosha wrote:
This discussion reminds me how young a technology console verification still is.
Definitely! There's been amazing progress in the past few years, across many different systems. Yet, from my experience researching and testing, it's clear there are still many unknowns. But I think that's okay! It means there is much more to explore and learn.
SNES might become a lot more possible in the near future. rasteri has created a hardware mod (open source) that essentially synchronizes the APU (audio) clock to the CPU clock, which should improve how deterministic the console is. But more testing is needed.
Does it match an emulator configuration or should the emulator be adjusted to that?
Bigbass wrote:
There's been amazing progress in the past few years, across many different systems. Yet, from my experience researching and testing, it's clear there are still many unknowns. But I think that's okay! It means there is much more to explore and learn.
Like Sega Genesis, I suppose. I see, all verified movies were done years ago, and those were old Gens movies.
The last year, I asked in dwangoAC's Discord channel about verifying Genesis movies and someone responded that not much attempts were initiated to do that, so maybe there is another platform just waiting for its time.
TASing is like making a film: only the best takes are shown in the final movie.
SNES might become a lot more possible in the near future. rasteri has created a hardware mod (open source) that essentially synchronizes the APU (audio) clock to the CPU clock, which should improve how deterministic the console is. But more testing is needed.
Cool project! I'll definitely mod my SNES when its finished.
Does it match an emulator configuration or should the emulator be adjusted to that?
I don't know enough about it to say one way or another. I'd expect changing the emulator to match would be easier, but might invalidate existing movies. On the other hand, I don't know how flexible the mod is, and I don't know if emulators are even accurate to begin with.
Dimon12321 wrote:
Like Sega Genesis, I suppose. I see, all verified movies were done years ago, and those were old Gens movies.
The last year, I asked in dwangoAC's Discord channel about verifying Genesis movies and someone responded that not much attempts were initiated to do that, so maybe there is another platform just waiting for its time.
Actually I verified [570] Genesis Wonder Boy in Monster World by Aqfaq in 42:10.45 last year, despite it having been published in 2006. Though, it required me modifying the Lua API in Gens to expose necessary information. It was the first Genesis verification since 2014. I feel that there could be a lot of existing Genesis movies that are verifiable, they just need to be attempted. Without a flashcart though, I'm unable to do anything more on that console. While I do have a prototype design for my own flashcart, it's not functional yet.
Joined: 8/30/2020
Posts: 111
Location: Sydney, Australia
Bigbass wrote:
Dimon12321 wrote:
Bigbass wrote:
SNES might become a lot more possible in the near future. rasteri has created a hardware mod (open source) that essentially synchronizes the APU (audio) clock to the CPU clock, which should improve how deterministic the console is. But more testing is needed.
Does it match an emulator configuration or should the emulator be adjusted to that?
I don't know enough about it to say one way or another. I'd expect changing the emulator to match would be easier, but might invalidate existing movies. On the other hand, I don't know how flexible the mod is, and I don't know if emulators are even accurate to begin with.
The consensus in #tasbot-dev was to semi-arbitrarily pick a ratio which is close to what the original specsheet had, and promoting that as a universal standard (including in RTA circles).
I contribute to BizHawk as Linux/cross-platform lead, testing and automation lead, and UI designer. This year, I'm experimenting with streaming BizHawk development on Twitch. nope
Links to find me elsewhere and to some of my side projects are on my personal site. I will respond on Discord faster than to PMs on this site.
Hey look buddy, I'm an engineer. That means I solve problems. Not problems like "What is software," because that would fall within the purview of your conundrums of philosophy. I solve practical problems. For instance, how am I gonna stop some high-wattage thread-ripping monster of a CPU dead in its tracks? The answer: use code. And if that don't work? Use more code.