Alyosha
He/Him
Editor, Emulator Coder, Expert player (3822)
Joined: 11/30/2014
Posts: 2832
Location: US
Dimon12321's comment: "When every known game can be played back, only then Duke Nukem can be played back too" got me thinking: what are the most challenging runs to console verify for each system? What would be the run that signifies pretty much any other run could be verified as well?
I guess there are a lot of factors to consider for such a question. Some games interface with special hardware, so they could be their own challenge but aren't necessarily relevant to the rest of the game library. Some consoles / games have various levels of non-determinism, so they could be difficult just in requiring extreme luck. In general you have to know a lot about the system and its games library to really pick out the most challenging ones. Some games certainly require perfect emulation, but you have to find them case by case. Anyway, I thought the topic was interesting enough to write down my thoughts; it might be interesting to have some goals to aim for.
NES: We might already be there with Bad Apple. It would be hard to push both the system and the console verification tools harder than that. For a more standard TAS, there are many runs that require cycle accurate emulation of the entire system, so it's hard to pick just one, though Time Lord is particularly challenging.
GB/C: Here we have an interesting case of linked play. Pokemon Coop Diploma would probably be the most complex case possible. For single system runs, the Pokemon Yellow ACE might be it. For normal TASes, we are probably already there with Pokemon Crystal and Men in Black.
GBA: Currently the game that I know requires cycle accuracy for the entire run is Banjo-Kazooie: Grunty's Revenge. That game also needs quite a few difficult edge cases to be emulated to get that cycle accuracy. The GBA library is huge and I have tested very little of it, so there could be more challenging cases out there. Linked play is also something to explore.
Any other examples I missed? These are the systems I have the most knowledge about.
I'd be interested to hear from other people knowledgeable in other systems to learn about some of the challenging 'capstone' like examples there as well.
TiKevin83
He/Him
Ambassador, Moderator, Site Developer, Player (155)
Joined: 3/17/2018
Posts: 358
Location: Holland, MI
It doesn't do as much generally for the N64 but rcombs' 15 hour Watch for Rolling Rocks .5 A Press verification is the longest I know of which is an achievement itself for the stability of the tooling.
Link to video
On GB I would also highlight the DK94 verification as an important one for the validation of procedures to stabilize uninitialized RAM.
I would also argue the whole PC TASing scene is quite interesting as to how it questions the very idea of what a verification is - usually it applies only to a single SKU of a system, eg the GBA/GBP variant of the Game Boy Color, but my Backyard Baseball TAS for example is replicable across systems of wildly different hardware capabilities. There's not really any interesting accuracy insight that could be added by eg sending the inputs over mock mouse/keyboard over usb instead of through libTAS since you're then validating lag behavior of the OS layer much more than the hardware. So it kinda shows how our concept of verification is tightly tied to these earlier bare metal systems without an OS layer or ones where the OS is very meager and without any/many revisions.
Patashu
He/Him
Joined: 10/2/2005
Posts: 4043
There's a 39 day SM64 TAS that's been verified, so that one is even longer:
Link to video
SM64 is actually one of the easiest N64 games to console verify, because its physics don't change if there are more or less lag frames, and it doesn't poll on lag frames, so if you just give it input on non-lag frames you're good to go.
There are much harder N64 games to console verify, because we can't even emulate them accurately, because to emulate their lag requires cycle accuracy and their lag changes physics. Goldeneye being a good example of this (at least the last time I checked).
SNES is also very hard to console verify, IIRC the reason is because two different components have their own timers and in any actual SNES their timers are slightly desynced due to physical inaccuracies.
My Chiptune music, made in Famitracker: http://soundcloud.com/patashu My twitch. I stream mostly shmups & rhythm games http://twitch.tv/patashu My youtube, again shmups and rhythm games and misc stuff: http://youtube.com/user/patashu
CasualPokePlayer
Emulator Coder, Judge, Experienced player (729)
Joined: 2/26/2020
Posts: 783
Location: California
Patashu wrote:
There are much harder N64 games to console verify, because we can't even emulate them accurately, because to emulate their lag requires cycle accuracy and their lag changes physics. Goldeneye being a good example of this (at least the last time I checked)
Even if you have perfect accuracy, you still end up screwed by non-deterministic behavior. The bootup time of the game is randomized with a hardware RNG (thanks Nintendo), and the clock relationship between the RCP (i.e. what controls many different components of the N64) and the CPU is dictated by a PLL clock multiplier. That is conceptually similar to the issue with the SNES: it's still one clock, but the multiplication of that clock cannot be done deterministically, so you get an issue similar to having 2 separate clocks.
Bigbass
He/Him
Moderator
Joined: 2/2/2021
Posts: 193
Location: Midwest
N64:
TiKevin83 wrote:
It doesn't do as much generally for the N64 but rcombs' 15 hour Watch for Rolling Rocks .5 A Press verification is the longest I know of which is an achievement itself for the stability of the tooling.
Patashu beat me to it, but yeah, that was definitely not the longest. rcombs also successfully replayed a TAS of the pendulum crash bug in SM64, which was a little more than 39 days long. That is certainly impressive, and I recall there being some failed attempts beforehand due to interrupted power or something along those lines. However, SM64 isn't a challenging game to replay on hardware (timing inaccuracies in emulators don't affect replayability).
Patashu wrote:
because its physics don't change if there are more or less lag frames, and it doesn't poll on lag frames, so if you just give it input on non-lag frames you're good to go.
Technical Explanation: The game was programmed to poll for inputs one time per game loop, regardless of how long it took to complete each iteration.
N64 games typically used a double or triple buffering scheme. The video output hardware is more or less automatic, with some level of configuration. It's either on, and outputting a framebuffer given its position in memory, or it's off completely. Games will render (rasterize) into one framebuffer, while the video hardware is outputting the other framebuffer. When the rendering is complete, games will typically wait for VSYNC, and then swap the buffers. "Lag frames" are really just the frames in which the game wasn't ready to swap the buffers yet, and so the video hardware displayed the previous framebuffer again.
The difference between SM64 and many other games is that most games will poll controllers automatically every video frame (e.g. typically around VSYNC), and store the data in memory for the game to use whenever it wants. Replay devices aren't capable of knowing which poll actually gets used, nor when it should increment to the next input.
SM64 polls once per game loop. So regardless of how long it takes to complete one iteration of logic and rendering, the game will poll only once, making TAS verifications extremely easy. Emulators don't have to be highly accurate.
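To make the polling difference concrete, here is a minimal toy model in Python. It is only a sketch under simplifying assumptions: the replay device is assumed to answer every poll with the next movie entry, and the names (consumed_inputs, lag_frames, polls_every_vsync) are made up for illustration and don't come from any real replay tool.

```python
from typing import List


def consumed_inputs(movie: List[str], lag_frames: List[bool],
                    polls_every_vsync: bool) -> List[str]:
    """Return the inputs the game logic actually acts on.

    movie: inputs the replay device hands out, one per poll.
    lag_frames: one flag per video frame; True means the game was not
        ready to swap buffers on that frame (a "lag frame").
    polls_every_vsync: True models a game that polls on every video frame,
        False models an SM64-style game that polls once per game loop.
    """
    consumed = []
    cursor = 0  # replay device's position in the movie
    for is_lag in lag_frames:
        if polls_every_vsync:
            # The device sees a poll on every frame and advances,
            # whether or not the game ends up using that poll.
            if cursor >= len(movie):
                break
            latched = movie[cursor]
            cursor += 1
            if not is_lag:
                consumed.append(latched)
        elif not is_lag:
            # The game polls only when a game loop iteration starts,
            # so the device advances exactly once per iteration.
            if cursor >= len(movie):
                break
            consumed.append(movie[cursor])
            cursor += 1
    return consumed


if __name__ == "__main__":
    emulator_lag = [False, True, False, False]
    console_lag = [False, False, True, False]  # lag lands on a different frame

    print(consumed_inputs(["A", "B", "C"], emulator_lag, False))
    print(consumed_inputs(["A", "B", "C"], console_lag, False))
    print(consumed_inputs(["A", "-", "B", "C"], emulator_lag, True))
    print(consumed_inputs(["A", "-", "B", "C"], console_lag, True))
```

In this model the once-per-loop game consumes A, B, C under both lag patterns, while the every-VSYNC game consumes A, B, C on the emulator's lag pattern but A, -, C on the console's, which is the kind of desync that makes those games hard to verify.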
Ultimately, not enough experimentation has been performed on the N64 to confidently say what would be the "capstone" of the system in regards to TAS verification. The most immediate issue continues to be inaccurate emulation. Sure, it's getting better, particularly with Ares, but there are still issues with emulating accurate timing which is critical for verifying many games on this system. However, even if Ares has reached a sufficiently accurate point for some games, we wouldn't know because extremely little testing has been done to understand what issues may exist or what games may be viable for verification.
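NES: I agree that Bad Apple really pushes the replay hardware in a variety of ways. However, I feel there are still some edge cases that don't come up in Bad Apple:
There's [2137] NES Spy vs. Spy by Sir VG in 08:46.52 which continuously polls for input. The current publication may not even be verifiable, idk. ViGrey and I both struggled to make sense of how to dump, let alone replay, inputs for that game.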
TAS Verifications | Mastodon | Github | Discord: @bigbass
Alyosha
He/Him
Editor, Emulator Coder, Expert player (3822)
Joined: 11/30/2014
Posts: 2832
Location: US
TiKevin83 wrote:
I would also argue the whole PC TASing scene is quite interesting as to how it questions the very idea of what a verification is - usually it applies only to a single SKU of a system, eg the GBA/GBP variant of the Game Boy Color, but my Backyard Baseball TAS for example is replicable across systems of wildly different hardware capabilities. There's not really any interesting accuracy insight that could be added by eg sending the inputs over mock mouse/keyboard over usb instead of through libTAS since you're then validating lag behavior of the OS layer much more than the hardware. So it kinda shows how our concept of verification is tightly tied to these earlier bare metal systems without an OS layer or ones where the OS is very meager and without any/many revisions.
It would be cool if more modern PC games offered built-in TAS / replay tools, but it's probably too niche for devs to bother.
Patashu wrote:
There are much harder N64 games to console verify, because we can't even emulate them accurately, because to emulate their lag requires cycle accuracy and their lag changes physics. Goldeneye being a good example of this (at least the last time I checked). SNES is also very hard to console verify, IIRC the reason is because two different components have their own timers and in any actual SNES their timers are slightly desynced due to physical inaccuracies.
CasualPokePlayer wrote:
Even if you have perfect accuracy, you still end up screwed by non-deterministic behavior. The bootup time of the game is randomized with a hardware RNG (thanks Nintendo), and the clock relationship between the RCP (i.e. what controls many different components of the N64) and the CPU is dictated by a PLL clock multiplier. That is conceptually similar to the issue with the SNES: it's still one clock, but the multiplication of that clock cannot be done deterministically, so you get an issue similar to having 2 separate clocks.
My opinion on this is that either a hardware mod is needed, or some other new form of TAS is necessary for these systems. Maybe something like a scripting language + machine vision can account for non-determinism in real time? At least for SNES I have personally seen that verification is just generally not possible with the same pipeline as NES.
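As a rough illustration of the "script + machine vision" idea, here is a toy Python sketch comparing a fixed-frame (open-loop) replay with one that reacts to what is on screen (closed-loop). Everything in it is hypothetical: FakeConsole, prompt_visible and the 100 to 105 frame load window are invented for the example, and a real setup would also have to deal with capture and recognition latency.

```python
import random


class FakeConsole:
    """Toy stand-in for a console whose load time varies from boot to boot."""

    def __init__(self, seed: int) -> None:
        rng = random.Random(seed)
        # Hypothetical non-determinism: the prompt appears on frame 100..105.
        self.ready_frame = 100 + rng.randint(0, 5)
        self.frame = 0
        self.succeeded = False

    def prompt_visible(self) -> bool:
        # Stand-in for a machine-vision check on captured video output.
        return self.frame == self.ready_frame

    def step(self, press_a: bool) -> None:
        # The press only counts on the exact frame the prompt is shown.
        if press_a and self.frame == self.ready_frame:
            self.succeeded = True
        self.frame += 1


def open_loop(console: FakeConsole, press_frame: int,
              total_frames: int = 200) -> bool:
    """Classic replay: press A on a fixed frame number."""
    for frame in range(total_frames):
        console.step(press_a=(frame == press_frame))
    return console.succeeded


def closed_loop(console: FakeConsole, total_frames: int = 200) -> bool:
    """Scripted replay: press A whenever the prompt is observed."""
    for _ in range(total_frames):
        console.step(press_a=console.prompt_visible())
    return console.succeeded


if __name__ == "__main__":
    for seed in range(5):
        print(seed,
              open_loop(FakeConsole(seed), press_frame=102),
              closed_loop(FakeConsole(seed)))
```

The open-loop replay only works on boots where the prompt happens to land on the frame the movie expects, while the closed-loop one works on every boot in this toy model, because it keys off the observed state instead of a frame count.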
Bigbass wrote:
NES: There's [2137] NES Spy vs. Spy by Sir VG in 08:46.52 which continuously polls for input. The current publication may not even be verifiable, idk. ViGrey and I both struggled to make sense of how to dump, let alone replay, inputs for that game.
Oh this is a very interesting case that I was not aware of (or maybe just forgot.) You really never know which games will be troublesome until you look. Of course accounting for everything that a console / game could do becomes a bottomless pit of time and effort, so maybe picking 'capstones' from amongst games that do 'reasonable' things only could tame the problem a bit. This discussion reminds me how young a technology console verification still is.
Bigbass
He/Him
Moderator
Joined: 2/2/2021
Posts: 193
Location: Midwest
Alyosha wrote:
Patashu wrote:
SNES is also very hard to console verify, IIRC the reason is because two different components have their own timers and in any actual SNES their timers are slightly desynced due to physical inaccuracies.
My opinion on this is that either a hardware mod is needed, or some other new form of TAS is necessary for these systems. Maybe something like a scripting language + machine vision can account for non-determinism in real time? At least for SNES I have personally seen that verification is just generally not possible with the same pipeline as NES.
SNES might become a lot more possible in the near future. rasteri has created a hardware mod (open source) that essentially synchronizes the APU (audio) clock to the CPU clock, which should improve how deterministic the console is. But more testing is needed.
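To show why tying the two clocks together helps, here is a small Python sketch. It assumes the commonly cited nominal NTSC figures (a master clock of about 21.477 MHz and a 24.576 MHz APU resonator) and a made-up 50 ppm resonator error; the function name and the idea that the mod amounts to a fixed clock ratio are my assumptions, not details taken from rasteri's design.

```python
from fractions import Fraction

# Nominal NTSC figures (assumed here, not measured from any console).
CPU_HZ = Fraction(945_000_000, 44)  # master clock, about 21.477 MHz
APU_HZ = Fraction(24_576_000)       # APU resonator, 24.576 MHz


def apu_cycles_elapsed(cpu_cycles: int, apu_ppm_error: int = 0) -> int:
    """APU cycles that pass while the CPU clock ticks cpu_cycles times.

    apu_ppm_error models a free-running APU resonator that is off by that
    many parts per million. With the APU clock derived from the CPU clock,
    the ratio is fixed and the error is effectively 0 on every console.
    """
    actual_apu_hz = APU_HZ * (1 + Fraction(apu_ppm_error, 1_000_000))
    return int(cpu_cycles * actual_apu_hz / CPU_HZ)


if __name__ == "__main__":
    one_minute = int(CPU_HZ * 60)  # CPU cycles in roughly one minute
    nominal = apu_cycles_elapsed(one_minute)
    fast_unit = apu_cycles_elapsed(one_minute, apu_ppm_error=50)
    print("APU cycles of drift after one minute:", fast_unit - nominal)
```

Even a small resonator error adds up to tens of thousands of APU cycles of drift per minute in this model, so two stock consoles can disagree about when the audio processor answers the CPU. With a fixed ratio the count is the same every time, which is the determinism the mod is after.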
Alyosha wrote:
This discussion reminds me how young a technology console verification still is.
Definitely! There's been amazing progress in the past few years, across many different systems. Yet, from my experience researching and testing, it's clear there are still many unknowns. But I think that's okay! It means there is much more to explore and learn.
Dimon12321
He/Him
Editor, Reviewer, Experienced player (596)
Joined: 4/5/2014
Posts: 1222
Location: Romania
Bigbass wrote:
SNES might become a lot more possible in the near future. rasteri has created a hardware mod (open source) that essentially synchronizes the APU (audio) clock to the CPU clock, which should improve how deterministic the console is. But more testing is needed.
Does it match an emulator configuration or should the emulator be adjusted to that?
Bigbass wrote:
There's been amazing progress in the past few years, across many different systems. Yet, from my experience researching and testing, it's clear there are still many unknowns. But I think that's okay! It means there is much more to explore and learn.
Like Sega Genesis, I suppose. As far as I can see, all verified movies were done years ago, and those were old Gens movies. Last year, I asked in dwangoAC's Discord channel about verifying Genesis movies and someone responded that not many attempts had been made to do that, so maybe there is another platform just waiting for its time.
TASing is like making a film: only the best takes are shown in the final movie.
Alyosha
He/Him
Editor, Emulator Coder, Expert player (3822)
Joined: 11/30/2014
Posts: 2832
Location: US
Bigbass wrote:
SNES might become a lot more possible in the near future. rasteri has created a hardware mod (open source) that essentially synchronizes the APU (audio) clock to the CPU clock, which should improve how deterministic the console is. But more testing is needed.
Cool project! I'll definitely mod my SNES when it's finished.
Bigbass
He/Him
Moderator
Joined: 2/2/2021
Posts: 193
Location: Midwest
Dimon12321 wrote:
Does it match an emulator configuration or should the emulator be adjusted to that?
I don't know enough about it to say one way or another. I'd expect changing the emulator to match would be easier, but might invalidate existing movies. On the other hand, I don't know how flexible the mod is, and I don't know if emulators are even accurate to begin with.
Dimon12321 wrote:
Like Sega Genesis, I suppose. As far as I can see, all verified movies were done years ago, and those were old Gens movies. Last year, I asked in dwangoAC's Discord channel about verifying Genesis movies and someone responded that not many attempts had been made to do that, so maybe there is another platform just waiting for its time.
Actually, I verified [570] Genesis Wonder Boy in Monster World by Aqfaq in 42:10.45 last year, despite it having been published in 2006. Though, it required me to modify the Lua API in Gens to expose the necessary information. It was the first Genesis verification since 2014. I feel that there could be a lot of existing Genesis movies that are verifiable; they just need to be attempted. Without a flashcart though, I'm unable to do anything more on that console. While I do have a prototype design for my own flashcart, it's not functional yet.
YoshiRulz
Any
Editor, Emulator Coder
Joined: 8/30/2020
Posts: 106
Location: Sydney, Australia
Bigbass wrote:
Dimon12321 wrote:
Bigbass wrote:
SNES might become a lot more possible in the near future. rasteri has created a hardware mod (open source) that essentially synchronizes the APU (audio) clock to the CPU clock, which should improve how deterministic the console is. But more testing is needed.
Does it match an emulator configuration or should the emulator be adjusted to that?
I don't know enough about it to say one way or another. I'd expect changing the emulator to match would be easier, but might invalidate existing movies. On the other hand, I don't know how flexible the mod is, and I don't know if emulators are even accurate to begin with.
The consensus in #tasbot-dev was to semi-arbitrarily pick a ratio which is close to what the original spec sheet had, and to promote that as a universal standard (including in RTA circles).
I contribute to BizHawk as Linux/cross-platform lead, testing and automation lead, and UI designer. This year, I'm experimenting with streaming BizHawk development on Twitch. nope Links to find me elsewhere and to some of my side projects are on my personal site. I will respond on Discord faster than to PMs on this site.
Hey look buddy, I'm an engineer. That means I solve problems. Not problems like "What is software," because that would fall within the purview of your conundrums of philosophy. I solve practical problems. For instance, how am I gonna stop some high-wattage thread-ripping monster of a CPU dead in its tracks? The answer: use code. And if that don't work? Use more code.