Active player (310)
Joined: 8/21/2012
Posts: 429
Location: France
Kuwaga wrote:
Very nice. I still think that ultimately the best way to program a generalised TASing bot is to make it scientific. Forming and testing hypotheses about how memory addresses interact, how much of it it can control or influence, and under which circumstances. Very playfully at first, with the goal to learn as much as possible. Then finally tear the game apart. Of course this would be _really_ hard to achieve. A general rule to avoid deaths (and increasing the death counter) would have to be somewhat akin to it wanting to avoid RAM states that are very similar to previous ones without really getting any apparent new abilities, even if it increases the score or whatever. (I'll gladly accept the first bot to genuinely discover on its own how to acquire the hadouken in MMX as my new overlord) I swear it could be done if somebody was crazy enough to spend enough time on it. Lock somebody into a cell and don't let them out until they're done. :p Tetris seems very difficult. And with that, I'll disappear back to whence I came.
One hard thing to do would be to have the bot know when the end of the game is reached. I'm talking about a bot meant to work with any game of course. Some games don't even have a clear ending, and people sometimes argue about what "completing" a game truly means, with special submissions here. But... Even with obvious endings (credits, a "the end" screen, etc...) I don't know how a bot can consider the game beaten. I think the easiest solution would be to manually give a goal to the bot, depending on the game. And that would be something like "reach this value in this part of the memory" most of the time. Because I really can't think of a global way to tell the bot when to stop ^^
Joined: 7/2/2007
Posts: 3960
In many games it may be possible to recognize an end state if the bot encounters a memory state that exactly matches one it has seen before. That should never happen in normal gameplay, since score, lives, timers, etc. would all change. All this should require is for the credits sequence to reach a point at which they either stop or loop. Ideally any in-game timers (e.g. used for RNGs) would also be disabled in the credits, but ultimately the timers would loop anyway, so as long as you're willing to wait it shouldn't matter.
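A rough sketch of that check, assuming the bot can grab a full RAM snapshot as `bytes` on every frame (the function name and the toy data are made up for illustration, not taken from any emulator API):

```python
import hashlib
from typing import Iterable, Optional


def first_repeated_frame(ram_snapshots: Iterable[bytes]) -> Optional[int]:
    """Return the index of the first frame whose RAM exactly matches an earlier frame."""
    seen = set()
    for frame, ram in enumerate(ram_snapshots):
        # Store a digest instead of the full snapshot to keep memory use small.
        digest = hashlib.sha256(ram).digest()
        if digest in seen:
            return frame
        seen.add(digest)
    return None


# Toy data: a 2-byte "RAM" whose first byte is a counter that freezes at 3,
# standing in for a credits screen that has stopped changing.
frames = [bytes([min(i, 3), 7]) for i in range(6)]
repeat_at = first_repeated_frame(frames)
```

With the toy data the first exact repeat shows up one frame after the counter freezes.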
Pyrel - an open-source rewrite of the Angband roguelike game in Python.
Skilled player (1746)
Joined: 9/17/2009
Posts: 4988
Location: ̶C̶a̶n̶a̶d̶a̶ "Kanatah"
Grincevent wrote:
One hard thing to do would be to have the bot know when the end of the game is reached. I'm talking about a bot meant to work with any game of course. Some games don't even have a clear ending, and people sometimes argue about what "completing" a game truly means, with special submissions here. But... Even with obvious endings (credits, a "the end" screen, etc...) I don't know how a bot can consider the game beaten. I think the easiest solution would be to manually give a goal to the bot, depending on the game. And that would be something like "reach this value in this part of the memory" most of the time. Because I really can't think of a global way to tell the bot when to stop ^^
If the bot can one day not only know where to end input, but also end early in style, that would be incredible.
Player (207)
Joined: 5/29/2004
Posts: 5712
Derakon wrote:
In many games it may be possible to recognize an end state if the bot encounters a memory state that exactly matches one it has seen before. That should never happen in normal gameplay, since score, lives, timers, etc. would all change. All this should require is for the credits sequence to reach a point at which they either stop or loop. Ideally any in-game timers (e.g. used for RNGs) would also be disabled in the credits, but ultimately the timers would loop anyway, so as long as you're willing to wait it shouldn't matter.
What if it's an ending that eventually takes the game back to the beginning? How would you tell that apart from a standard Game Over?
put yourself in my rocketpack if that poochie is one outrageous dude
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
Derakon wrote:
but ultimately the timers would loop anyway, so as long as you're willing to wait it shouldn't matter.
It depends on the size of those timers. In typical NES games they are probably 16-bit so they loop over pretty quickly. However, if any game in some console used eg. 32-bit timers, and even if it incremented it on each frame (ie. 60 times per second), it would take over 2 years to loop over.
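For reference, the arithmetic can be checked in a few lines, assuming one tick per frame at 60 frames per second (the helper name is just for illustration):

```python
def seconds_to_wrap(bits: int, ticks_per_second: int = 60) -> float:
    """Seconds until an unsigned counter of the given width wraps around."""
    return (2 ** bits) / ticks_per_second


SECONDS_PER_DAY = 86400

# A 16-bit frame counter wraps in minutes; a 32-bit one takes years.
wrap_16_minutes = seconds_to_wrap(16) / 60
wrap_32_years = seconds_to_wrap(32) / SECONDS_PER_DAY / 365
```

That gives roughly 18 minutes for 16 bits and a bit under 2.3 years for 32 bits.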
Skilled player (1746)
Joined: 9/17/2009
Posts: 4988
Location: ̶C̶a̶n̶a̶d̶a̶ "Kanatah"
Warp wrote:
Derakon wrote:
but ultimately the timers would loop anyway, so as long as you're willing to wait it shouldn't matter.
It depends on the size of those timers. In typical NES games they are probably 16-bit so they loop over pretty quickly. However, if any game in some console used eg. 32-bit timers, and even if it incremented it on each frame (ie. 60 times per second), it would take over 2 years to loop over.
According to Wiki, by the 5th generation of consoles, there were already 64-bit consoles. Although I'm not sure why a game would need a timer that runs each frame for 2^64 − 1 integers.
Active player (310)
Joined: 8/21/2012
Posts: 429
Location: France
Bag of Magic Food wrote:
Derakon wrote:
In many games it may be possible to recognize an end state if the bot encounters a memory state that exactly matches one it has seen before. That should never happen in normal gameplay, since score, lives, timers, etc. would all change. All this should require is for the credits sequence to reach a point at which they either stop or loop. Ideally any in-game timers (e.g. used for RNGs) would also be disabled in the credits, but ultimately the timers would loop anyway, so as long as you're willing to wait it shouldn't matter.
What if it's an ending that eventually takes the game back to the beginning? How would you tell that apart from a standard Game Over?
That's one of the things I had in mind. It's really game-dependent. In this case, the bot can't tell the difference between the end and a game over. To add to it, some games just give a game over screen after completing them ^^ (and I bet most of them use their normal "game over routine" at that point, resulting in a reset)
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
I think that it would be acceptable, even for a "generic" bot, if it's given a specific condition that it has to reach on a game-by-game basis. In other words, it's told "in this particular game, when these memory addresses contain these values (iow. a game over state), you are done."
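A minimal sketch of such a per-game condition, assuming RAM is available as a byte string. The addresses and values in `GOAL` are hypothetical placeholders, not taken from any real game:

```python
from typing import Dict


def goal_reached(ram: bytes, goal: Dict[int, int]) -> bool:
    """True when every listed address holds its required value."""
    return all(ram[addr] == value for addr, value in goal.items())


# Hypothetical per-game goal: address -> value that signals "done".
GOAL = {0x10: 0x08, 0x11: 0x01}

ram = bytearray(32)  # toy RAM, all zeroes
before = goal_reached(bytes(ram), GOAL)
ram[0x10] = 0x08
ram[0x11] = 0x01
after = goal_reached(bytes(ram), GOAL)
```

The generic part of the bot stays the same; only the small `GOAL` table would change from game to game.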
Skilled player (1746)
Joined: 9/17/2009
Posts: 4988
Location: ̶C̶a̶n̶a̶d̶a̶ "Kanatah"
Warp wrote:
I think that it would be acceptable, even for a "generic" bot, if it's given a specific condition that it has to reach on a game-by-game basis. In other words, it's told "in this particular game, when these memory addresses contain these values (iow. a game over state), you are done."
I hope this doesn't lead to those arguments similar to the ones in the SMW/Pokemon Yellow/Chrono Trigger/Earthbound/etc movies. :O
Joined: 9/27/2011
Posts: 207
Location: Finland
Be that as it may, that was pretty much how my children will evolve in beating video games.
Limne
Any
Joined: 2/24/2010
Posts: 153
I wonder if the objective function couldn't be refined... For instance, making it so that bytes that tend to decrease in lexicographic ordering are scored well, and not just the ones that tend to increase. I also wonder if the objective function could be refined during actual gameplay. For instance, let's suppose the bot was put in a Skinner box: An observer watches it play and during sequences where the bot does badly, the observer pushes a "punish" button, and when the bot does well, the observer pushes a "reward" button. Based on the frames that are either punished or rewarded, the objective function is reevaluated to score the significance in the change of memory. Ie. if lives going down is repeatedly punished, the weighting of lives as an objective should increase. If killing enemies is rewarded, then the changes in memory that signify this should have their weight towards the objective increased. To compare, the initial conditioning currently used to teach the bot how humans play basically already works like a "reward" segment. I'll admit I didn't fully understand everything talked about in the article, but this is what came to my mind upon reading it...
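To make the idea concrete, here is a toy sketch of one possible update rule. It is an assumption of mine for illustration, not the method from the article: each byte gets a weight, and a reward or punish press nudges the weights of the bytes that changed on that frame.

```python
from typing import List


def update_weights(weights: List[float], before: bytes, after: bytes,
                   signal: int, rate: float = 0.1) -> List[float]:
    """Nudge per-byte weights after a button press.

    signal is +1 for "reward" and -1 for "punish". A positive weight means
    "this byte going up is good"; a negative one means "going down is good".
    """
    new = list(weights)
    for i, (b, a) in enumerate(zip(before, after)):
        if a == b:
            continue  # unchanged bytes tell us nothing about this frame
        direction = 1 if a > b else -1
        new[i] += rate * signal * direction
    return new


def score(weights: List[float], before: bytes, after: bytes) -> float:
    """Weighted sum of byte changes: higher means a better transition."""
    return sum(w * (a - b) for w, b, a in zip(weights, before, after))


# Toy example: byte 0 is "lives", dropping from 3 to 2, and the observer punishes it.
weights = update_weights([0.0, 0.0], b"\x03\x00", b"\x02\x00", signal=-1)
lost_life_score = score(weights, b"\x03\x00", b"\x02\x00")
```

After one punishment the lives byte gets a positive weight, so losing a life scores negatively from then on.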
Joined: 1/26/2009
Posts: 558
Location: Canada - Québec
An interesting experiment that could come out of this, so that bots can try to play a game with a not-so-obvious objective (such as an RPG): play through the game several times as a human to get different input suggestions, based on which values go up and down, and have the bots try to replay the game from those. Then, when the input suggestions conflict, use the Twitch Plays Pokemon anarchy/democracy system to decide the final input for a frame. How do we let the bot choose when to switch between anarchy and democracy mode? Well, just roll a random 50% chance; if it wins, switch modes for 5 minutes of gameplay (or a random range of time, for the sake of it), then roll the 50% again. Sadly, I don't think it would be easy to implement multiple inputs such as "up2right5" in democracy mode for the bots. This might take quite a while even with many human playthroughs, but I think the final result would be better than a "fully random input mode", even though the bots may all lose track of what's going on, so that we have to backtrack to an earlier savestate. edit: typo
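A small sketch of how the per-frame decision could look, assuming each recorded playthrough contributes one suggested input per frame (the function names and input strings are made up for illustration):

```python
import random
from collections import Counter
from typing import List


def pick_input(suggestions: List[str], mode: str, rng: random.Random) -> str:
    """Resolve conflicting suggestions for one frame (list must be non-empty)."""
    if mode == "democracy":
        counts = Counter(suggestions)
        top = max(counts.values())
        # Majority vote; ties go to the suggestion that appears first.
        for s in suggestions:
            if counts[s] == top:
                return s
    # Anarchy: any single suggestion wins at random.
    return rng.choice(suggestions)


def next_mode(mode: str, rng: random.Random) -> str:
    """Roll a 50% chance to switch between the two modes."""
    if rng.random() < 0.5:
        return "anarchy" if mode == "democracy" else "democracy"
    return mode


rng = random.Random(0)
voted = pick_input(["A", "B", "A"], "democracy", rng)
chaotic = pick_input(["A", "B", "A"], "anarchy", rng)
mode = next_mode("democracy", rng)
```

The 5-minute (or random) timer would simply decide how often `next_mode` gets called.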
Joined: 7/2/2007
Posts: 3960
That makes me wonder -- if you recorded multiple playthroughs of a game, could you use a kind of Markov Chain analysis to let a bot play the game? That is, on each frame the bot would examine the state of memory, and then the inputs the player provided on played frames with similar memory states, and would use that to decide what input to provide. Obviously the scope of "examine the state of memory" is pretty vast. I'm not that familiar with Markov Chains, so I don't know if this is remotely feasible. You'd probably need some way to set precedence on certain parts of memory (e.g. when the bot is in area A-4, ignore X/Y position memory from the human playthroughs that aren't from area A-4).
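Something like the following might be a starting point. Note that it is a plain nearest-neighbour lookup over recorded (memory state, input) pairs rather than a proper Markov chain, and the `mask` of "addresses that matter" is an assumed stand-in for the precedence idea:

```python
from collections import Counter
from typing import List, Sequence, Tuple


def distance(a: bytes, b: bytes, mask: Sequence[int]) -> int:
    """Count how many of the masked addresses differ between two states."""
    return sum(1 for i in mask if a[i] != b[i])


def suggest_input(state: bytes, recorded: List[Tuple[bytes, str]],
                  mask: Sequence[int], k: int = 3) -> str:
    """Pick the most common human input among the k most similar recorded states."""
    nearest = sorted(recorded, key=lambda pair: distance(state, pair[0], mask))[:k]
    return Counter(inp for _, inp in nearest).most_common(1)[0][0]


# Toy recordings: byte 0 is an "area" id, byte 1 an X position.
recorded = [
    (b"\x01\x10", "right"),
    (b"\x01\x11", "right"),
    (b"\x02\x50", "jump"),
    (b"\x01\x12", "jump"),
]
choice = suggest_input(b"\x01\x10", recorded, mask=[0, 1])
```

A real game would need a much smarter distance measure than "number of differing bytes", which is where the hard part probably lies.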
Pyrel - an open-source rewrite of the Angband roguelike game in Python.
Lex
Joined: 6/25/2007
Posts: 732
Location: Vancouver, British Columbia, Canada
Optimal programmatic solving of video game movement pathing has been done for Worms Armageddon battle race maps. BRSolver is a program which, partially by mapping the possible movements by trying input, finds the optimal (shortest time taken) path through a battle race map. By doing this, BRSolver discovered the flipwalking glitch and can generate an optimal WA replay file given any map with start and finish coordinate data.
Here is an archive containing a normally very difficult Yoshi's Island themed battle race map by Wyvern and its BRSolver solution which includes:
-the batch file used to run BRSolver which has start and finish coordinates, plus a couple map-specific flags
-an image which traces the path taken by BRSolver through the possible movement space from start to finish
-a human-readable log of the path taken by BRSolver
-a Worms Armageddon replay file
http://lex.clansfx.co.uk/worms/Wyvern_YoshisMercilessIsland_BRSolver.zip
Due to the nature of offline competitions in the Worms Armageddon community, the developers of BRSolver (CyberShadow and Deadcode) have decided against releasing the program itself to the public, as it could possibly be used in some way to fabricate competition-winning replays, which would then have to be policed more closely; a great inconvenience for all involved.
I've posted this as a possible inspiration for other similar programs yet to be written because I believe BRSolver to be a program which produces 100% optimal TASes, something that I have not otherwise seen achieved.
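BRSolver itself is not public, so the following is not its algorithm. It is only a generic breadth-first search over a toy grid, to illustrate what "mapping the possible movement space and taking the shortest path" can mean in the simplest case (grid, moves, and names are all made up):

```python
from collections import deque
from typing import List, Optional, Tuple


def shortest_path_length(grid: List[str], start: Tuple[int, int],
                         goal: Tuple[int, int]) -> Optional[int]:
    """Fewest single-cell moves from start to goal, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        (r, c), steps = queue.popleft()
        if (r, c) == goal:
            return steps
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            # "." is open space, "#" is a wall.
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == "." and (nr, nc) not in visited):
                visited.add((nr, nc))
                queue.append(((nr, nc), steps + 1))
    return None


GRID = [
    "..#.",
    ".##.",
    "....",
]
steps = shortest_path_length(GRID, start=(0, 0), goal=(0, 3))
```

Because every move costs the same, the first time the search reaches the goal is guaranteed to be via a shortest route. A real game has far more state per node (velocity, animation frame, and so on), which is what makes the general problem so much harder.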