The thing that gives some hope is that, for practical purposes, you don't need THE optimal input that completes the game. You only need one that's good enough. So, you might come up with some models and approximations that might generate a good enough TAS much more efficiently than a human.
As an example, Deep Blue, that beat Kasparov in chess, was a supercomputer. Nowadays even smartphones, that process much fewer positions, can play better than it. The difference is that engines today are much better at selecting positions that matter and discarding useless ones, and so are much stronger.
Incidentally, that overlaps with a bit of my work. There are many physical models that don't have analytical solutions and you have to use computers to approximate the solution. Because these theories always involve some constants that you determine experimentally, there's some uncertainty in the theoretical prediction, and it doesn't make much sense to run the simulation at a higher precision than these values are known.
Many people confuse this and believe that a simulation that came out of a supercomputer is always better, because it calculates more. In practice these simulations use very general numerical algorithms (because they have to work for lots of systems) that converge extremely slowly. Sometimes they even have to make ugly approximations to make the method eventually stop, and the value they converge to is not really the theoretical prediction. Depending on the system you are working with, if you choose a model with care, you can get a more accurate value even with pencil and paper than a general numerical package running on a supercomputer.
Joined: 3/9/2004
Posts: 4588
Location: In his lab studying psychology to find new ways to torture TASers and forumers
Ironically, MrWint's approach actually uses less simulation.
Any emulator you use must simulate the hardware of the NES, which will automatically add considerable overhead. MrWint's approach ports the key aspects of the game that he knows about to a computer language and runs more natively on the system than a simulated NES. This is a tremendous savings onto itself.
If MrWint would add to his software graphics and sound processing, along with a few other missing pieces, you'd actually have a fairly faithful port of SMB to your computer.
Essentially, MrWint's approach removes all the emulator overhead, plus unnecessary overhead the game has in order to make it presentable to a human and fun to play. However, that graphics, sound and other unnecessary stuff you point out still outweighs the overhead added by just being an emulator in the first place.
Warning: Opinions expressed by Nach or others in this post do not necessarily reflect the views, opinions, or position of Nach himself on the matter(s) being discussed therein.
And if you try all the possible permutations in the example I gave, it would take at least 77 years to go through. Conclusion: It's impossible to build a comprehensive list of words in a feasible time?
I think you are missing my point.
Difference between chess and most of games: chess has well defined simple rules.
Your example completely irrelevant.
In SMB / Sonic for example, no such thing as dictionary.
You can finish game in infinite amount of ways.
If you define that list as all inputs with minimum time to finish...
Warp wrote:
The trick is that you don't need to go through all the possible permutations of those 20 letters. When creating permutations, when you have taken two letters, you can check if any words start with them, and if they don't, you can skip all permutations that begin with those two letters. (The same goes of course for permutations of the first three letters, and so on.) This reduces the number of permutations to check to a microscopic fraction of the total.
Then you can't say for any prefix: can you finish game in minimum time or not?
If you would know this, you'll be able to make shortest input movie.
But this is because you know something, which is called "knowledge".
And this case perfectly fits my explanations.
Your example completely irrelevant.
In SMB / Sonic for example, no such thing as dictionary.
If you aren't willing to even try to understand what I'm saying, then fine.
No such thing as dictionary... sheesh. Way to miss the point by a country mile.
Sure, you can compare every memory state you get to all possible memory states, and disregard all the new input sequences that lead to some previously found memory state. But that doesn't tell anything about which memory state leads to absolute shortest movie. It's just not enough to check if the "word" you have is in the "dictionary" or not.
For example, let's say I have a savestate on frame 1000, I got there by executing certain inputs. It may be possible to reach this state sooner in the movie than in 1000 frames. Without testing all the possibilities up to that frame, I won't be able to prove that 1000 is the optimal frame for this state.
This is down to search algorithms. There are some search orders you can use that are guaranteed to find the shortest output first; the simplest is to try all the possible inputs in length order (except for ones which have a prefix that turned out to be slower than another way of doing the same thing). I think there may be games where this is simple enough that it's actually within the capabilities of a powerful modern computer.
I have some more complex algorithms in mind that would have the same properties but be more efficient. The problem is just finding time to getting around to implementing them.
Joined: 4/17/2010
Posts: 11475
Location: Lake Chargoggagoggmanchauggagoggchaubunagungamaugg
ais523 wrote:
This is down to search algorithms. There are some search orders you can use that are guaranteed to find the shortest output first; the simplest is to try all the possible inputs in length order (except for ones which have a prefix that turned out to be slower than another way of doing the same thing). I think there may be games where this is simple enough that it's actually within the capabilities of a powerful modern computer.
I have some more complex algorithms in mind that would have the same properties but be more efficient. The problem is just finding time to getting around to implementing them.
Yes, I understand that state that's been achieved previously in longer time can be disregarded, along with its input, if shorter input reaching it is found. And indeed I see that it helps to avoid truly exhaustive search. I just want to understand problems that raise after this method is applied, and how hard/easy they can be solved. Can you be more specific on what such an algo would have to do?
Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.
OK, so the basic idea is this: we want to find the shortest input file that completes the game, and in theory you could do that via trying all possible input files in length order until one of them completes it, but that's too slow. So we want to find some other algorithm that has the same effect as that one.
The most basic thing we can do here to speed things up is called "pruning": identifying input sequences that can't possibly be shortest (or at least, if they are shortest, must be tied for shortest with a sequence that hasn't been pruned), and not simulating them. For example, if we discover that holding down the Select button on the first frame of the game has no influence on the game's memory, we can prune all the sequences which hold Select on frame 1, as they'll necessarily tie with the equivalent sequence that doesn't use Select. This can be very easy to do in the simplest cases (e.g. via using a memory comparison of all of memory).
There are more complex examples of pruning, though. For example, if the memory states differ but the addresses that differ will necessarily be overwritten before they're read (say some memory address holds "buttons held on the previous frame" but it's only used in certain circumstances that don't apply here), or the memory states differ but can't affect the time at which the game is completed (as determined by, e.g., static analysis; this is very hard to do in general because almost anything can affect lag properties).
A second possible optimisation is reordering: changing the order in which we look at input sequences to look at more fruitful sequences first, in the hope that we'll find the optimal sequence earlier. The problem with reordering is in proving that the sequences that you've deferred until later aren't going to lead to a shorter outcome than the final game-completing sequence you end up with; so you'd need to use something like automated static analysis to produce hard constraints on what's possible in the game that a reordering algorithm could use. (However, there are algorithms like A* that can be used to produce a good reordering given pretty much any assumption about how the game can act, and are known to find the optimal answer first if the assumption is correct. So the only difficulty here, but it's a pretty big one, is to ensure that the assumption you've made is one that's valid in the game, which would have to be done via a combination of searching for any codepaths that violate it, plus a proof that ACE does not exist in the game. The latter isn't as ridiculous as it sounds; it's an important security property to have so tools for proving that already exist for a range of languages and platforms. If we had one for a common TAS platform, it'd both be helpful in finding optimal runs, and in finding new ACE tricks.)
Reordering can also be used to get results that are intermediate points towards having a provably perfect run. For example, suppose that instead of aiming for a minimum number of frames of input, we aim for a minimum number of non-lag frames between the start and end of the game, and try the input files in an order that provably optimises for that. This will likely be much easier to prove optimal (under the non-lag metric) than proving an input file optimal under the normal timing would be, as you can prune it much more aggressively. Once you're done, you have a proven fact about the game that you can use to guide future TASing; for example, if you know that the game takes a minimum of 19425 non-lag frames to beat, and you then produce a TAS that beats the game in 19425 frames total, you'll know that your TAS is optimal (and, as it happens, lag-free).
Finally, there's the concept of parallel simulation: you have two gamestates that genuinely are different, but which are similar enough that you can emulate them at the same time rather than having to emulate them separately. The simplest example of this is to imagine two input sequences that lead to memory states which are almost but not quite the same, and the memory addresses that differ are not read on the next frame (but may remain different at the end of the frame); we can determine this with registerread or the like. In this case, simulating one input sequence has given us the result of the other input sequence for free. The main disadvantage of the simple case here is that we'd need to somehow identify what input sequences can be grouped and how to make use of that for our big simulation. You can get more complex with this, e.g. tracking dependencies that are more complicated than an untouched address, or caching the emulation of blocks of code at a subframe level, but at that point you pretty much have to write the emulator from scratch to support that sort of operation.
For example, if we discover that holding down the Select button on the first frame of the game has no influence on the game's memory, we can prune all the sequences which hold Select on frame 1, as they'll necessarily tie with the equivalent sequence that doesn't use Select.
ais523 wrote:
There are more complex examples of pruning, though. For example, if the memory states differ but the addresses that differ will necessarily be overwritten before they're read (say some memory address holds "buttons held on the previous frame" but it's only used in certain circumstances that don't apply here)
Most of games are using input buttons data as:
1) read current joypad state
2) read from memory previous holded keys
3) compare them to find all new pressed buttons (key Down event)
4) store key down into dedicated variable
5) store current joypad state as "previous holded keys"
So: it's readed & overwritten straightforward.
Also, key down depends on previous frame.
So, character state should also have previous input in it.
Because, it matters.
If everything else is same, but previous input isn't, then it's different state,
because in one of them you can jump on right next frame (for example),
and in other state you have to wait one frame to release jump button, and press it back.
This potentially may lead to 16 "duplicates" of same state in some sense.
Where 16 is all different variants to hold Left Right Jump Action (some).
ais523 wrote:
However, there are algorithms like A* that can be used to produce a good reordering given pretty much any assumption about how the game can act, and are known to find the optimal answer first if the assumption is correct.
I want to dwell on this topic more.
First, few properties of A*:
1) It always finds an optimal path if heuristic is suitable.
2) This optimal path is always first that is reaching goal.
3) It postpond many shorter paths to later moments, and looking among longer paths earlier.
4) One of crucial property of space for A*: point in space never "changes".
To understand idea of A* you may imagine all possible paths to all points in space where we looking for path from start to goal.
All routes from this set is starting in "start" position, but it may end in any point.
Now, imagine you did sort all routes by its cost.
In our case, cost is time.
Then, if you try all routes in this order, you'll end up with eventually stumbing accross lowest cost path to the goal, which is same as shortest in time.
Dijkstra's algorithm is essentially same algorithm but with very optimized method for iterating over paths in such order (by non-decreasing cost).
A* adds following trick.
Imagine that for any point in space p, you know that you need at least h(p) cost (time).
This h is called "heuristic".
Now, recall why we are sorting paths by non-decreasing cost?
We do it because we want to try all paths with lowest cost (time),
then all paths with a bit bigger cost (time), and so on, in exhaustive manner.
All longer paths is some shorter path with additional step.
If we don't reach goal after walking this path, then we need some additional steps to reach the goal.
Thus, for each shorter path we can estimate how many additional steps we need, to reach our goal.
If we make our estimation that it will never be more optimistic than real situation, then h(p) where p is end point of path - is always less than real cost to travel from p to goal.
And cost of path with best addition is always greater than cost(path) + h(p).
Now, because it's always less than real best path, you can reorder paths by cost(path)+ h(p) instead.
This will scan optimistic paths first. But if it doesn't find solution earlier, it will anyway try all possible paths.
Here some comparison: https://www.youtube.com/watch?v=cSxnOm5aceA
2:01 such case where A* try almost all possible paths.
Other case is if you make vertical wall from top to bottom with only one hole in bottom. A* will try many cases.
This was explanation why (1), (2) is true. Also, it shows why and how (3) happen.
(4) is seen on picture itself. It seen that none of dots is filled twice.
For example. when it checks some cell in opposite direction from goal, it doesn't start to recheck all cells towards to the goal.
But in many games you may find advantage in moving backwards for some frames.
So, you have to recheck cell.
This is because cell is not state. State also has enemies and so on.
This means, that A* implementation for game, with following "map" of obstacles:
Here: black - walls, green - "first" chunk of search, and red - "second" chunks of search.
Because space points are changing (enemies, rng, and so on), after each advance in green chunk, you'll redo part of red chunk.
Remember, red part will be fully only at very end, when it'll try best path itself.
(confirmation: https://imgur.com/a/ou3RTVi)
Conclusion: with A* there are three problems:
1) states (points in space) is not consistent, you have to recheck them when you make change in earlier frames.
2) heuristic is not known, and always under suspicion
3) branching factor is big. (variety of inputs)
It's not full list.
Even if you make very good heuristic, these "rechecking" and branching factors will doom your performance.
Regarding other topics:
Good luck with proving that some game has no ACE or zips.
Lag frames: substituting one goal by another is good idea. Also you may change goal from shortest input, to shortest non-zip / non-ACE input ;)
ais523 wrote:
Finally, there's the concept of parallel simulation: you have two gamestates that genuinely are different, but which are similar enough that you can emulate them at the same time rather than having to emulate them separately.
Idk how you would do that. Are you planning to make emu with such feature? :)
Joined: 12/2/2011
Posts: 129
Location: Moscow, Russia
r57shell wrote:
1) states (points in space) is not consistent, you have to recheck them when you make change in earlier frames.
States don't have to be just positions. They can (and must in this case) include other factors that contribute to their inconsistency. In the most general case a state will contain the entire memory snapshot of the game.
It might be easier to think of this as of a graph where the nodes are consistent states, and the nodes are connected if, for example, there exists a combination of button presses that would lead from one node to another over one frame.
Of course for the search to finish quickly enough you must identify the subset of parameters that contribute to a state that's as minimal as possible, and find a heuristic function that's as close as possible to the real cost function (which is unknown). If you make a mistake and have inconsistent states or an over-estimating heuristic, well, the search will still work, however the "first found solution is the optimal one" guarantee will be lost.
UPDATE: to illustrate the point, suppose that to narrow down the search space you determined the number of lives of a character does not contribute to the state inconsistency, or, in other words, doesn't matter for the purposes of the search you're doing. Suppose you ran an A* search and got some path as the result. If the number of lives of a character indeed does not matter, this path that you've got would be the fastest possible. However if it turns out you can lose a life but complete the level faster as the result, the path found with A* wouldn't be the fastest. It's also possible for A* to try the combination of button presses that leads to a life loss during its search and find the fastest path anyway, but it's not guaranteed if the number of lives isn't included in the state.