Post subject: Tuning an emulator to run a Lua script as fast as possible?
TheCowness
He/Him
Joined: 6/1/2016
Posts: 3
tl;dr: I'm trying to run a Lua script that will (currently) take about seven years to finish. I'm looking for suggestions on how to get it to run faster.
The script I'm running is on a scripted battle in Dragon Warrior III. Everyone muses about whether or not Ortega can defeat King Hydra, but I don't know of anyone having actually witnessed it without editing Hydra's HP to zero. Because of how the RNG works in the game, there are a finite 268,435,456 (65536*256*16) ways the fight can resolve, depending on the initial values of the three variables that affect RNG. I'm trying to test each possibility, but I'm currently only burning through one possibility a second (estimated completion date 2023).
Anyway, I've got the speed on FCEUX turned up to 6400%, video size 1x, "Turbo" checked under "Emulation Speed", and on the "Timing" menu I've checked both boxes. Sound is disabled, and I unchecked everything under Config->Display. I also modified the ROM itself in the hex editor and removed all of the text from the game, as well as a couple of the battle effects. I could do a little more, but I'm reaching the edge of my assembly-debugging abilities, and I doubt removing any more code could double my speed again. My script also forces the fight to the fastest message speed (the fight normally switches you to a slower speed) and mashes A.
I tried running the script on Bizhawk this evening, but it looks like I can't get Bizhawk running nearly as fast as FCEUX. My script is pretty lean; I do have some GUI text I'm displaying every frame, but removing it doesn't seem to help (and I need -something- to display progress, to tell me which year the script will finish in). I'm not really sure what else to try at this point, other than going parallel and running multiple instances of the emulator on multiple machines, or coding my own battle simulator in something like C# and running that instead (which should be doable with the notes I have on the game).
I've also considered seeing if I could modify the emulator itself (to remove...things? Maybe video-stuff? I don't need to see the fight.), since it's open-source. Are there any other settings (or emulators) I should check out? (In case you're curious, I've checked 960k possibilities so far and Ortega has never gotten King Hydra under 77 HP, which he did yesterday around attempt 850k. His old PB was 85 HP which he set around 32k... he's not PBing fast enough to convince me he can win.)
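For a sense of scale, here is a minimal Python sketch of the search space described above. The three radices come straight from the 65536*256*16 figure in the post; the order in which the variables are packed into one index, and the names `decode_state` and `eta_years`, are assumptions made for illustration only.

```python
# Sizes of the three RNG-related variables, taken from the post (65536*256*16).
RADICES = (65536, 256, 16)
TOTAL_STATES = RADICES[0] * RADICES[1] * RADICES[2]  # 268,435,456

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def decode_state(index):
    """Map a flat index in [0, TOTAL_STATES) to one (a, b, c) triple.

    The packing order (a most significant, c least) is an arbitrary
    choice for this sketch, not something the game dictates.
    """
    if not 0 <= index < TOTAL_STATES:
        raise ValueError("index out of range")
    index, c = divmod(index, RADICES[2])
    a, b = divmod(index, RADICES[1])
    return a, b, c


def eta_years(fights_per_second, instances=1):
    """Years needed to test every state at the given per-instance rate."""
    return TOTAL_STATES / (fights_per_second * instances) / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(TOTAL_STATES)
    print(decode_state(TOTAL_STATES - 1))
    print(round(eta_years(1.0), 1))
```

At exactly one fight per second this works out to roughly eight and a half years, so a seven-year estimate implies a rate slightly better than one per second.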
Emulator Coder, Site Developer, Former player
Joined: 11/6/2004
Posts: 833
Lua itself isn't all that CPU-hungry. Let's start with the obvious. How many frames is the time window that needs testing? Does it take a minute of in-game play to do the testing? Or alternatively, can you do the testing in another way? Can you just run the RNG and simulate the outputs yourself without needing the NES emulator in the way?
TheCowness
He/Him
Joined: 6/1/2016
Posts: 3
The fight is pretty long, but the changes I made to the ROM help. After the changes and manually setting the message speed, it's still about half a minute long in-game time. If I turn on the FPS counter, it looks like it's hanging at about 1500 FPS (25x speed) and each fight takes a second on average, so it's about 1500 frames long, give or take. There are some really detailed notes on how the RNG works in the entry for Vaxherd's TAS that was released earlier this year, so I can reproduce the random-generator easily. Reproducing the entire battle is much more complicated, but with his notes it should be doable. I had started programming such a simulator last week, but I only got past the step that determines their starting HP (enemies in DW3 spawn with 75-100% of their max HP). My plan is to follow through on that if I can't find a solution that requires much less effort.
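The throughput numbers above can be checked with a few lines of Python. This is only arithmetic on the figures quoted in the post; the 60 frames-per-second base rate is a rounded assumption (it is what makes 1500 FPS equal 25x), and the function names are made up for this sketch.

```python
NES_FPS = 60  # nominal frame rate, rounded; assumption for this sketch


def speed_multiplier(emulated_fps):
    """How many times faster than real time the emulator is running."""
    return emulated_fps / NES_FPS


def fights_per_second(emulated_fps, frames_per_fight):
    """Fights completed per wall-clock second."""
    return emulated_fps / frames_per_fight


def frames_for(seconds_in_game):
    """Frames needed to cover a stretch of in-game time."""
    return seconds_in_game * NES_FPS


if __name__ == "__main__":
    print(speed_multiplier(1500))
    print(fights_per_second(1500, 1500))
    print(frames_for(30))
```

A full half minute of in-game time would be 1800 frames, so a 1500-frame fight is a little under that, which matches "about half a minute".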
Emulator Coder, Site Developer, Former player
Joined: 11/6/2004
Posts: 833
The only other suggestion I can give is parallel execution: run several instances of the emulator, each on a different CPU core. Two instances cut the work time in half. It helps, but you're still looking at years of work.
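One way to divide the work between instances is to give each a contiguous slice of the 268,435,456 starting states. The following is a sketch under that assumption; `split_range` is a hypothetical helper, and each emulator instance would still need its own script that loops over the slice it was handed.

```python
def split_range(total, workers):
    """Split [0, total) into `workers` contiguous (start, stop) slices.

    Slice sizes differ by at most one, and stop is exclusive.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    base, extra = divmod(total, workers)
    ranges = []
    start = 0
    for w in range(workers):
        # The first `extra` workers take one additional state each.
        size = base + (1 if w < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    for start, stop in split_range(65536 * 256 * 16, 2):
        print(start, stop)
```

Contiguous slices make it easy to resume after a crash, since each instance only has to remember the last index it finished.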
Pokota
He/Him
Joined: 2/5/2014
Posts: 779
You might be surprised at result 123,456,789. How much of a CPU load does FCEUX take up when you're running your bot?
Adventures in Lua When did I get a vest?
Site Admin, Skilled player (1257)
Joined: 4/17/2010
Posts: 11541
Location: Lake Char­gogg­a­gogg­man­chaugg­a­gogg­chau­bun­a­gung­a­maugg
That's right, you can either hack off everything you don't need from FCEUX, or write your own simulator of the fight. Alternatively (an idea AnS posted a few years ago), put your bot right inside FCEUX! Actual Lua code -> C implementation -> communication to the emu: this pipeline doesn't have as much overhead as, say, Lua-to-C# translation, but it still has some. And yeah, go through all the functions it will run during your task and kill the extra overhead.
Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.
Joined: 7/2/2007
Posts: 3960
It sounds like your overhead is most likely in simply running the emulator as fast as possible, which is not something you're likely to be able to improve very much except by getting faster hardware to run it on. Thus, I'd suggest going the simulator route instead. A simulator should be able to run thousands of battles in a second with ease, since it only needs to care about what "actually happens" in the fight (i.e. resolving attacks, HP, etc.) as opposed to deciding what to draw, what sound effects to play, and all the other stuff that an emulator has to do even if you've muted the computer and turned the display off.
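As a rough illustration of the simulator route, the outer search loop could look like the Python sketch below. Everything about the fight itself is left out: `simulate_fight` is a hypothetical function that would have to be written from the RNG and battle notes, and `placeholder_fight` is a stand-in that has nothing to do with the game's real logic. The loop only shows how to track the lowest HP King Hydra ever reaches and stop on a win.

```python
def search(indices, simulate_fight):
    """Run simulate_fight(index) for each index and track the best result.

    simulate_fight is hypothetical: it should return the lowest HP King
    Hydra reached in that fight, with 0 or less meaning Ortega won.
    Returns (best_index, best_hp), or (None, None) if indices is empty.
    """
    best_index = None
    best_hp = None
    for index in indices:
        hp = simulate_fight(index)
        if best_hp is None or hp < best_hp:
            best_index, best_hp = index, hp
            if hp <= 0:
                break  # Ortega won; no need to keep searching
    return best_index, best_hp


def placeholder_fight(index):
    """Stand-in only. NOT the game's battle logic."""
    return 100 - index % 20


if __name__ == "__main__":
    print(search(range(100), placeholder_fight))
```

Because the loop touches nothing but integers, the cost per fight is whatever `simulate_fight` costs, with no video, sound, or CPU emulation in the way.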
Pyrel - an open-source rewrite of the Angband roguelike game in Python.
TheCowness
He/Him
Joined: 6/1/2016
Posts: 3
It hovers around 50% CPU usage. It's on a machine that's only got a dual-core processor, so it's using an entire core. I have other computers I could get involved with this, but I don't know if I want them all running non-stop for a full year. Someone (jokingly) suggested to me setting up a bunch of VMs on AWS last week, but I'd rather not spend more than the cost of electricity. It's sounding like my best bet is just to program a custom battle simulator, so I'll keep that as Plan A. I just figured I'd gone far enough down this path that I should try reaching out to people who know more about using emulators to do more than just play the games.
Site Admin, Skilled player (1257)
Joined: 4/17/2010
Posts: 11541
Location: Lake Char­gogg­a­gogg­man­chaugg­a­gogg­chau­bun­a­gung­a­maugg
You can also statically recompile your game to a native binary, or only its essential parts. http://andrewkelley.me/post/jamulator.html