Post subject: Multithread Lua
Player (71)
Joined: 8/24/2004
Posts: 2562
Location: Sweden
I was just wondering if it would be possible to use multithreading when it comes to Lua in our weapon of choice (i.e. emulators), or is there some limits when it comes to the emulator it self? Does the emulator it self need to have the capability of multithreading? I could guess that this would be beneficial when it comes to some bot stuff in the emulators as we can aggregate all the cores in the CPU. Found some info on the matter here, but I don't know what to make out of it. I just found it to be interesting and thought I'd share it with you guys. Cheers.
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
Without any experience on Lua in emulators, I'd nevertheless guess that you could use multithreading between advancing frames, but you'd have to sync or end the multiple threads before advancing to the next frame because the emulator is not running multiple instances of itself in parallel. In other words, you could not, for example, try several key combinations in parallel. You could only make some calculations between frames in parallel, but no more.
Player (71)
Joined: 8/24/2004
Posts: 2562
Location: Sweden
Ok. So would it be smarter to have two or more instances of the emulator running, with a different random seed pattern (as two linear patterns would perform the same thing with the same result in both instances) for the bots, or would it just slow down the entire process all in all?
Post subject: Re: Multithread Lua
Emulator Coder, Skilled player (1114)
Joined: 5/1/2010
Posts: 1217
Highness wrote:
I was just wondering if it would be possible to use multithreading when it comes to Lua in our weapon of choice (i.e. emulators), or is there some limits when it comes to the emulator it self? Does the emulator it self need to have the capability of multithreading? I could guess that this would be beneficial when it comes to some bot stuff in the emulators as we can aggregate all the cores in the CPU.
Lua isn't designed to be able to execute multiple Lua functions at the same time in the same context. So that means that Lua code can only use one core. Still, one could run the lua code in its own thread (JPC-RR does this), so other emulation wouldn't eat CPU from Lua code. Unfortunately, this involves multithreading, which is rather complicated (the Lua VM in JPC-RR often locks up the emulator due to race conditions).
Skilled player (1652)
Joined: 11/15/2004
Posts: 2202
Location: Killjoy
Highness wrote:
Ok. So would it be smarter to have two or more instances of the emulator running, with a different random seed pattern (as two linear patterns would perform the same thing with the same result in both instances) for the bots, or would it just slow down the entire process all in all?
Even if it were lua in one thread, and the frame advance in another, they still need to happen sequentially to sync, as Warp said. Thus, you'd be much better off running two or more instances of the emulator. So, I looked up the random seeding function of lua, and got this ridiculous wall of text.
math.random , math.randomseed
math.random() generates pseudo-random numbers uniformly distributed. Supplying argument alters its behaviour:

    * math.random() with no arguments generates a real number between 0 and 1.
    * math.random(upper) generates integer numbers between 1 and upper.
    * math.random(lower, upper) generates integer numbers between lower and upper. 

    > = math.random()
    0.0012512588885159
    > = math.random()
    0.56358531449324
    > = math.random(100)
    20
    > = math.random(100)
    81
    > = math.random(70,80)
    76
    > = math.random(70,80)
    75

upper and lower must be integer. In other case Lua casts upper into an integer, sometimes giving math.floor(upper) and others math.ceil(upper), with unexpected results (the same for lower).

The math.randomseed() function sets a seed for the pseudo-random generator: Equal seeds produce equal sequences of numbers.

    > math.randomseed(1234)
    > = math.random(), math.random(), math.random()
    0.12414929654836        0.0065004425183874      0.3894466994232
    > math.randomseed(1234)
    > = math.random(), math.random(), math.random()
    0.12414929654836        0.0065004425183874      0.3894466994232

A good* 'seed' is os.time(), but wait a second before calling the function to obtain another sequence! To get nice random numbers use:

    math.randomseed( os.time() )

If Lua could get milliseconds from os.time() the init could be better done.

Nevertheless, in some cases we need a controlled sequence, like the obtained with a known seed.

But beware! The first random number you get is not really 'randomized' (at least in Windows 2K and OS X). To get better pseudo-random number just pop some random number before using them for real:

    -- Initialize the pseudo random number generator
    math.randomseed( os.time() )
    math.random(); math.random(); math.random()
    -- done. :-)

-- This not exactly true. The first random number is as good (or bad) as the second one and the others. The goodness of the generator depends on other things. To improve somewhat the built-in generator we can use a table in the form:

    -- improving the built-in pseudorandom generator
    do
       local oldrandom = math.random
       local randomtable
       math.random = function ()
          if randomtable == nil then
             randomtable = {}
             for i = 1, 97 do
                randomtable[i] = oldrandom()
             end
          end
          local x = oldrandom()
          local i = 1 + math.floor(97*x)
          x, randomtable[i] = randomtable[i], x
          return x
       end
    end

[4] : Why math.random() might give weird results on OSX and FreeBSD?

*...The problem seems to be that when the seeds differ very little the first value of generated by BSD rand() also differ very little. This difference is lost when Lua converts the integer returned by rand() into a real number, effectively preserving only the high bits in the result. When you call math.random(1,100) from Lua, the low-bit difference vanishes and you see the same integer result.

    -- improve seeding on these platforms by throwing away the high part of time, 
    -- then reversing the digits so the least significant part makes the biggest change
    -- NOTE this should not be considered a replacement for using a stronger random function
    -- ~ferrix
    math.randomseed( tonumber(tostring(os.time()):reverse():sub(1,6)) )

There is also lrandom[5] A library for generating random numbers based on the Mersenne Twister. 
Some people are so anal about their random numbers. I mean really, you never know if it is actually random!
Sage advice from a friend of Jim: So put your tinfoil hat back in the closet, open your eyes to the truth, and realize that the government is in fact causing austismal cancer with it's 9/11 fluoride vaccinations of your water supply.
Editor, Skilled player (1203)
Joined: 9/27/2008
Posts: 1085
As far as botting goes, you need a bare minimum of lua code that uses the same CPU as the primary emulator. Enough to know that, yes, the emulator is advancing a frame now. This code could handle strictly what is necessary -- Allow emulator to frame advance, read from memory, apply joypad buttons, whatever. The rest of the code, like the calculations or comparisons using that memory, or deciding a new list of joypad stuff, or a message to load the previous state, can be handled by the secondary code which you multithreaded onto another CPU. But that's my initial thought, anyway. I'm not sure how significant the benefit would be (simple bots won't have much lua-side code, moving this tiny code elsewhere won't improve performance significantly). If the emulator itself is already capable of multithreading (without desync issues), it's probably already making use of all the CPUs, and sticking the lua anywhere would only take from the emulator's processing. Still... Might be good to keep this in mind in case you have a long-term bot running pretty complex code. Smaller applications probably won't get a lot of use out of this, though. Another tool to have around, just in case we need the processing power.
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
Highness wrote:
Ok. So would it be smarter to have two or more instances of the emulator running, with a different random seed pattern (as two linear patterns would perform the same thing with the same result in both instances) for the bots, or would it just slow down the entire process all in all?
If you have a multi-core CPU, then running several instances of the emulator, each running a lua script with different parameters (eg. searching for different paths) would obviously make testing all those paths faster, as each core gets to perform one of the searches. However, I'd guess it would be difficult to set up this system so that the separate emulator instances could communicate with each other (so that the optimal found route would win and end up in the final result, or for one of the emulators to tell the others "I found it, you can stop now"). If different emulator instances find different segments of the optimal path, it may be extremely difficult to merge them into one movie file. It could have some limited use, though. For example, if you are making an exhaustive search for an optimal enemy item drop, you could run the search in parallel like this and when you decide that that the shortest route found so far is short enough, see what keys were used and put them in your actual movie file, and then just continue TASing as normal.
Player (71)
Joined: 8/24/2004
Posts: 2562
Location: Sweden
Thanks for all the answers. Really interesting. Not sure if I'll ever be able to write my own bot, since it's a common fact that I'm on the top ten list over worst programmers ever. Heh.. Not even that possibly since I've never stated that I am a programmer at all, but anyway. If I was a programmer today, I would be on the top ten over worst programmers. :D I was probably going to be quite blunt on how to make the emulators talk, or not talk to each other. It never really crossed my mind that it could be beneficial. One way could be to let each of the emulators output the complete GMV file, and name it with a seed number, and have a tag that states "bad" or "good" along with the ammount of frames. For example "My_Game_(Seed_12345)_(FrameCount)_BAD.gmv". The BAD or good tag would be determined from a static value. For example. If you have beat a boss in 300 frames, you could check wether or not your GMV plays longer or shorter than 300 frames, and by that tag the GMV as good or bad. (Didn't Bisqwit make a variation of this for Rockman 2 some years back, or did I dream that?) Another way could be to let the Lua script read and write from/to a file which each emulator instance share so it knows what seeds to skip so that the emulators never run the same pattern twice. Hehe... Would be cool if we had cloud TASing. Distributed among the clients. Like SETI@Home and equivalents. Leave your emulator idle over night to help others crunch their TASes. In the best of worlds only I suppose. :)
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
Highness wrote:
Another way could be to let the Lua script read and write from/to a file which each emulator instance share so it knows what seeds to skip so that the emulators never run the same pattern twice.
You'll probably run into mutual exclusion problems, unless lua supports file locking (so that when one process is writing to the file, the others are blocked out until it ends writing, so that none of the other processes would read incomplete or not-up-to-date data). If you can lock the file while reading/updating it, then it might be a viable solution, unless you need to do it very often in a very short period of time (in which case it would become a big overhead).
Former player
Joined: 2/19/2007
Posts: 424
Location: UK
The standard way of doing this would be by using MPI (message passing interface). There you basically have N copies of the same program running, and these can call MPI functions to find out how many siblings they have and which ID each has. They can also communicate by using MPI functions, both locally on the computer and transparently over the network. So if you had a supercomputer cluster available you could run thousands of emulators in parallel, with each exploring its part of parameter space, and have the winner broadcast a message to have the others stop. It is pretty simple to work with, but you would need to have MPI installed (OpenMPI for example), and would need to recompile the emulator and expose these functions to lua. So not totally trivial. But this is what we use in my field to solve problems with are intractable on a single computer. But even an increase in processing power of a factor of 1000 will not really help you get much further with a brute force search, so I am not sure going through all this effort would really be worth it. Improving your searching algorithm will probably help more.
Joined: 1/5/2012
Posts: 52
Location: Maridia
The only feasibility I can see here is if you have multiple cores and your bot needs to do a lot of number crunching on several different inputs to compute the best possible action - in this case you could spawn several threads to do the math using all available cores. Trying to get multiple threads to interface directly with the emulator - let alone while it's running - would just be hell.