Back in the day when I first got the idea for the Movie Statistics page, the vast majority of submissions had an actual accurate rerecord count.
Nowadays, however, all the statistics about rerecords are useless. There are many submissions where rerecords do not tell anything useful because 99.9% of them are bot-generated, and in others the actual rerecord count has been either lost (often several times during the creation of the run) or, in a few cases, deliberately tampered with.
It would be nice if the movie files were somehow able to keep track of bot-generated rerecords and human-generated rerecords separately, but that's not the case (and most probably it's never going to happen.) Those could be interesting and could be used for useful statistics, but as it is now, the rerecord stats are mostly just garbage.
So what to do with the statistics related to rerecords? As much as it saddens me, I think I'll just have to remove them from the statistics page. As said, they serve little useful purpose anymore. However, I'm open to suggestions. (Note that currently I only have editor rights, though.)
(This isn't unprecedented. Back in the day, download statistics were actually accurate and interesting. However, after repeated tracker resets and the advent of the YouTube and archive.org uploads, they became completely non-descriptive, so I removed them. Their individual pages can still be accessed, but they aren't on the main statistics page anymore.)
Joined: 8/14/2009
Posts: 4089
Location: The Netherlands
There are only a handful of instances where bots are messing with the rerecord counts. Usually, movies with bot rerecords get their rerecord counts wiped. There are just a few cases where this hasn't happened.
Also, I think it's kind of egoistical to say "I don't like these statistics, so I'm going to remove them". I, for one, still find the rerecord statistics quite interesting, regardless of the single movie using bots that skews the rerecord statistics a slight bit. (Which, yeah, does have to be corrected.)
http://www.youtube.com/Noxxa
<dwangoAC> This is a TAS (...). Not suitable for all audiences. May cause undesirable side-effects. May contain emulator abuse. Emulator may be abusive. This product contains glitches known to the state of California to cause egg defects.
<Masterjun> I'm just a guy arranging bits in a sequence which could potentially amuse other people looking at these bits
<adelikat> In Oregon Trail, I sacrificed my own family to save time. In Star trek, I killed helpless comrades in escape pods to save time. Here, I kill my allies to save time. I think I need help.
A single movie? I have a hard time believing someone would rerecord over 2 million times by hand. Or even 860 thousand times. (If you rerecorded once per second, you would have to do that continuously 24/7 for about 10 days in order to get 860 thousand rerecords.)
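The back-of-the-envelope figure above can be checked with a minimal sketch. The one-rerecord-per-second rate is the assumption stated in the post; the function name is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def days_of_nonstop_rerecording(rerecords: int, per_second: float = 1.0) -> float:
    """Days needed to reach `rerecords` at a constant rate with no breaks."""
    return rerecords / per_second / SECONDS_PER_DAY


# The two counts mentioned in the post.
for count in (860_000, 2_000_000):
    days = days_of_nonstop_rerecording(count)
    print(f"{count:>9,} rerecords: {days:.1f} days of nonstop rerecording")
```

At that rate, 860 thousand rerecords come to just under 10 days, and 2 million to a little over 23 days.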
If those millions of rerecords are indeed all made by hand, then I stand corrected, and impressed.
On the other extreme we have 6 rerecords for a 30-minute run. Sorry if I'm a bit skeptical about this being the actual count.
Joined: 8/14/2009
Posts: 4089
Location: The Netherlands
I find your assumption that these runs used bots quite funny, because neither run has anything to indicate such.
The first run is SM64 120 stars - a notable team-up attempt between a large group of authors. Note the fact that the run has 15 authors. And because of the nature of SM64, it is possible for multiple authors to concurrently work on different stars, and splice the result together later. This, combined with the very high optimization level of SM64, can indeed pump up the rerecord count very easily. And no bots were used, because in SM64, there's hardly anything to bot for.
The second run is Yoshi's Island 100%, which is simply a continuous effort lasting about three and a half years, with three skilled and experienced TASers investing a lot of time in it during that timeframe. The end result is a highly optimized run that's nearly two hours long. It also had to redo many parts when improvements were found. Again, no botting because there isn't anything significant to bot for.
If you're going to challenge the validity of rerecord counts, at least make sure to do any sort of research about the runs in question. Reading a submission text isn't very hard. Reading the title and counting the authors is even less hard.
Not sure what's up with that, but still, it doesn't imply anywhere bots were used. That's probably one where rerecord count should be cleared, though. That's still less drastic than tossing the entire list of statistics away.
Joined: 4/17/2010
Posts: 11475
Location: Lake Chargoggagoggmanchauggagoggchaubunagungamaugg
How many rerecords must a run have for Warp to consider it botted?
Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.
My original post was not only about bots. It was also about lost and tampered rerecord counts.
6 rerecords for a 30-minute movie sounds to me like the actual count has been lost along the way. The GoldenEye run having exactly 400000 rerecords sounds to me like it has been tampered with. (It would require quite strong evidence to convince me that the author just happened to finish with exactly that many rerecords.)
I remember one case where the author directly stated in the submission text that he changed the rerecord count because of personal amusement or something.
So it does happen.
Joined: 4/17/2010
Posts: 11475
Location: Lake Chargoggagoggmanchauggagoggchaubunagungamaugg
I got zero rerecords when TASing in TASEdit and set the count to 88800 before submitting, so what? Anyone can mess with it to make it look trustworthy. No statistic is 100% accurate, which still doesn't hurt your ability to get a valuable impression of the subject.
Joined: 8/14/2009
Posts: 4089
Location: The Netherlands
Warp wrote:
My original post was not only about bots. It was also about lost and tampered rerecord counts.
True enough. It seemed like the primary point of the thread was bot rerecords, though.
Warp wrote:
6 rerecords for a 30-minute movie sounds to me like the actual count has been lost along the way.
Yep. That's why I said it should get its rerecord count wiped.
Warp wrote:
The GoldenEye run having exactly 400000 rerecords sounds to me like it has been tampered with. (It would require quite strong evidence to convince me that the author just happened to finish with exactly that many rerecords.)
This actually was addressed in the submission thread. You responded to it, even.
Wyster, in that submission thread wrote:
Actually i ended up with around 399xxx so yeah it's a bit faked but not alot. (Besides i have worked at times worked on several m64's so there's supposed to be several thousand more :P)
Yes, it may not be entirely valid, but does that really matter?
Even after finishing a TAS, I could mash the load-state buttons as often as I want to. If I'm a couple hundred rerecords away from a funny or round number, I'd probably do that. This does not necessarily make the whole rerecord count 'forged' or 'fake'. It's just 'only' a bit over 99% accurate.
Warp wrote:
I remember one case where the author directly stated in the submission text that he changed the rerecord count because of personal amusement or something.
I believe this happened with Super Metroid by Taco and Kriole, where it was done because the real rerecord count was unknown due to hexing and such. That one was later wiped. In other cases, if the rerecord count is clearly made up, it's usually wiped as well. (It's just that sometimes, these inaccuracies slip by and get into the statistics. That is a bit more of an issue, but still, most of the time this is hardly relevant.)
Joined: 4/17/2010
Posts: 11475
Location: Lake Chargoggagoggmanchauggagoggchaubunagungamaugg
I like how Warp didn't say anything about the bot examples he gave/got responded about.
Is this suddenly some kind of pissing contest about who outsmarts the other?
I presented something that I see as problematic about one of the pages on the website, and asked for opinions and suggestions. Somehow this thread turned into some kind of argument bordering on a flamewar.
Joined: 4/17/2010
Posts: 11475
Location: Lake Chargoggagoggmanchauggagoggchaubunagungamaugg
What makes you think that what you see in this thread isn't our opinions on the matter?
I got zero rerecords when TASing in TASEdit and set the count to 88800 before submitting, so what? Anyone can mess with it to make it look trustworthy. No statistic is 100% accurate, which still doesn't hurt your ability to get a valuable impression of the subject.
I'd like to bump this thread once again because, out of curiosity, I went to the statistics page after a long time, and the problem has gotten even worse.
The top five "most rerecords" runs have over a million rerecords. The top-100 runs all have over 100k rerecords. I don't know how many rerecords people manually make, but I bet they don't make 37 million of them.
On the other end we have a 5-minute SMB run with 3 rerecords and a 30-minute Metal Max Returns run with 6 rerecords. Clearly something has been lost along the way. (I can't judge the rest, but e.g. 61 rerecords for a 14-minute Alien Storm run sounds suspiciously low, although I don't know how many such a run really requires.)
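To put those low-end counts on a common scale, here is a small sketch that converts them to rerecords per minute of movie. The lengths and counts are the ones quoted above; the helper name is illustrative, and no threshold for "too low" is implied.

```python
# (game, approximate movie length in minutes, rerecord count) as quoted in the post
examples = [
    ("SMB", 5, 3),
    ("Metal Max Returns", 30, 6),
    ("Alien Storm", 14, 61),
]


def rerecords_per_minute(rerecords: int, minutes: float) -> float:
    """Average number of rerecords per minute of finished movie."""
    return rerecords / minutes


rates = {game: rerecords_per_minute(count, minutes) for game, minutes, count in examples}

# List the runs from the lowest rate to the highest.
for game, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{game}: {rate:.2f} rerecords per minute")
```

The first two runs average well under one rerecord per minute of movie, while the Alien Storm run averages a little over four.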
Especially the top of the list is rather irrelevant, as it's flooded by bot-generated rerecord counts. (I don't think this would be useful information even if we wanted to know how many times a bot retried, because it's not used consistently. Some people drop bot-generated rerecords completely, others only include the rerecords from a successful attempt, etc.)
Again, as sad as it makes me, I don't think the rerecords statistics serve any useful purpose anymore. The question is, what to do about it?
Joined: 8/14/2009
Posts: 4089
Location: The Netherlands
I have thought about it a few times, though never acted upon it. I still like the statistics even despite the top 5 or so being massively skewed due to bot usage or other things getting lost in the shuffle, so I wouldn't agree with removing the statistics entirely. The best way to fix it, in my opinion, is to zero out the rerecord counts from the errant publication files (so that they don't show up anymore with their erroneous statistics on the statistics page). I suppose that can still be done when I (or others) have time for it. If that sounds like a good solution, and others agree with it, then that seems like the way to go for me.
Joined: 12/8/2012
Posts: 706
Location: Missouri, USA
Also, with the use of the rewind feature in BizHawk, rerecord numbers aren't as much of an indicator as they were in the past. In my current TAS project, I have over 17K rerecords, yet I have likely used rewind to go back and fix mistakes just as often.
"But as it is written, Eye hath not seen, nor ear heard, neither have entered into the heart of man, the things which God hath prepared for them that love him." - 1 Corinthians 2:9
Zeroing the rerecord count for runs that have a clearly wrong number is an ok solution.
Perhaps in the submission procedure the submitter could be asked if the rerecord count is accurate (number of manual rerecords) or whether the actual number has been lost or inflated by a bot. (It could be noted that this is just for statistical purposes only and has no relevance on whether the submission will be published or not.)
Not that I care that much, but the SM64 submission contains both bot-generated (Lua scripts to manipulate RNG) and human-generated re-records. I wouldn't know exactly how many were "real" re-records though.
the SM64 submission contains both bot-generated (Lua scripts to manipulate RNG)
How does that work?
I think zeroing rerecord counts of botted runs is dumb, you’re going from having information about something to removing that information and having nothing. If the statistics page is such a problem, do it the intelligent way and code in exceptions for the movie IDs that are botted so they aren’t taken into account.
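As a rough illustration of the exception-list idea, the sketch below leaves the stored counts untouched and only skips flagged movies when building the ranking. The `Movie` type, the movie IDs, the titles, and the function name are all hypothetical and do not describe how the site's statistics page is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movie:
    movie_id: int
    title: str
    rerecords: int


# Hypothetical IDs of movies whose counts are known to be botted or lost.
EXCLUDED_MOVIE_IDS = frozenset({101, 205})


def most_rerecords(movies, excluded_ids=EXCLUDED_MOVIE_IDS, limit=5):
    """Rank movies by rerecord count, skipping the excluded IDs."""
    eligible = [m for m in movies if m.movie_id not in excluded_ids]
    return sorted(eligible, key=lambda m: m.rerecords, reverse=True)[:limit]


# Made-up sample data for illustration only.
movies = [
    Movie(101, "Botted run", 37_000_000),
    Movie(102, "Hand-made run A", 120_000),
    Movie(103, "Hand-made run B", 45_000),
    Movie(205, "Run with lost count", 6),
]

top = most_rerecords(movies)
for movie in top:
    print(f"{movie.title}: {movie.rerecords:,}")
```

The original numbers stay available on each movie; only the ranking ignores them.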
Also I don’t know if this is relevant but the AWDS run I’m hoping to submit relatively soon will have 0 rerecords because it’s being TASed outside of the emulator and into a bot instead.
How is the rerecord count updated in TASEditor in FCEUX? And how is it updated in TAStudio in BizHawk?
Is it incremented for each input edit? Or only when the movie is run again after it has been edited?
I think that if the rerecord count is incremented with each input edit in such a tool, it will generate a high rerecord count. But I don't mind, because it's hard to track rerecords. There are so many tools available. For example, what should I do when I'm merging two different movies into one? :) Should I add their counts together? (A+B)
You make a good point in the sense that 10 years ago the number of rerecords was a very good measure of how much work was put into the TAS, because rerecording was basically the only tool available (in addition to slowdown and frame advance).
However, as emulators have advanced and developed with more and more tools to aid TASing, rerecording has become more and more just one tool among many, so singling out rerecording from among all of them as some kind of measurement is getting less and less relevant or descriptive.
It's still something that's very accurate and easy to measure, though, so it has its value. It would just be nice if we could, as I suggested, separate the "amount of work" (i.e. number of rerecords) done manually from that done by a script. Both pieces of information would be interesting, but simply adding them up devalues their meaning (especially when compared to TASes that did not use rerecording scripts).
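A minimal sketch of what keeping the two counts apart could look like is below. As noted earlier in the thread, movie files do not track this today, so the `RerecordCounter` class and its `on_state_loaded` hook are hypothetical names, not part of any emulator or movie format.

```python
from dataclasses import dataclass


@dataclass
class RerecordCounter:
    """Hypothetical counter that keeps manual and scripted rerecords separate."""

    manual: int = 0
    scripted: int = 0

    def on_state_loaded(self, by_script: bool) -> None:
        # Assumes the emulator can tell whether a script triggered the load.
        if by_script:
            self.scripted += 1
        else:
            self.manual += 1

    @property
    def total(self) -> int:
        # The single number that current movie files effectively report.
        return self.manual + self.scripted


counter = RerecordCounter()
for _ in range(3):
    counter.on_state_loaded(by_script=False)   # the author reloads by hand
for _ in range(1000):
    counter.on_state_loaded(by_script=True)    # a script brute-forces one segment

print(f"manual={counter.manual} scripted={counter.scripted} total={counter.total}")
```

With both numbers stored, a statistics page could rank runs by the manual count alone and show the scripted count as separate information.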