This was quite a bit of fun. I ended up writing a C++11 program for demonstration, and the source for it can be found on my blog (off-site to save space on this post). I wouldn't read it until you've read this post, though. Since my code covers all of the math anyway, I omitted the formulas from the post and instead provided links to the wiki pages describing them.
Anyway, I'm mostly dealing with issue #1 in this post. I have ideas about #2 as well, but they are a bit dependent on the outcome of my other post from the tier proposal thread. So to avoid discussing things that may not turn out, I'll wait to post that stuff. Plus this is already quite lengthy.
So the problem is that a movie's ranking is, ultimately, an average of all the votes. (I'm going to keep referring to "a movie's ranking", but we all know there are actually two ratings: the entertainment rating and the tech quality rating. For now, just assume one; we can figure out how to combine the two into one later.)
As noted by Baxter, an average of 9 from a single vote is a lot less meaningful than an average of 9 from a million. Luckily this is a "solved" problem, and what we need are confidence intervals. What I'm going to describe is effectively an analog of the article "How Not To Sort By Average Rating", an idea made popular when Randall of xkcd pushed hard for it to be the sorting method used for comments on Reddit. We cannot use the algorithm directly because our votes are two scales from 0-100, rather than a single yes-or-no vote, but we can still use the confidence interval idea.
So to start, let's describe our overall goal. What we're really trying to calculate is the 'true' mean (average). That is, if every single possible viewer voted, we could take the mean and with 100% certainty say "this is the mean", because there couldn't possibly be another vote to throw it off. The problem is that not every possible voter casts a vote, so we have some uncertainty.
Warning: This post is about to get a bit mathy. If you don't care and just want pretty pictures, skip to the text graphs to get a more intuitive idea.
We're going to calculate the 95% confidence interval of the mean vote. Basically, a 95% confidence interval says this: "there's a 5% chance that the 'true' mean lies outside of this calculated interval, but otherwise we're 95% sure it'll end up in here." I chose 95% because it's exceedingly common, and for our sample size (often not too large), asking for 99% or higher just ends up including most of the possible voting range, making the interval rather uninformative. The important part is this: as more votes are cast, the size of the confidence interval becomes smaller, because the confidence in the sample data's accuracy is higher. We'll see how to turn this nice feature into a final score later.
To calculate a confidence interval, we have to make an assumption about the distribution of votes. For those that don't know, a distribution assigns a probability to each possible outcome of a random experiment. Height is a very common example, and it can be modeled with a normal distribution. This means that there is an average height, where most people's heights cluster, and the frequency tapers off as you move away from this average (see picture on the normal distribution wiki page).
The normal distribution is very common across many measurements, and it's no different for votes. We can see individual votes as "guesses" at the true mean: there's going to be a single concentration of votes around an average, with the frequency of more deviant votes dropping as they become more extreme. Luckily for us, calculating the confidence interval from a normal distribution is easy, as is calculating the parameters for a normal distribution from sample data (votes).
Here's an example to make this concrete (all text graphs generated by the aforementioned C++11 program, with 1 million samples). These are the actual votes for Super Mario 64 (all votes are, for the duration of these examples, only from the publicly visible Entertainment column):
SM64
| # | 4
| # | 3.9
| # | 3.8
| # | 3.7
| # | 3.6
| # | 3.5
| # | 3.4
| # | 3.3
| # | 3.2
| # | 3.1
| # | 3
| # | 2.9
| # | 2.8
| # | 2.7
| # | 2.6
| # | 2.5
| # | 2.4
| # | 2.3
| # | 2.2
| # | 2.1
| # # # # ##| 2
| # # # # ##| 1.9
| # # # # ##| 1.8
| # # # # ##| 1.7
| # # # # ##| 1.6
| # # # # ##| 1.5
| # # # # ##| 1.4
| # # # # ##| 1.3
| # # # # ##| 1.2
| # # # # ##| 1.1
| # # # # # ## # # ##| 1
| # # # # # ## # # ##| 0.9
| # # # # # ## # # ##| 0.8
| # # # # # ## # # ##| 0.7
| # # # # # ## # # ##| 0.6
| # # # # # ## # # ##| 0.5
| # # # # # ## # # ##| 0.4
| # # # # # ## # # ##| 0.3
| # # # # # ## # # ##| 0.2
| # # # # # ## # # ##| 0.1
|#####################################################################################################| 0
+-----------------------------------------------------------------------------------------------------+------
| | | | | | | | | | | | | | | | | | | | |
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
We can calculate the parameters for a normal distribution of this with just a few values. First, we need the mean. This is easy and already done on the site. Second, we need the variance and deviation (the deviation is just the square root of the variance). There is more than one way to calculate the variance, but I chose a fairly simple bias-corrected method.
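As a rough sketch of this step (not the code from the program mentioned above), the mean, variance, and deviation could be computed as follows. It assumes the bias correction is the usual n - 1 divisor; `VoteStats` and `summarize` are names made up for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Summary statistics for one movie's votes (illustrative type).
struct VoteStats {
    std::size_t count;  // number of votes
    double mean;        // arithmetic mean of the votes
    double variance;    // bias-corrected sample variance (n - 1 divisor, an assumption)
    double deviation;   // square root of the variance
};

// Assumes votes.size() >= 2; with fewer votes the variance is undefined.
VoteStats summarize(const std::vector<double>& votes) {
    const double n = static_cast<double>(votes.size());

    double sum = 0.0;
    for (double v : votes) sum += v;
    const double mean = sum / n;

    // Sum of squared deviations from the mean.
    double squared = 0.0;
    for (double v : votes) squared += (v - mean) * (v - mean);
    const double variance = squared / (n - 1.0);

    return VoteStats{votes.size(), mean, variance, std::sqrt(variance)};
}
```

For three hypothetical votes of 8, 9, and 10, this gives a mean of 9 and a variance of 1.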
Next comes the critical bit. We calculate the standard error. This is the deviation divided by the square root of the sample count (note: this is where sample count starts to come into play!). Luckily for us, because our voting population is fairly small, sometimes we get a fairly significant (>5%) sample size! Assuming a voting population of 100†, it only takes 5 votes to meet this criterion. This means it's worthwhile to factor in finite population correction (FPC). What this does is account for the fact that as the sample count (vote count) nears the total population size (number of voters), our confidence increases towards 100%. Once the sample covers 100% of the population, we no longer have a sample but a census, and the sample mean is the true mean. We just multiply our initial standard error by the FPC to get the corrected error. (This is something we're lucky to be able to take advantage of. Consider a site like Reddit where only a tiny fraction of the entire user base will vote on a comment.)
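A minimal sketch of the standard error and FPC, assuming the common form of the correction, sqrt((N - n) / (N - 1)), where n is the vote count and N is the voting population. The function names are illustrative and not taken from the program.

```cpp
#include <cmath>
#include <cstddef>

// Standard error of the mean: deviation divided by sqrt(vote count).
double standard_error(double deviation, std::size_t votes) {
    return deviation / std::sqrt(static_cast<double>(votes));
}

// Finite population correction, assumed to be sqrt((N - n) / (N - 1)).
// Requires 1 <= votes <= population and population >= 2.
double finite_population_correction(std::size_t votes, std::size_t population) {
    const double n = static_cast<double>(votes);
    const double N = static_cast<double>(population);
    return std::sqrt((N - n) / (N - 1.0));
}

// Standard error scaled by the FPC.
double corrected_error(double deviation, std::size_t votes, std::size_t population) {
    return standard_error(deviation, votes) *
           finite_population_correction(votes, population);
}
```

With 19 votes out of an assumed population of 100, the correction factor is roughly 0.9045. With all 100 votes in, it is exactly 0, so the corrected error vanishes.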
Last, we need the z-score for the 97.5th percentile point of a normal distribution. (This is the number of standard deviations from the mean within which 95% of the values lie.) It's not trivial to calculate, but it's constant, and the value is approximately 1.959963984540.
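One way to sanity-check the constant, sketched here with `std::erf` from C++11's `<cmath>`: the standard normal CDF evaluated at this z-score should come out at 0.975. `normal_cdf` and `kZ95` are illustrative names.

```cpp
#include <cmath>

// Cumulative distribution function of the standard normal distribution,
// expressed through the error function.
double normal_cdf(double z) {
    return 0.5 * (1.0 + std::erf(z / std::sqrt(2.0)));
}

// The z-score quoted in the post for the 97.5th percentile.
const double kZ95 = 1.959963984540;
```

Evaluating `normal_cdf(kZ95)` should give 0.975 to well within floating-point noise, which leaves 2.5% in each tail and 95% in the middle.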
So now we can calculate our 95% confidence interval. Take the mean vote score and subtract from it the error multiplied by the z-score to get the lower bound, and instead add this product to the mean to get the upper bound. Now since we know the interval can never go below 0 or above 10, clamp it if necessary. Ta-da! We have our interval. This interval has a 95% chance of containing the true value (though there is a 5% chance our sample misled us!).
Note this has the desired property of being dependent on the number of votes cast.
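Putting the pieces together, here is a hedged sketch of the interval calculation from summary statistics. It assumes an FPC of sqrt((N - n) / (N - 1)) and a voting population N passed in by the caller; `Interval` and `confidence_interval` are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

struct Interval {
    double lower;
    double upper;
};

// 95% confidence interval for the mean vote, clamped to the 0..10 vote range.
// Requires 1 <= votes <= population and population >= 2.
Interval confidence_interval(double mean, double variance,
                             std::size_t votes, std::size_t population) {
    const double z = 1.959963984540;  // z-score for the 97.5th percentile
    const double n = static_cast<double>(votes);
    const double N = static_cast<double>(population);

    // Standard error (deviation / sqrt(n)) times the assumed FPC.
    const double error = std::sqrt(variance / n) * std::sqrt((N - n) / (N - 1.0));
    const double margin = z * error;

    Interval result;
    result.lower = std::max(0.0, mean - margin);
    result.upper = std::min(10.0, mean + margin);
    return result;
}
```

Feeding in the SM64 figures shown further down (19 votes, mean 8.96842, variance 0.877774) with a population of 100 gives approximately [8.58737, 9.34948].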
From the SM64 votes above, the resulting normal distribution is thus, where #'s indicate votes within the confidence interval and *'s indicate votes outside the interval:
SM64 Normal Distribution
| # | 45179
| ##### | 44049.5
| ##### | 42920
| ####### | 41790.6
| ######## | 40661.1
| ######### | 39531.6
| #########* | 38402.2
| *#########* | 37272.7
| *#########** | 36143.2
| **#########** | 35013.7
| **#########** | 33884.2
| **#########*** | 32754.8
| ***#########*** | 31625.3
| ***#########**** | 30495.8
| ***#########**** | 29366.4
| ****#########**** | 28236.9
| ****#########***** | 27107.4
| *****#########***** | 25977.9
| *****#########***** | 24848.5
| *****#########****** | 23719
| ******#########****** | 22589.5
| ******#########****** | 21460
| ******#########*******| 20330.5
| *******#########*******| 19201.1
| *******#########*******| 18071.6
| ********#########*******| 16942.1
| ********#########*******| 15812.6
| *********#########*******| 14683.2
| *********#########*******| 13553.7
| *********#########*******| 12424.2
| **********#########*******| 11294.8
| ***********#########*******| 10165.3
| ***********#########*******| 9035.8
| ************#########*******| 7906.33
| *************#########*******| 6776.85
| *************#########*******| 5647.38
| **************#########*******| 4517.9
| ***************#########*******| 3388.42
| *****************#########*******| 2258.95
| *******************#########*******| 1129.48
|*************************************************************************************#########*******| 0
+-----------------------------------------------------------------------------------------------------+------
| | | | | | | | | | | | | | | | | | | | |
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
Count: 19 Sum: 170.4 Mean: 8.96842 Variance: 0.877774
Confidence (95%): [8.58737, 9.34948] (Range: 0.76211)
Note the calculated numbers displayed under the graph. To demonstrate how the interval size changes depending on the sample size, here's the exact same calculation except only half of the SM64 votes are used:
SM64 (half sample) Normal Distribution
| # | 45084
| ### | 43956.9
| ###### | 42829.8
| ###### | 41702.7
| ######## | 40575.6
| ######### | 39448.5
| ########## | 38321.4
| ########### | 37194.3
| ############ | 36067.2
| ############ | 34940.1
| #############* | 33813
| #############* | 32685.9
| #############* | 31558.8
| *#############** | 30431.7
| *#############** | 29304.6
| **#############** | 28177.5
| **#############*** | 27050.4
| **#############*** | 25923.3
| ***#############**** | 24796.2
| ***#############**** | 23669.1
| ****#############**** | 22542
| ****#############*****| 21414.9
| ****#############*****| 20287.8
| *****#############*****| 19160.7
| *****#############*****| 18033.6
| ******#############*****| 16906.5
| ******#############*****| 15779.4
| ******#############*****| 14652.3
| *******#############*****| 13525.2
| *******#############*****| 12398.1
| ********#############*****| 11271
| ********#############*****| 10143.9
| *********#############*****| 9016.8
| **********#############*****| 7889.7
| **********#############*****| 6762.6
| ***********#############*****| 5635.5
| ************#############*****| 4508.4
| *************#############*****| 3381.3
| ***************#############*****| 2254.2
| *****************#############*****| 1127.1
|***********************************************************************************#############*****| 0
+-----------------------------------------------------------------------------------------------------+------
| | | | | | | | | | | | | | | | | | | | |
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
Count: 9 Sum: 80.9 Mean: 8.98889 Variance: 0.891852
Confidence (95%): [8.39736, 9.58042] (Range: 1.18306)
As you can see, fewer samples imply a wider range. The final step is to turn this interval into a single value. The linked articles on Reddit's comment ranking simply take the lower bound of the confidence interval, and this works extremely well. The reason is that fewer votes will bias the lower bound to a lower score, which solves our original problem: a few votes end up contributing less than many votes, even when the average is the same, because the confidence interval will be wider.
The justification for this approach is simple as well. The lower bound says: "I'm 95% certain your true mean won't be lower than this, but get more votes and I'll let you know! But until then, this is quite a fair rank to get because I'm 95% sure I'm not overinflating your true mean; in the rare case it's wrong, you're welcome for the bonus." In practice low-voted movies won't get punished that much, but enough to knock near or equal means away from each other. (At most a single point: 7->6, for example.)
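To illustrate the ranking idea, this sketch sorts movies by the lower bound instead of the mean. It uses the same assumed FPC form, sqrt((N - n) / (N - 1)), and a caller-supplied population; `Movie`, `lower_bound_score`, and `rank_movies` are names invented for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

struct Movie {
    std::string name;
    double mean;        // mean vote
    double variance;    // bias-corrected sample variance
    std::size_t votes;  // number of votes cast
};

// Lower bound of the 95% confidence interval, clamped at 0.
double lower_bound_score(const Movie& m, std::size_t population) {
    const double z = 1.959963984540;
    const double n = static_cast<double>(m.votes);
    const double N = static_cast<double>(population);
    const double error = std::sqrt(m.variance / n) * std::sqrt((N - n) / (N - 1.0));
    return std::max(0.0, m.mean - z * error);
}

// Sorts best-first by lower bound rather than by raw mean.
void rank_movies(std::vector<Movie>& movies, std::size_t population) {
    std::sort(movies.begin(), movies.end(),
              [population](const Movie& a, const Movie& b) {
                  return lower_bound_score(a, population) >
                         lower_bound_score(b, population);
              });
}
```

Using the two SM64 data sets above with a population of 100, the half sample has the slightly higher mean (8.98889 vs. 8.96842) but ranks below the full sample, because its lower bound is about 8.397 rather than 8.587.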
Also, remember that because our population is finite, as the number of votes reaches the total number of voters, the error value tends towards zero. At some point, with every vote accounted for, the error is zero and the lower bound and the mean coincide, giving the true mean.
So for SM64, the final score would be (roughly, as I don't know the private votes): 8.58737, which is a difference of -0.381055 from the simple mean. Note that every score on the site will go down slightly as the number of votes gets taken into account. This is okay: we only care about the relative ordering, and the number won't vary in practice that much at all. (Keep in mind that for ranking and the calculations for problem #2, the scores need to be stored with all their decimal places; truncating/rounding to one decimal for display is sensible.)
Usage-wise, just note that a single vote is not enough to calculate variance, and two votes can give a very meaningless answer unless the two votes happen to be close to each other. The site already requires three votes before it calculates a rating though, so that's good.
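A small guard along these lines could enforce that minimum. The threshold of three comes from the site's existing rule; everything else (`try_score`, the FPC form, the caller-supplied population) is an assumption for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Writes the lower-bound score into `score` and returns true, or returns
// false (leaving `score` untouched) when there are too few votes.
bool try_score(const std::vector<double>& votes, std::size_t population,
               double& score) {
    const std::size_t kMinVotes = 3;  // the site's existing minimum
    if (votes.size() < kMinVotes || votes.size() > population) return false;

    const double n = static_cast<double>(votes.size());
    const double N = static_cast<double>(population);

    double sum = 0.0;
    for (double v : votes) sum += v;
    const double mean = sum / n;

    double squared = 0.0;
    for (double v : votes) squared += (v - mean) * (v - mean);
    const double variance = squared / (n - 1.0);  // n - 1 divisor (assumed)

    const double error = std::sqrt(variance / n) * std::sqrt((N - n) / (N - 1.0));
    score = std::max(0.0, mean - 1.959963984540 * error);
    return true;
}
```

Without the guard, two votes of 2 and 10 would produce a margin of about 7.8 around a mean of 6, so the interval clamps to the full 0 to 10 range and says nothing.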
Turning two scores into one will need to be discussed after the tiers thing settles down. I think a movie in an "entertainment-based" category should get most of its score from the Entertainment rating, while a movie in a "technical-based" category should get most of its score from the Tech Quality rating. It might be worthwhile to investigate interval arithmetic (something I'm not as familiar with) for this task.
And that's it. What essentially comes down to a few multiplications and a couple of square roots gives us a very meaningful and theoretically justified score (not just ad hoc tweaking). Here are some more example plots and data:
SMB 3
| # | 7
| # | 6.825
| # | 6.65
| # | 6.475
| # | 6.3
| # | 6.125
| # | 5.95
| # | 5.775
| # | 5.6
| # | 5.425
| # | 5.25
| # | 5.075
| # | 4.9
| # | 4.725
| # | 4.55
| # | 4.375
| # | 4.2
| # | 4.025
| # # # | 3.85
| # # # | 3.675
| # # # | 3.5
| # # # | 3.325
| # # # | 3.15
| # # # #| 2.975
| # # # #| 2.8
| # # # #| 2.625
| # # # #| 2.45
| # # # #| 2.275
| # # # #| 2.1
| # ## # # ## #| 1.925
| # ## # # ## #| 1.75
| # ## # # ## #| 1.575
| # ## # # ## #| 1.4
| # ## # # ## #| 1.225
| # ## # # ## #| 1.05
| # # # # ## ### # ## ## #| 0.875
| # # # # ## ### # ## ## #| 0.7
| # # # # ## ### # ## ## #| 0.525
| # # # # ## ### # ## ## #| 0.35
| # # # # ## ### # ## ## #| 0.175
|#####################################################################################################| 0
+-----------------------------------------------------------------------------------------------------+------
| | | | | | | | | | | | | | | | | | | | |
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
SMB 3 Normal Distribution
| # | 43343
| #### | 42259.4
| ###### | 41175.8
| *######* | 40092.3
| *######* | 39008.7
| **######** | 37925.1
| **######*** | 36841.5
| ***######*** | 35758
| ***######*** | 34674.4
| ***######**** | 33590.8
| ****######**** | 32507.2
| ****######***** | 31423.7
| *****######***** | 30340.1
| *****######***** | 29256.5
| *****######****** | 28173
| ******######****** | 27089.4
| ******######******* | 26005.8
| ******######******* | 24922.2
| *******######******* | 23838.7
| *******######******** | 22755.1
| ********######******** | 21671.5
| ********######********* | 20587.9
| ********######********* | 19504.3
| *********######********* | 18420.8
| *********######**********| 17337.2
| **********######**********| 16253.6
| **********######**********| 15170
| **********######**********| 14086.5
| ***********######**********| 13002.9
| ***********######**********| 11919.3
| ************######**********| 10835.8
| ************######**********| 9752.17
| *************######**********| 8668.6
| **************######**********| 7585.03
| ***************######**********| 6501.45
| ***************######**********| 5417.88
| ****************######**********| 4334.3
| ******************######**********| 3250.72
| *******************######**********| 2167.15
| **********************######**********| 1083.58
|*************************************************************************************######**********| 0
+-----------------------------------------------------------------------------------------------------+------
| | | | | | | | | | | | | | | | | | | | |
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
Count: 33 Sum: 291.3 Mean: 8.82727 Variance: 0.917633
Confidence (95%): [8.5584, 9.09614] (Range: 0.537744)
Final score as lower bound of confidence range: 8.5584
Current score on site: 8.82727
Difference: -0.268872
SMW
| # | 21
| # | 20.475
| # | 19.95
| # | 19.425
| # | 18.9
| # | 18.375
| # #| 17.85
| # #| 17.325
| # #| 16.8
| # #| 16.275
| # #| 15.75
| # #| 15.225
| # #| 14.7
| # #| 14.175
| # #| 13.65
| # #| 13.125
| # #| 12.6
| # #| 12.075
| # #| 11.55
| # #| 11.025
| # #| 10.5
| # #| 9.975
| # #| 9.45
| # #| 8.925
| # #| 8.4
| # #| 7.875
| # #| 7.35
| # #| 6.825
| # #| 6.3
| # #| 5.775
| # #| 5.25
| # # #| 4.725
| # # #| 4.2
| # # #| 3.675
| # # #| 3.15
| # # #| 2.625
| # # #| 2.1
| # # # # #| 1.575
| # # # # #| 1.05
| # # # # # # # # ### # # #| 0.525
|#####################################################################################################| 0
+-----------------------------------------------------------------------------------------------------+------
| | | | | | | | | | | | | | | | | | | | |
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
SMW Normal Distribution
| # | 25105
| **#####* | 24477.4
| ***#####*** | 23849.8
| ****#####**** | 23222.1
| *****#####***** | 22594.5
| ******#####****** | 21966.9
| *******#####****** | 21339.2
| *******#####******* | 20711.6
| ********#####******** | 20084
| *********#####*********| 19456.4
| **********#####*********| 18828.8
| ***********#####*********| 18201.1
| ***********#####*********| 17573.5
| ************#####*********| 16945.9
| ************#####*********| 16318.2
| *************#####*********| 15690.6
| **************#####*********| 15063
| **************#####*********| 14435.4
| ***************#####*********| 13807.8
| ****************#####*********| 13180.1
| ****************#####*********| 12552.5
| *****************#####*********| 11924.9
| ******************#####*********| 11297.2
| ******************#####*********| 10669.6
| *******************#####*********| 10042
| ********************#####*********| 9414.38
| *********************#####*********| 8786.75
| *********************#####*********| 8159.12
| **********************#####*********| 7531.5
| ***********************#####*********| 6903.88
| ************************#####*********| 6276.25
| *************************#####*********| 5648.62
| **************************#####*********| 5021
| ***************************#####*********| 4393.38
| ****************************#####*********| 3765.75
| ******************************#####*********| 3138.12
| ********************************#####*********| 2510.5
| *********************************#####*********| 1882.87
| *************************************#####*********| 1255.25
| *****************************************#####*********| 627.625
|***************************************************************************************#####*********| 0
+-----------------------------------------------------------------------------------------------------+------
| | | | | | | | | | | | | | | | | | | | |
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
Count: 57 Sum: 509.4 Mean: 8.93684 Variance: 1.58221
Confidence (95%): [8.72163, 9.15205] (Range: 0.430417)
Final score as lower bound of confidence range: 8.72163
Current score on site: 8.93684
Difference: -0.215208
Fortified Zone
| # # # | 1
| # # # | 0.975
| # # # | 0.95
| # # # | 0.925
| # # # | 0.9
| # # # | 0.875
| # # # | 0.85
| # # # | 0.825
| # # # | 0.8
| # # # | 0.775
| # # # | 0.75
| # # # | 0.725
| # # # | 0.7
| # # # | 0.675
| # # # | 0.65
| # # # | 0.625
| # # # | 0.6
| # # # | 0.575
| # # # | 0.55
| # # # | 0.525
| # # # | 0.5
| # # # | 0.475
| # # # | 0.45
| # # # | 0.425
| # # # | 0.4
| # # # | 0.375
| # # # | 0.35
| # # # | 0.325
| # # # | 0.3
| # # # | 0.275
| # # # | 0.25
| # # # | 0.225
| # # # | 0.2
| # # # | 0.175
| # # # | 0.15
| # # # | 0.125
| # # # | 0.1
| # # # | 0.075
| # # # | 0.05
| # # # | 0.025
|#####################################################################################################| 0
+-----------------------------------------------------------------------------------------------------+------
| | | | | | | | | | | | | | | | | | | | |
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
Fortified Zone Normal Distribution
| # | 105660
| ## | 103018
| ## | 100377
| #### | 97735.5
| #### | 95094
| #### | 92452.5
| #### | 89811
| #### | 87169.5
| ###### | 84528
| ###### | 81886.5
| ###### | 79245
| ###### | 76603.5
| ###### | 73962
| ###### | 71320.5
| ######## | 68679
| ######## | 66037.5
| ######## | 63396
| ######## | 60754.5
| ######## | 58113
| ######## | 55471.5
| ######## | 52830
| ########## | 50188.5
| ########## | 47547
| ########## | 44905.5
| ########## | 42264
| ########## | 39622.5
| ########## | 36981
| ############ | 34339.5
| ############ | 31698
| ############ | 29056.5
| ############ | 26415
| ############ | 23773.5
| ############## | 21132
| ############## | 18490.5
| ############## | 15849
| *##############* | 13207.5
| *##############* | 10566
| **##############** | 7924.5
| **##############** | 5283
| ***##############*** | 2641.5
|******************************************************************##############*********************| 0
+-----------------------------------------------------------------------------------------------------+------
| | | | | | | | | | | | | | | | | | | | |
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
Count: 3 Sum: 21.9 Mean: 7.3 Variance: 0.373333
Confidence (95%): [6.61561, 7.98439] (Range: 1.36878)
Final score as lower bound of confidence range: 6.61561
Current score on site: 7.3
Difference: -0.684391
Addams Family
| # # # # # | 1
| # # # # # | 0.975
| # # # # # | 0.95
| # # # # # | 0.925
| # # # # # | 0.9
| # # # # # | 0.875
| # # # # # | 0.85
| # # # # # | 0.825
| # # # # # | 0.8
| # # # # # | 0.775
| # # # # # | 0.75
| # # # # # | 0.725
| # # # # # | 0.7
| # # # # # | 0.675
| # # # # # | 0.65
| # # # # # | 0.625
| # # # # # | 0.6
| # # # # # | 0.575
| # # # # # | 0.55
| # # # # # | 0.525
| # # # # # | 0.5
| # # # # # | 0.475
| # # # # # | 0.45
| # # # # # | 0.425
| # # # # # | 0.4
| # # # # # | 0.375
| # # # # # | 0.35
| # # # # # | 0.325
| # # # # # | 0.3
| # # # # # | 0.275
| # # # # # | 0.25
| # # # # # | 0.225
| # # # # # | 0.2
| # # # # # | 0.175
| # # # # # | 0.15
| # # # # # | 0.125
| # # # # # | 0.1
| # # # # # | 0.075
| # # # # # | 0.05
| # # # # # | 0.025
|#####################################################################################################| 0
+-----------------------------------------------------------------------------------------------------+------
| | | | | | | | | | | | | | | | | | | | |
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
Addams Family Normal Distribution
| # | 17243
| # ######### | 16837.9
| ############# | 16432.8
| ################# | 16027.6
| #################### | 15622.5
| ####################### | 15217.4
| ######################### | 14812.2
| *########################### | 14407.1
| *###########################* | 14002
| **###########################*** | 13596.9
| ***###########################**** | 13191.8
| ****###########################**** | 12786.6
| *****###########################****** | 12381.5
| ******###########################****** | 11976.4
| *******###########################******* | 11571.2
| ********###########################******** | 11166.1
| *********###########################********* | 10761
| **********###########################********* | 10355.9
| ***********###########################*********** | 9950.75
| ************###########################************ | 9545.62
| *************###########################************ | 9140.5
| **************###########################************** | 8735.38
| ***************###########################************** | 8330.25
| ****************###########################*************** | 7925.13
| *****************###########################**************** | 7520
| *****************###########################***************** | 7114.88
| ******************###########################****************** | 6709.75
| ********************###########################******************** | 6304.62
| ********************###########################******************** | 5899.5
| **********************###########################********************* | 5494.38
| ***********************###########################********************** | 5089.25
| ************************###########################************************ | 4684.12
| **************************###########################************************* | 4279
| ***************************###########################************************** | 3873.88
| ****************************###########################**************************** | 3468.75
| ******************************###########################***************************** | 3063.62
|********************************###########################******************************* | 2658.5
|********************************###########################********************************* | 2253.37
|********************************###########################*********************************** | 1848.25
|********************************###########################************************************** | 1443.13
|********************************###########################******************************************| 1038
+-----------------------------------------------------------------------------------------------------+------
| | | | | | | | | | | | | | | | | | | | |
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
Count: 5 Sum: 22.7 Mean: 4.54 Variance: 2.32343
Confidence (95%): [3.2312, 5.8488] (Range: 2.61759)
Final score as lower bound of confidence range: 3.2312
Current score on site: 4.54
Difference: -1.3088
The program contains additional tests with manually constructed data.
Thanks for reading.
-----
†There could be some argument about what this finite voter population should be. It can either be the number of people who have cast at least one vote (are voters), or the number of people that could cast a vote (could be voters). Right now the most rated movie has 89 votes, so for my tests I assumed (using the former metric) that the total voting population is 100. If we go with the latter, to my knowledge that means the number of registered forum members is the count, nearer to 5030. With this metric, it takes 230 votes to reach 5%, and considering the number of lurkers there are, this seems useless to me. Either way though, a new calculation must be done for each movie every time this population count increases, and I assume the former increases less often (thinking about server load now). Another implementation strategy is to have "gates". These are just multiples of 50 (for example), and each time a gate is reached the population count increases by 50 and the gate is increased by 50. This allows the FPC to near 1 on highly-rated movies, but avoids constant server load whenever the population increases due to happenstance.
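One possible reading of the "gates" idea, sketched with a hypothetical helper: the population used in the calculation is the next multiple of the gate size above the count being tracked (for example, the vote count of the most rated movie), so it only changes when a gate is crossed. The name `gated_population` and the exact rounding rule are assumptions.

```cpp
#include <cstddef>

// Smallest multiple of `gate` that is strictly greater than `count`.
// Assumes gate > 0.
std::size_t gated_population(std::size_t count, std::size_t gate) {
    return (count / gate + 1) * gate;
}
```

With a gate of 50, a count of 89 gives a population of 100, and reaching 100 bumps it to 150.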
(Huge final note: there are other and potentially better ways to do everything I described: maybe a credible interval would work better, or another distribution better fits the data. I think, though, that the chances of a normal distribution being insufficient are extremely small and not worth the computational effort required to move to a hypothetically better model. What I've presented is fairly easy to implement.)