blackpenredpen does explain it; there are part 1-4 videos linked in the description of his main video.
To keep it short, the algebraic trick is valid provided a certain hypothesis is satisfied. What you actually prove by doing the substitution is:
If for some value of x, the infinite power tower converges to 2, then x is equal to sqrt(2).
and
If for some value of x, the infinite power tower converges to 3, then x is equal to cube root of 3.
It turns out that the hypothesis actually holds in the first case but fails in the second: there is no x for which the tower converges to 3. Note that the substitution is still useful, because it reduces the problem of calculating the limit to proving that it exists, which is much simpler. It's very common when solving these problems to neglect proving the existence of a value that makes the series converge, but it's necessary; as the example from blackpenredpen shows, you run into paradoxes if you neglect it.
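If you want to see the difference numerically, here is a quick Python sketch (not from the videos, just an illustration) that iterates the tower a couple of hundred times; the tower only converges for x up to e^(1/e) ≈ 1.4447, and its limit never exceeds e, which is why 3 is out of reach:

def tower(x, steps=200):
    t = 1.0
    for _ in range(steps):
        t = x ** t      # one more level of the power tower
    return t

print(tower(2 ** 0.5))      # ~2.0: the tower at sqrt(2) really does converge to 2
print(tower(3 ** (1 / 3)))  # ~2.478, not 3: the tower at cbrt(3) converges, but not to 3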
If I understood correctly, what he's essentially saying is that
x^x^x^... = y, and x^y = y
are not exactly the same thing, in a similar way that
sqrt(x) = y, and x = y^2
are not exactly the same thing. They behave the same within a certain range, but not outside of it.
The "validity range" for the former is just a bit more complicated than for the latter.
I might as well answer this. (Am I learning category theory just to answer p4wn3r's questions?)
(a) Consider a groupoid on a single object A, so all isomorphisms are in Mor(A,A). Let f and g be isomorphisms in Mor(A,A). Then g○f is in Mor(A,A) by the composition property, so composition is closed. Composition is associative. id_A is the identity morphism in Mor(A,A). f has an inverse isomorphism f⁻¹ in Mor(A,A). So a groupoid on a single object A forms a group.
Likewise, a group is a groupoid on a single object A (elements of the group are isomorphisms of the groupoid, and the group operation is "composition").
(b) Take the category on two objects A,B with four isomorphisms: id_A in Mor(A,A), id_B in Mor(B,B), f in Mor(A,B), and f⁻¹ in Mor(B,A). Then all the groupoid conditions are satisfied, but this groupoid is not a group (in the group-theory sense): f○f does not exist in the groupoid, so it is not closed under composition.
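As a toy illustration (my own sketch, nothing deep), you can model this two-object groupoid in Python, where composition is only defined when the codomain of the first morphism matches the domain of the second; that partial definedness is exactly where closure fails:

# each morphism is stored as (domain, codomain)
morphisms = {
    "idA": ("A", "A"),
    "idB": ("B", "B"),
    "f": ("A", "B"),
    "f_inv": ("B", "A"),
}

def compose(g, f):
    """Return the (domain, codomain) of g○f, or None if the composite is undefined."""
    dom_f, cod_f = morphisms[f]
    dom_g, cod_g = morphisms[g]
    if cod_f != dom_g:
        return None
    return (dom_f, cod_g)

print(compose("f_inv", "f"))  # ('A', 'A'), i.e. idA
print(compose("f", "f"))      # None: f○f does not exist, so this is not a group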
It's notable that 2 and 4 are both important values for sqrt(2) infinite power tower - 2 is the first time it crosses the line y = x where the derivative is less than zero, 4 is the second time but this time with derivative greater than zero. But I'm not sure if this is coincidence or not.
I am actually very interested in the topic of whether it's possible to get someone to learn something by means of handing out simple exercises before presenting the theory, so feel free to share your experience doing this.
Anyway, I never understood why some people are so averse to categories. As I think this exercise makes clear, they can be seen as a generalization of groups where you drop the closure and existence-of-inverse conditions. What's so wrong about studying that? I can understand some of the frustration, though. I find it completely baffling how people can sometimes write hundreds of pages when the idea is ridiculously simple, every author has a different definition of everything, and it becomes pretty much impossible to follow what's going on.
An example of this is the proof of Fermat's last theorem. Only chapters 3 and 5 are actually the novel arguments. Chapter 4 is an interesting idea that he tried, but did not work out. Chapters 1 and 2 are mostly setup, which is already in an equivalent form in the literature (although, to be fair, not neatly organized as in the paper). He could have written the same paper by simply stating Conjecture 2.16 and including chapters 3 and 5, which would reduce its length considerably. (To be clear, I don't claim to understand that proof, I still haven't got to that point and have no idea what's happening in Chapters 3 and 5, but I do understand how the main result follows if he can prove Conjecture 2.16). I won't even comment on the proposed solution to the abc conjecture, where it seems that no mathematician can understand the "proof" at all.
The main result that really sold me on category theory is BRST quantization. The usual procedure to quantize the theories used in particle physics is to start with an ill-defined functional integral, where you put complex and Grassmann numbers in the integrand, and apply a lot of algebra to reduce it to a Gaussian one. For the gauge fields, you introduce a magical identity to get rid of some ambiguities. In some cases, where we have chiral fermions, this can fail because of anomalies.
In BRST quantization (I think Wikipedia's definition is different from mine, so I won't even cite it here), the space where all particles live is defined to be a chain complex. The BRST operator s defines a cohomology in this complex iff s²=0. In this procedure, you write it in two parts: s = a + b, where both a and b define cohomologies, that is, a²=0 and b²=0. One of them corresponds to the part of reducing all fields to Gaussian integrals, and is a Koszul-Tate resolution, while the other, which corresponds to inserting the magical identity, is simply Lie algebra cohomology. In order to have s²=0, we must have ab+ba=0, meaning that they anticommute. It turns out that they always anticommute if the theory does not have chiral fermions, but they can also anticommute "by accident" even if it has them, like the Standard Model does.
I believe no one will understand what I wrote, but trust me that I was completely blown away when category theory took a procedure that I always thought inelegant and brute-forced, and turned it into very clean concepts :)
I assume that by "it" you mean the exponential function sqrt(2)^x being the function which you iterate to get the sqrt(2) power tower, and that you mean the derivative is less than 1 when x=2 and greater than 1 when x=4.
None of what you said is a coincidence. In fact, it explains exactly why the sqrt(2) power tower converges to 2, and not to 4. Here is the diagram for fixed-point iteration using sqrt(2)^x:
Notice that the orange iterations converge to x=2 and the purple ones diverge from x=4.
In general, if a function f(x) crosses the line y=x at a point x=a, iteration converges to that point if |f'(a)|<1, and diverges if |f'(a)|>1.
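For anyone who wants to reproduce the behaviour in the diagram, here is a minimal Python sketch of the fixed-point iteration with f(x) = sqrt(2)^x; note f'(2) = ln 2 < 1 and f'(4) = 2 ln 2 > 1:

import math

def iterate(x0, steps=80):
    x = x0
    for _ in range(steps):
        x = math.sqrt(2) ** x
        if x > 1e6:              # clearly diverging, stop before float overflow
            return float('inf')
    return x

print(iterate(1.0))   # converges to 2 (attracting fixed point)
print(iterate(3.9))   # drifts away from 4 and down toward 2
print(iterate(4.1))   # inf: iterates above 4 blow up (repelling fixed point)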
Anyway, here's an interesting Riddler Classic question. I copied it below as well:
----
You are studying a new strain of bacteria, Riddlerium classicum (or R. classicum, as the researchers call it). Each R. classicum bacterium will do one of two things: split into two copies of itself or die. There is an 80 percent chance of the former and a 20 percent chance of the latter.
If you start with a single R. classicum bacterium, what is the probability that it will lead to an everlasting colony (i.e., the colony will theoretically persist for an infinite amount of time)?
Extra credit: Suppose that, instead of 80 percent, each bacterium divides with probability p. Now what’s the probability that a single bacterium will lead to an everlasting colony?
----
Note: This question is a little deeper than just solving a quadratic. In fact, for p=0.5, you might get some interesting paradoxes.
Yeah, that's exactly what I mean - it finds the points of intersection, but not whether they converge or diverge - and finding that requires checking the derivative as well. Thanks!
I think something somewhat similar might have been asked before, but anyway.
There are videos out there where people toss a coin and get 10 of the same side consecutively (ie. either 10 heads, or 10 tails, consecutively). These videos are unedited and untampered. There is no trickery going on.
Of course the method is pretty simple: Just keep tossing a coin until you get 10 of the same side consecutively, and show that part only.
This got me thinking: What's the expected average number of times you'll need to toss the coin in order to get 10 of the same side consecutively?
If you repeated this process over and over (ie. keep tossing the coin until you get 10 consecutive same-side tosses, then start over; for the sake of simplicity we just forget the previous tosses, ie. we don't count it as "the same" if you throw the same side an 11th time; we count everything as having been reset and we are starting over, so it's so far just 1 of that side having been tossed) a million times, what would be the average number of tosses?
The chance of 10 consecutive 50% outcomes is 0.5^10=0.0009765625. If you want the average number of tries needed to hit this chance, it's 1/0.0009765625=1024. This means you have to try 1024 times on average to get 10 consecutive heads. If you want either heads or tails, this gets halved to 512.
The next question is, how many cointosses does it take on average to get through those average 1024 tries for 10 consecutive heads? Of course, not every try takes only one cointoss (if you don't care whether you get 10 heads or 10 tails, every try takes at least two cointosses). If you care only about heads, these are the numbers:
1 cointoss until failure (=tails) -> 50% chance, so .5*1024=512 tries of one cointoss.
2 cointosses until failure -> 25% chance, so .25*1024=256 tries of two cointosses means 512 cointosses.
3 cointosses until failure -> 12.5% chance, so 128 tries of three means 384 cointosses.
4 -> 64 tries of 4 means 256 cointosses.
5 -> 32*5=160
6 -> 16*6=96
7 -> 8*7=56
8 -> 4*8=32
9 -> 2*9=18
10 (the successful sequence) -> 1*10 = 10
Adding all these up gives us the average number of cointosses to get 10 consecutive heads. It is 2036.
You could do the same thing if you don't care about heads or tails. Since that is the same as flipping a coin to decide whether you want 9 consecutive heads or tails, we can use the method above to get to the answer: 512 tries to get to 9 consecutives, 1*.5^1*512 + 2*.5^2*512 + ... + 9*.5^9*512 = 1013 cointosses, plus the one cointoss to decide heads or tails, means the average number of cointosses to get 10 consecutive of either heads or tails (when you don't care which) is 1014.
If you want to know how many 10-consecutive runs you get on average in a million cointosses, it's 1000000/2036 ≈ 491 or 1000000/1014 ≈ 986.
@Chanoyu, I don't think the answer is nearly that simple. I'll try to explain by solving it.
So the approach will be:
1. Determine the probability that you will have thrown at least 10 consecutive heads after N throws. This gives us a CDF of the probability distribution.
2. From that derive a PDF of the probability distribution.
3. From that use the expectation value formula to obtain an estimate of the mean number of throws.
Part 1: Finding the CDF.
Consider a toy example: 2 consecutive heads in 5 flips. Here C(f, h) is the count of the ways it is possible to get a run of h or more consecutive heads in f flips, and P(f, h) is the probability of this, which is just C(f, h)/2^f. We have the following cases (where X means "don't care"):
Toss. | Result
------|-------
HHXXX | True
THHXX | True
?THHX | True
??THH | True
Other | False
The ?s are there because, in order to avoid double counting cases, we require that the ?s not result in our goal. Let C(f, h) represent the number of successful cases out of all the 2^f cases, and let ~C(f, h) = 2^f - C(f, h) represent the number of unsuccessful cases. In this case our toy example is:
C(5, 2) = 2^3 (from the Xs on line one) + ~C(0, 2) * 2^2 (line two) + ~C(1, 2) * 2 (line three) + ~C(2, 2) (line four)
This gives 19.
Let's consider what happens when we go up by one:
Toss.... | Result
---------|-------
HHXXX(X) | True
THHXX(X) | True
?THHX(X) | True
??THH(X) | True
(???THH) | True
Other... | False
So we have an extra X for each of the lines that were in C(5,2) and one more line.
C(6,2) = 2*C(5,2) + ~C(3,2)
From this we can deduce a method of increasing the number of flips by one:
C(f+1, h) = 2*C(f, h) + ~C(f-h, h)
Rewriting in terms of f and expanding ~C:
C(f, h) = 2C(f-1, h) + 2^{f-(h+1)} - C(f-(h+1), h)
C(f-1, h) = 2C(f-2, h) + 2^{f-(h+2)} - C(f-(h+2), h)
C(f, h) = 4C(f-1, h) - 4C(f-2, h) - C(f-(h+1), h) + 2 C(f-(h+2), h)
With initial conditions from the trivial cases, namely C(f = h, h) = 1 and C(f < h, h) = 0, it's possible to come up with a matrix and do an eigendecomposition to find an explicit formula. But it will, in the case of 10 heads, require solving a polynomial of degree around 10 and inverting a matrix of the same order. So in reality, you'll probably need a computer algebra system to handle this.
I've solved this for another case, and documented my work elsewhere, so I can tell you that P(15000, 15), for instance, is approximately 20.45%, and its exact value is a fraction which in lowest terms has a denominator with 4513 digits.
Part 2: It's easy from here
For a discrete distribution: PDF(n) = CDF(n) - CDF(n-1)
Part 3: Expectation value:
E = sum over n of n * PDF(n)
It's possible to get an approximate answer by summing over a bunch of the smaller terms until it converges. However, I doubt a closed form solution will be forthcoming.
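If a closed form isn't needed, the recurrence above is already enough to get a number. Here is a small Python sketch (just for illustration) that runs it in CDF form, CDF(n) = CDF(n-1) + 2^-(h+1) * (1 - CDF(n-h-1)), and accumulates the expectation term by term:

h = 10          # length of the run of heads we are waiting for
N = 200000      # truncation point; the tail beyond this is negligible
cdf = [0.0] * (N + 1)
expectation = 0.0
for n in range(h, N + 1):
    if n == h:
        cdf[n] = 2.0 ** (-h)
    else:
        cdf[n] = cdf[n - 1] + 2.0 ** (-(h + 1)) * (1.0 - cdf[n - h - 1])
    pdf = cdf[n] - cdf[n - 1]   # probability the first run of h heads ends exactly at flip n
    expectation += n * pdf
print(expectation)              # comes out close to 2046 for h = 10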
Out of curiosity I wrote a small program that simulates this. I used the ISAAC rng to generate random bits (ISAAC is a cryptographically strong RNG, which ought to mean very good quality of randomness and no bias for any of the bits).
Since this is very fast to simulate I ran it for 10 million iterations.
From all those 10 million tries, the minimum number of tosses was, rather unsurprisingly, 10. The maximum was 16929. The average number of tosses was 1023.2
(Conspicuously this differs from your calculated result by almost exactly 9. I wonder if you counted only up to the first toss that's followed by a string of 9 of the same tosses. Ie. you didn't include those 9 subsequent tosses into the count.)
Looks like my question was either ignored, or too complicated, or too hard, or just not noticeable.
Chanoyu, it looks like you're ignoring that the 10 heads may appear as a subsequence. Your calculation holds if you do this: toss 10 times; is it all heads or all tails? No! Then make a fresh new 10 tosses. But what if the sequence was 1 tail, then 10 heads, then 9 tails? Your calculation counts those two rounds as two failures in a row.
Surprisingly, exactly 1023 tosses. I was also surprised when I got this result. Perhaps I have a mistake somewhere.
Alright, I'll give a little info on what I was doing: let's say C(n) is the number of combinations we accept as a success during exactly n tosses. Then G(x) is the generating function of this sequence, and, a bit of a spoiler, it's like the Tribonacci numbers, but "9-bonacci" numbers. The generating function is easily derived; then you just need to prepare it well and plug in a certain number. Here is the whole derivation without text: https://imgur.com/Cm81u7L
Probably for n successive heads/tails the formula is 2^n - 1. But I'm too lazy to redo all the derivations. Also I'm too lazy to replace the number of tosses with a new variable.
By the way, I don't know how you can apply this method to get the number of tosses needed to get 10 heads. Exactly heads! Not something else.
Edit: Actually, it's easy to prove that the number for heads is exactly half of C(n), so for exactly heads you need exactly 2046 tosses on average.
Edit2: Something is wrong with the previous edit. My claim about 2046 was out of thin air. The formulas say it should be 1023/2, which is less, and that's strange. I'll investigate.
Edit3: Ah! Now any number of tails is allowed. So my first edit is wrong in that part. The sequence is completely new. Interesting.
This is certainly true of my solution where you do not decide heads or tails first, but I don't think it applies to the solution where you do decide that before the first cointoss. I'll see whether I can understand it a bit better with some simulations myself.
Edit: I ran the simulations. I discovered my estimate of 2036 was wrong, by 10, because what I said ("10 (the successful sequence) -> 1*10 = 10") is untrue: the sequence can of course still fail on the 10th cointoss. Only if it doesn't is the successful sequence of 10 additional cointosses reached. My program does not disagree (running it three times I got 2045, 2024 and 2057; for lower numbers of consecutive heads it agreed more closely). So either I'm right in this case, or my code is wrong too.
If I apply the same correction to my "don't care about heads or tails" solution, I get an average of 1023 tosses. My code does not disagree; in multiple runs it hovers around that number as well. It really may have been that simple.
This is my Python code; the parts that are only for "both heads and tails" instead of "only heads" are commented out:
import random

consecutive = 10                 # length of the run we are waiting for
coin = ('h', 't')
totaltosscount = 100000000       # total number of tosses to simulate
tosscount = 0
tossesneeded = []                # number of tosses each completed run took
sequence = ''                    # sliding window of the most recent tosses
heads = ''
#tails = ''
for i in range(consecutive):
    heads += 'h'
    # tails += 't'
for i in range(totaltosscount):
    tosscount += 1
    sequence += random.choice(coin)
    if len(sequence) == consecutive:
        if sequence == heads:  # or sequence == tails:
            tossesneeded.append(tosscount)
            tosscount = 0
            sequence = ''
        else:
            sequence = sequence[1:]   # drop the oldest toss, keep the window
print(sum(tossesneeded)/len(tossesneeded))
I once learned this one method one can use to avoid having to collapse an infinite series when calculating expected values.
First, the basics of expected values. Let's toss a die with numbers 1 to 6.
You calculate the expected value by multiplying the probabilities with their outcome:
E = (1/6)*1 + (1/6)*2 + (1/6)*3 + (1/6)*4 + (1/6)*5 + (1/6)*6 = 3.5
Now let's toss a coin and ask for the expected number of tosses until the first Heads.
There is a 1/2 probability it's the first one, so 1 toss. And then there is a 1/2 probability it's something higher than 1.
E = (1/2)*1 + (1/2)*X
You could take that X and expand it into another split case, like X = (1/2)*2 + (1/2)*Y (we multiply the first probability with 2 since we're now at the second toss). Of course, this leads to an infinite series.
But, here comes the trick. Instead of expanding it into more and more split cases, use a self reference somehow.
E = (1/2)*1 + (1/2)*(E+1)
If the first toss wasn't Heads, then we're back at the start with the same expected value, except we already have one toss. It's then simple to calculate that E=2, the expected number of tosses until the first Heads.
Let's use a simplified version of the question. Instead of 10 consecutive tosses, we ask for the expected number of tosses until you have 5 consecutive ones.
I split the total expected value up into Heads or Tails separately. E_h := the expected value for number of additional tosses until 5 consecutive Heads, after already tossing one Heads. And E_t respectively. (Due to Heads and Tails both having a chance of 0.5, it's intuitive that E_h = E_t)
Then it's easy to see that our total E is:
E = (1/2)*(E_h+1) + (1/2)*(E_t+1)
For E_h (or E_t) we now use the trick above.
We start with one H tossed:
HT HHT HHHT HHHHT HHHHH
E_h = (1/2)*(E_t+1) + (1/4)*(E_t+2) + (1/8)*(E_t+3) + (1/16)*(E_t+4) + (1/16)*4
(Note how E_h references E_t, and vice versa, but we can write them both out, insert them, and see that they're indeed the same, I left this out. This also means that it's just E = E_h+1)
E_h = (15/16)*E_h + (30/16)
E_h = 30
E = 31
It's easy to see how that long sum for E_h gets longer with a higher toss requirement, but it can be simplified. Checking the answer for numbers 2,3,4,5 gives 2,6,14,30, so you might guess it's 2^n - 2.
E = 2^n - 2 + 1
E = 2^n - 1
So for n=10 it is indeed 1023 tosses.
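A quick numerical check of that guess (my own sketch, generalizing the equation above with E_t = E_h): write the self-referential equation for E_h for general n and solve it directly:

# E_h = sum_{j=1..n-1} (1/2)^j * (E_h + j)  +  (1/2)^(n-1) * (n-1),  and E = E_h + 1
def expected_tosses(n):
    coeff = sum(0.5 ** j for j in range(1, n))                        # multiplies E_h
    const = sum(j * 0.5 ** j for j in range(1, n)) + (n - 1) * 0.5 ** (n - 1)
    e_h = const / (1 - coeff)
    return e_h + 1

for n in range(2, 11):
    print(n, expected_tosses(n))   # matches 2^n - 1, so 1023 for n = 10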
Suppose that the probability of an everlasting colony is x. After a time lapse, the bacterium will either split or die. If it splits, we have two bacteria. The probability of both of them not generating an everlasting colony is (1-x)^2, so we'll have an eternal colony with probability 1-(1-x)²=2x-x².
Therefore, we have x = p(2x-x²), so either x=0, or 1 = p(2-x) => x = 2 - 1/p
Since x is a number between 0 and 1, the second solution only makes sense for p >= 0.5.
In particular, for p < 0.5 the second solution is negative, so the only admissible answer is x = 0, and the expected number of bacteria (2p)^N tends to 0; note that the population could in principle stay > 0 for all N and still satisfy that limit.
Now, for p >0.5, I will argue that the correct solution is x = 2 - 1/p. If it were x=0, we would have that the expected number of bacteria after N events is 0, but it's in fact (2p)^N, which does not tend to 0 if p>0.5. So the probability should be positive, and therefore x = 2 - 1/p.
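For anyone who wants to sanity-check x = 2 - 1/p numerically, here is a rough Monte Carlo sketch (the generation cutoff and the population cap are arbitrary stand-ins for "everlasting"):

import random

def survives(p, max_generations=200, cap=1000):
    """Follow one colony; call it everlasting once it outgrows the cap."""
    population = 1
    for _ in range(max_generations):
        if population == 0:
            return False
        if population >= cap:    # a colony this big is essentially immortal
            return True
        # each bacterium splits into 2 with probability p, otherwise dies
        population = sum(2 for _ in range(population) if random.random() < p)
    return population > 0

p = 0.8
trials = 5000
estimate = sum(survives(p) for _ in range(trials)) / trials
print(estimate, 2 - 1 / p)       # both should be near 0.75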
EDIT:
I might do this one as well:
I'll give a more formal derivation of Masterjun's answer. The ideas are already there. This process can be modeled as an absorbing Markov chain. They have many interesting properties.
Let us first start with the problem of getting all heads. You can model the problem as having 11 states.
0 -> 1 -> 2 -> 3 -> ... -> 10
For each state, you have 1/2 chance of going from state n to n+1, and one half of going from n all the way back to 0. If you reach 10, you stay there forever, it's what we call a terminating state.
Now, if we call E(n) the expected number of steps until we get to 10 when we start at n, we have
E(n) = 1 + E(0)/2 + E(n+1)/2,
and of course,
E(10) = 0
Rewriting, we get
E(n+1) = 2*E(n) - E(0) - 2,
so:
E(1) = E(0) - 2
E(2) = E(0) - 6
E(3) = E(0) - 14
...
E(n) = E(0) - 2^(n+1) + 2
Since E(10) = 0, we must have E(0) = 2^11 - 2 = 2046. If we just want N consecutive flips, the formula for the expected value is simply 2^(N+1) - 2.
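Here is a tiny Python sketch of that same computation: treat E(n) as a linear function a*E(0) + b of the unknown E(0), push it through the recurrence E(n+1) = 2*E(n) - E(0) - 2, and impose E(10) = 0 at the end:

N = 10
a, b = 1.0, 0.0                    # E(0) = 1*E(0) + 0
for _ in range(N):
    a, b = 2 * a - 1, 2 * b - 2    # E(n+1) = 2*E(n) - E(0) - 2
e0 = -b / a                        # solve a*E(0) + b = 0, i.e. E(10) = 0
print(e0)                          # 2046.0, i.e. 2^(N+1) - 2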
Now, if we want either 10 heads or 10 tails, we could (if we want to) set up another Markov chain.
From an initial state 0, we can go to either 1H or 1T. From there, the chains differ:
1H -> 2H -> 3H -> 4H -> ... -> 10H
1T -> 2T -> 3T -> 4T -> ... -> 10T
The transition from (n)H -> (n+1)H has 1/2 probability. The other possibility is (n)H->1T, which also has 1/2 chance. 10H and 10T are terminating states.
The thing is: the H and T chains are completely symmetric, so the expected values for both of them should be the same. In practice, we can compress (1H,1T), (2H,2T), ..., into a single state.
At the end of the day we are left with a state 0 that goes to a Markov chain which is just like the previous one we discussed, with one node less.
Therefore, we should have E(0) = 1 + E(1) = 1 + 2^10 - 2 = 1023
I remember there's a way to transform problems like this into resistor networks, but I can't figure out how, right now.
--------------
An interesting challenge, which I think you need a computer to do, would be to find the expected value for finding 5 heads then 5 tails (or vice versa), that is, HHHHHTTTTT, or TTTTTHHHHH. The value should be less than 1023. Why?
I don't know what is simple here. The answer is a simple number, yes, but none of the derivations of it is simple.
I don't know how you guys derive it using recurrence relations with expectations or Markov chains, but I was able to apply the method I was using before to the case where you expect exactly heads. And for n heads we need 2^(n+1) - 2 tosses. Derivation: https://imgur.com/qA5d8V4
Forget about my previous post about you needing a computer to find the expected value of different coin patterns. I dug up stuff on the internet and found out that there's an amazing method using martingales that destroys the problem!
If you want to Google it yourself, look for the Abracadabra problem. Anyway, I'll try to simplify the proof removing the martingale formalism.
Suppose you have a casino, and in your betting game you specify a sequence of N coin flips. Let me take HHTT as an example.
Suppose that, at each time n, a gambler arrives at the casino and bets 1 dollar. At this moment, this gambler will bet that H will come up, and the casino will flip the coin randomly.
If the gambler is correct, and the coin does come up heads, the casino pays back 2 dollars. At the next event, the gambler bets these 2 dollars on the next coin in the sequence (in our case H). Notice that this does not stop the next gambler from arriving at n+1! They always arrive, no matter how the coin flip turns out. For every gambler, if he's correct, he gets back double what he bet. If at any point the gambler is incorrect, he simply loses his bet and leaves the casino.
Notice that this game is fair for the gamblers, as can be seen by calculating their expected payoff. If the gambler gets the entire sequence of 4 coin flips right, they leave the casino with 2^4-1 = 15 dollars. If they lose at any point, they leave with -1 dollar. Calculating the odds
E = (1-2^-4)*(-1) + 2^-4*(2^4-1) = 0
If the game is fair for the gamblers, it's also fair for the casino. In particular, for every gambler that loses, the casino takes 1 dollar, and for every gambler that wins, the casino pays 2^4 - 1. Alternatively, for every event where a gambler arrives (no matter if he wins or loses), the casino takes 1 dollar. And for every winning gambler, pays back 2^4.
Now, if the game is fair, it should also be fair when the sequence of coin flips HHTT just showed up. Assume it takes on average t steps for this sequence to show up. At that point, the casino got t dollars from gamblers and had to pay back 16 dollars to the gambler that won. That means
E[t-16]=0, but since expected value is linear:
E[t] = 16
So it takes 16 steps for HHTT to show up!
Now, for HHHH, notice that when the last H flips, not only does the gambler that bet on HHHH win; the three others who arrived after him and bet on HHH, HH, and H are also winning. So, in this situation, the casino must pay 16 + 8 + 4 + 2 = 30. So it takes 30 steps for HHHH to appear.
If you think about it, you can derive a formula, the number of steps is
t = sum(2^s) for every s>0 where the suffix of length s is also a prefix of the string.
In particular, for 10 H's, every suffix of length from 1 to 10 is also a prefix, so we must sum all powers of 2 from 1 to 10, which gives 2046, as mentioned earlier.
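The prefix/suffix formula is a one-liner to code up if anyone wants to play with other patterns (a quick sketch):

def expected_wait(pattern):
    # sum 2^s over every length s at which the suffix of the pattern equals its prefix
    n = len(pattern)
    return sum(2 ** s for s in range(1, n + 1) if pattern[:s] == pattern[-s:])

print(expected_wait("HHTT"))      # 16
print(expected_wait("HHHH"))      # 30
print(expected_wait("H" * 10))    # 2046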
The method used is known as the optional stopping theorem. It looks scary, but it's basically the argument I gave here. It has some ugly conditions (that's because for a general martingale we cannot do the change of summation order we did here), but trust that they are satisfied in this problem.
From this point on, I'm unsure.
Ok. I was able to use Masterjun's approach (or at least the idea) for HHHHTTTT, and for HHHHHTTTTT.
I had a mistake in my first automaton: https://imgur.com/iz5RiTm
Some arrows are steps by H, and other arrows are steps by T. The mistake was that I needed an arrow from the 4th H to itself, instead of an arrow back to 0.
Then, the derivation is here, for the curious.
Indeed, on average you need 256 tosses for HHHHTTTT and 1024 for HHHHHTTTTT.
OK, this result holds quite generally, but proving it is indeed tricky. Because of that, I'll go into the formalism (I warned you! :P)
What is a martingale? A martingale is a process that is "fair". One example is to keep flipping a coin and every time you get heads, receive 1 dollar, and for every tails, pay 1 dollar. As you can see, the expected return of playing this game is 0. Suppose you got really lucky and amassed 10 dollars playing this, it still makes no sense to keep playing, the expected return is still 0. Likewise, if you were unlucky and lost 10 dollars, it makes no sense to keep playing to recover money.
This is the idea of a martingale. After drawing from the random variables, the information you get by looking at the sequence of draws (actually, in general, it's modeled as a filtration, but we don't need to get into these details) does not help you increase your odds at all.
The game of the gamblers against the casino, where they win by guessing the coin results and which runs infinitely many times, is also a martingale. If we have a game that runs forever and we set a condition for it to stop (in our case, we stop the game when the sequence turns up), it's a general result that (subject to some conditions on the stopping procedure) the stopped process of a martingale is also a martingale, and thus "fair".
Intuitively, think of it like this. Suppose you go to a casino game with a 50% chance of doubling your bet, so that the expected return is zero. If the stopped process of this martingale were not a martingale, then you could beat the casino by going there, doubling your bet every time you lose, and leaving as soon as you are one dollar up. In fact, what happens is that you start with a finite amount of money, so you have a high chance of getting out of the casino 1 dollar richer, but a small chance of losing everything, and the betting strategy does not help you at all.
I don't want to write a lot of text about it, but I'll phrase precisely what I don't understand:
p4wn3r wrote:
Now, if the game is fair, it should also be fair when the sequence of coin flips HHTT just showed up.
What does it mean to be fair at some moment?
p4wn3r wrote:
At that point, the casino got t dollars from gamblers and had to pay back 16 dollars to the gambler that won.
How do you calculate that at some moment they have spent X dollars? It looks like it comes out of thin air.
p4wn3r wrote:
E[t-16]=0
Where does this equation come from? What is t?
The more I ask, the more I understand.
It looks like you are not calculating an expectation. You just calculate the balance of the casino. And you assume all of them are betting on HHTT. And at the moment HHT has arrived, the people who were betting have already left. And the next three will get HTT, TT, T, and you can calculate whether they have left already or not. Or maybe I don't understand again.
But I see here a relation to how you could propagate the expectation in the automaton I was drawing. You can move forward and cut the loops using the infinite sum of a geometric progression. You'll get the same result. It's tricky, though.
Now, if the game is fair, it should also be fair when the sequence of coin flips HHTT just showed up.
What does it mean to be fair at some moment?
It means that, if the game, played infinitely, is fair, then the game, when it's played only until the sequence HHTT shows up, should also be fair.
r57shell wrote:
p4wn3r wrote:
At that point, the casino got t dollars from gamblers and had to pay back 16 dollars to the gambler that won.
How do you calculate that at some moment they have spent X dollars? It looks like it comes out of thin air.
We can see the game as: every turn a gambler arrives and gives a dollar to the casino. When a gambler gets N coin flips right, the casino loses 2^N.
r57shell wrote:
p4wn3r wrote:
E[t-16]=0
Where does this equation come from? What is t?
t is the amount of dollars the casino receives from gamblers, which, by the arguments above, is also the number of turns on average that it takes for the sequence HHTT to show up. Since the game is fair, the expected balance is 0, which means that the number of turns on average for the sequence to show up is 16.
r57shell wrote:
It looks like you are not calculating an expectation. You just calculate the balance of the casino.
The whole point of the argument is that the balance of the casino is intrinsically tied to the expected number of turns for the sequence to show up. They are so tied together, that evaluating the balance allows you to evaluate the expected number of turns.
----------------------------
Now for another problem. Some time ago there was a lot of discussion on the internet about the sum of all natural numbers. In the video I linked, Frenkel gives his interpretation of the formula that sets this value to -1/12.
I will give my point of view looking at it from algebraic number theory, which, although many people say is an area of math, is not. It is a philosophy of life. One interesting theme that emerged from algebraic number theory in the first half of the 20th century is that one should not privilege the real numbers as the most natural completion of the rationals. In fact, other completions like the p-adic numbers are possible, are in some ways more useful, and can in fact be used to give a rigorous definition of some nonsensical identities.
So, for this exercise, replace the usual absolute value of the real numbers with p-adic norm for p=2, also known as the 2-adic norm or the dyadic norm. Prove that, with respect to this metric:
1 + 2 + 4 + 8 + ... = -1
Now, if the game is fair, it should also be fair when the sequence of coin flips HHTT just showed up.
What does it mean to be fair at some moment?
It means that, if the game, played infinitely, is fair, then the game, when it's played only until the sequence HHTT shows up, should also be fair.
Ah, alright. Now I get it. So basically, imagine a casino which closes its business as soon as HHTT arrives, and all the gamblers are also awaiting HHTT. The expected balance is still 0 at the end, so the expected number of steps should just equal the total winnings paid out at the last step. So for HHTT only the gambler who bet on the whole sequence wins, and all the others lose. But for HHHH all of them win, hence the sum of powers of two. Ok. Now it's all clear. The suffixes and prefixes are also clear now.
I didn't get the proof though, never mind.
So, for this exercise, replace the usual absolute value of the real numbers with p-adic norm for p=2, also known as the 2-adic norm or the dyadic norm. Prove that, with respect to this metric:
1 + 2 + 4 + 8 + ... = -1
Well, it seems no one solved it. It's not very hard, though.
Recall the notions we need to define "convergence". The real numbers arise when we introduce a norm in the field of rational numbers, and define them as equivalence classes of Cauchy sequences.
The norm must satisfy some properties:
(1) |x| >= 0
(2) |x| = 0 => x = 0
(3) |xy| = |x||y|
(4) |x+y| <= |x| + |y|
The p-adic norm is defined as follows. 0 has norm 0, while any nonzero rational number may be written uniquely as x = p^n*a/b, where a and b are coprime and neither is divisible by p, and n is any (possibly negative) integer. In this case |x| = p^(-n).
It's possible to prove that this norm satisfies all the properties (1)-(4). In fact, it satisfies a stronger version of (4): |x+y| <= max(|x|,|y|). This relation is usually called the non-archimedean triangle inequality. Anyway, since (1)-(4) are satisfied, the convergence theorems for the reals all carry over to the p-adic numbers.
In particular, the series 1 + 2 + 4 + 8 + ... converges to -1 if, for every epsilon > 0, the sequence of partial sums s(n) satisfies |s(n)+1| < epsilon for all n beyond some N(epsilon).
In fact, we have s(1) = 1, s(2) = 3, s(3) = 7, ..., s(n) = 2^n - 1. So |s(n)+1| = |2^n| = 2^(-n) < epsilon whenever n > -log2(epsilon), and the required condition is satisfied.
Another way is to prove convergence, without calculating the value. For example, the n-th term added, 2^n, has norm 2^(-n), and by the ratio test, the series should converge. Therefore, you are allowed to do crazy stuff like:
S = 1 + 2 + 4 + 8 + ... = 1 + 2(1+ 2 + 4 + ...) = 1 + 2S => S = -1
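If you want to see the convergence numerically, here is a small Python sketch that computes the 2-adic norm of s(n) + 1 for the first few partial sums; it shrinks like 2^(-n):

from fractions import Fraction

def two_adic_norm(x):
    """|x|_2 = 2^(-v), where v is the exponent of 2 in x = 2^v * a/b."""
    if x == 0:
        return Fraction(0)
    v = 0
    num, den = x.numerator, x.denominator
    while num % 2 == 0:
        num //= 2
        v += 1
    while den % 2 == 0:
        den //= 2
        v -= 1
    return Fraction(1, 2) ** v

s = Fraction(0)
for n in range(1, 11):
    s += 2 ** (n - 1)               # partial sum 1 + 2 + ... + 2^(n-1)
    print(n, two_adic_norm(s + 1))  # |s(n) - (-1)|_2 = 2^(-n), heading to 0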
If anyone is still interested in exercises like this, another one would be proving
1 - 2 + 4 - 8 + 16 - ... = 1/3
Again, using the 2-adic norm. It's the same idea.