He defines exponentiation for positive integers (but only really by example) and then once real numbers are established to exist he defines roots and proves they exist. But Rudin doesn't actually mention negative powers, despite proving some other very simple properties of addition and multiplication, like if x + y = x + z, then y = z. But I don't know that I'm not "allowed" to assume they exist. Perhaps the intention was I am allowed, it was just unclear to me, so I operated conservatively.
Build a man a fire, warm him for a day,
Set a man on fire, warm him for the rest of his life.
Challenge
There is a wheel that costs 1 coin to spin, and it rewards you according to the following probability table:
5/9 : 0 coins
1/3 : 2 coins
1/9 : 3 coins
You have 5 coins and you need at least 9 coins to purchase the prize you want. Using this wheel, what is the (exact) probability that you win those 9 coins (before you run out of coins and cannot play anymore)?
Observation: Because the EV is exactly 1 coin gained to every coin lost, this is more or less a martingale. However, I don't know much about martingales, so I'm not sure what to do with this information, other than to justify the approximation of x/y as the probability where x is the number of starting coins and y is the number of goal coins.
Solution method spoiler: I've already solved this problem and I have an exact answer using Markov chains, but because it relies on the analysis of a particular n+1 by n+1 matrix, it's not very amenable to extrapolation to large n, so I was hoping someone else had a more clever approach.
Solution form spoiler: The form of the solution is the ratio of two 7 digit numbers that roughly approximates 5/9.
Build a man a fire, warm him for a day,
Set a man on fire, warm him for the rest of his life.
In the past, I would have just been like "just set up the 10x10 matrix and solve it". Which is what you mentioned in your spoiler already.
However, earlier this year I was looking up information about expected length of random walks, and learned something that was intuitively simple, yet profound:
A random walk can be modelled as a recurrence relation.
Think about it. The rule for moving in a random walk is the same for a whole range of values, so it is just like a recurrence relation. The initial values would be the end states of the random walk (in this case, having 0 coins = 0 and 9 or 10 coins = 1 ), which would be used to determine the coefficients in the solution.
So for this example, letting xk be the probability of winning (at least 9 coins) given you have k coins, the recurrence relation is:
xk+1 = (5/9)xk + (1/3)xk+2 + (1/9)xk+3 for 0≤k≤7, x0=0, x9=x10=1
Or in other words:
xk+3 + 3xk+2 - 9xk+1 + 5xk = 0 for 0≤k≤7, x0=0, x9=x10=1
The characteristic equation is r3+3r2-9r+5, which factors as (r+5)(r-1)2 (it helps to see that, not only is r=1 a root of any recurrence based on a random walk, but r=1 has at least multiplicity 2 if it is a martingale). So the solution to this recurrence is of the form:
xk = A+Bk+C(-5)k for 0≤k≤10
Using x0=0, we get A=-C. With x9=x10=1 and A=-C, we get 9B-1953126C=1 and 10B+9765624C=1. Subtract the two resulting equations to get B=-11718750C and so C=-1/107421876. So A=1/107621876 and B=11718750/107421876 and so:
xk = (1/107421876)( 1 + 11718750k - (-5)k ) for 0≤k≤10
and with k=5, this gives x5 = 1627691/2983941 = 0.545483...
This model is definitely easier to extrapolate to large n (as opposed to dealing with (n+1)x(n+1) matrices!)
Regarding martingales and the approximation x/y where x is the number of starting coins and y the number of goal coins, you're probably thinking of the simplest martingale: losing or gaining 1 coin with equal probability (1/2) at each step. In that case, x/y isn't just an approximation; it's the actual probability! (Because xk = (1/2)xk-1 + (1/2)xk+1, every three consecutive terms, and thus the whole sequence, has to be in arithmetic progression.) The model given in the problem probably isn't too far off from this model then.
I have a very different method for solving the problem. It’s by no means as simple or elegant as FractalFusion’s approach, but I think it’s pretty cool and seems to get the right answer. I don’t know anything about martingales or Markov chains, but I do know a bit about scattering theory in physics, and I was able to translate the problem into a scattering problem. Apologies in advance as this will be a bit long and probably not very straight-forward to follow.
Imagine a network of nodes and links that looks something like this
– 1 – 2 – 3 – 4 – 5 – 6 – 7 – 8 –
The numbers (nodes) represent how many coins you have, and the dashes, which connect different nodes are going to allow you to move left or right within the network. The physics problem I’m imagining might be something like a series of optical waveguides (lines) with connections (numbers) at which light can scatter (either forwards or backwards). The idea is that we’re going to inject light into the network from the left side and find the probability (or intensity) of the light that makes it to the end.
Three things can happen at each turn in the game and each one corresponds to a different type of scattering in the network:
lose a coin (p = 5/9) = scatter left in the network
gain a coin (p = 1/3) = scatter right one node
gain two coins (p = 1/9) = scatter right two nodes (more on how this works shortly)
To model the light in the network I’m going to suppose there are 4 different channels in each connection, a, b, c and d. Channel a is for all light that propagates to the left. Channels b, c and d will all be for right-propagating light but have different functions with respect to the game. Channel b will account for the case of when you win 1 coin (move right one node), channel c for when you win 2 coins (move right two nodes), and d will let me ‘skip ahead’ so that the light I initially inject into to the system can straight to go node 5 (since the game starts with 5 coins). The light in each link can thus be described by a vector of four components and scattering at the nodes can be described by setting up 4x4 scattering matrices ( https://en.wikipedia.org/wiki/S-matrix ) at the nodes. I need three different scattering matrices: one for before the entry point (node 5), one for at the entry point, and one for afterwards.
I'm going to use plain variables for channels on the left side of a node, and a prime for channels on the right side of the node. For nodes 1, 2, 3, 4 we have the matrix equation (forgive the formatting)
a 5/9 0 0 5/9 b
b' = 1/3 1 0 1/3 c
c' 1/9 0 0 1/9 d
d' 0 0 1 0 a'
The way this works is that light injected into the skip channel (d) stays in there and won't be scattered into any other channel. This channel acts independently of all the others. Light that comes into the node from channel c on the left all goes into channel b on the right. This ensures that all the light that scatters into channel c from the left effectively skips ahead one node (so it corresponds to winning 2 coins in the game). At the entry node (node 5) we have
a 5/9 0 5/9 5/9 b
b' = 1/3 1 1/3 1/3 c
c' 1/9 0 1/3 1/9 d
d' 0 0 0 0 a'
which is more or less the same matrix, but now light coming from the skip channel gets evenly distributed among all the other options, which means light is now entering into the network at this point. For nodes to the right of the entry point the 'skip' channel becomes useless and we just have
a 5/9 0 0 5/9 b
b' = 1/3 1 0 1/3 c
c' 1/9 0 0 1/9 d
d' 0 0 0 0 a'
There may well be a simpler way to set everything up, but this approach seemed fairly intuitive to me.
Now we need to get the scattering matrix for the entire network so we can relate the channels at the far left to the far right. Then we can just inject light with intensity 1 into the skip channel at the left end and see how much comes out at the right end. One way to do this is to convert each 4x4 scattering matrix into a 4x4 transfer matrix (see for example http://assets.press.princeton.edu/chapters/s8695.pdf ). You can work these out by essentially finding the 'partial inverses' of the scattering matrices ( https://en.wikipedia.org/wiki/Partial_inverse_of_a_matrix ), although you need to be a bit careful with the matrix block structure. Once you have the transfer matrices you can just multiply them together to get the transfer matrix for the whole system, which can then be converted back into a scattering matrix. If I use M to denote transfer matrices, the transfer matrix for the whole system is just
M_right ^ 3 * M_entry * M_left ^ 4
Convert this back into a scattering matrix S and finally compute S * [0, 0, 1, 0]. My little Python program outputs
[a, b', c', d'] = [0.45451636, 0.45483641, 0.09064723, 0]
Finally, since the network stopped at node 8, we need to remember that I win if I'm in either channel b or c. b is light that entered 9 directly from node 8 (you won 1 coin when you had 8) and c is light heading towards an imaginary node 10 from node 8 (you won 2 coins from node 8). So we just need to add
0.45483641 + 0.09064723 = 0.5454836405947694
which seems to be at least close to the right answer.
There is an old question, which is easy to guess, but I have no idea how to prove. Perhaps there are some experts who can prove it.
Suppose you have a pizza. And there are two possibilities: you'll have party of N people or party of M people. (including yourself).
In what minimum number of slices (pieces) you need to cut this pizza, such that in both possibilities each member will able to get same amount of pizza without making additional cuts, and there will be no remaining slices (pieces) in both cases. Slices can be any size.
For example, N = 2, M = 3, answer is 4.
I'm interested in proof!
Edit: in one nice discord server one guy sent me link to a proof on youtube (: So I have a proof! Yay!
I would be interested to know as well, r57shell. I played around for a very short time with it, and it seemed complicated to me.
Challenge.
There is a box that magically produces mangos. You may check it at most once a day, and every day it has a 5% chance of spawning a mango, unless there is already a mango in the box. However, if you open the box to check, regardless of whether or not you find a mango inside, you cannot open the box for 3 days, and the box will not spawn a mango for those 3 days.
How often should you check the box to maximize your expected rate of mango acquisition?
Build a man a fire, warm him for a day,
Set a man on fire, warm him for the rest of his life.
I don't have a complete answer.
It is easy to show that M+N-gcd(M,N) pieces is enough. There is a cutting of the pizza into M equal slices, and a cutting of the pizza into N equal slices; overlay them together so at least one of the cuts from one of them coincides with at least one of the cuts from the other. Then there is an overlap of gcd(M,N) cuts, so there are a total of M+N-gcd(M,N) pieces.
The hard part is showing that you can't use less than M+N-gcd(M,N) pieces. I don't have an answer for that.
First, let n be the number of days (inclusive) from when the box starts producing mangos until you open it. Then the expected rate of mango acquisition is (1-(0.95)n)/(n+3).
Next, I prove the following statements (n is a positive integer):
(0.95)n < 20/(n+23) for n≥10
(0.95)n > 20/(n+23) for 1≤n≤9
For the first, (0.95)10 < 20/(10+23). Now assume (0.95)k < 20/(k+23). Then (0.95)k+1 = (19/20)(0.95)k < (19/20) 20/(k+23) = 19/(k+23) < 20/(k+24) = 20/((k+1)+23), thus proving the first by induction.
For the second, (0.95)9 > 20/(9+23). Now assume (0.95)k > 20/(k+23). Then (0.95)k-1 = (20/19)(0.95)k > (21/20) 20/(k+23) = 21/(k+23) > 20/(k+22) = 20/((k-1)+22), thus proving the second by reverse induction.
Why do I look at these two statements? Because:
(1-(0.95)n+1)/(n+4) < (1-(0.95)n)/(n+3) is equivalent to (0.95)n < 20/(n+23), and
(1-(0.95)n+1)/(n+4) > (1-(0.95)n)/(n+3) is equivalent to (0.95)n > 20/(n+23).
These can be verified by expanding the left inequality and solving for (0.95)n.
This means that (1-(0.95)n)/(n+3) is decreasing on [10,∞) and increasing on [1,10].
So the max of (1-(0.95)n)/(n+3) over all positive integers n is n=10. Putting in the value gives:
(1-(0.95)10)/(10+3) = 0.0308663...
So n=10 gives the highest expected rate at (1-(0.95)10)/(10+3) = 0.0308663... mangos per day.
This post was inspired by this week's Riddler Classic but is not strictly about that problem. Instead, it's what appears to be a counterexample to the (negative) solution to Hilbert's third problem. I'm hoping someone can help me figure out where my reasoning is presumably wrong. I'm also going to avoid drawing any figures, but I can produce them if you think it would be helpful.
As part of attempting the Riddler Classic, I broke things down into a four dimensional region in which 0<w<x<y<z<1. (You can sort of see why I would want to do this, with w, x, y, and z being the four teams in increasing order of quality.) Like your typical mere mortal, I'm no good at visualizing things in four dimensions so I decided to draw the corresponding problem for three dimensions: 0<x<y<z<1. This volume is a tetrahedron embedded within the unit cube. With a little reasoning, you can determine the edges of this tetrahedron:
x=y=0, z free
y=z=1, x free
x=0, z=1, y free
x=0, y=z
z=1, x=y
x=y=z
Upon inspection, you can see that each of these edges is indeed on the boundary of 0<x<y<z<1. There are six of them, so it is a tetrahedron as well.
Well here's the insight: My choice of 0<x<y<z<1 is arbitrary and I can permute the order of these variables in six different ways (3!=6). Every point within the cube must be represented by one of these six permutations and by symmetry, each such tetrahedron must have a volume of one-sixth of the cube. Since it's a unit cube, the volume of any such tetrahedron is 1/6.
We can proceed from here in two equivalent ways. First, on its face, I've determined the volume of a certain tetrahedron from its base (which we know to have area 1/2) and it tapers uniformly to a point from that base to produce total volume 1/3*1/2 = 1/6. Alternatively (what I did), we can consider a particular second permutation: 0<y<x<z<1. I was able to show that this tetrahedron lies adjacent to the first tetrahedron such that the entire z=1 face of the cube is represented by both together and they taper to a point at x=y=z=0. This is thus a pyramid whose base is area 1 and height is 1 for a total volume of 1/6 + 1/6 (the two tetrahedra) = 1/3.
This bugs me. Hilbert's third problem was motivated by failed attempts by Gauss to produce a cut-and-glue formula for the volume of a pyramid and I seem to have done essentially that. It was later shown, using mathematics beyond my pay grade (Dehn invariants), that such a solution does not in general exist. Furthermore, once we have the volume of one pyramid-like solid, we can apply uniform scaling and/or Cavalieri's principle to show that the volume of any pyramid-like solid must also be 1/3 * base * height. I've proved this formula (Maybe? I think?) without using a limiting process or calculus, which Wikipedia seems to indicate is strictly necessary.
So what gives? My reasoning seems pretty ironclad, so my guess is that I've just misinterpreted the thrust of Hilbert's third problem. Wikipedia also seems to suggest that Cavalieri's principle arises from calculus but... that seems like some pretty weak calculus. The essence of Cavalieri's principle can be taught to a young student, years before they do any calculus whatsoever.
Here is a proof for pizza question. By the way, I didn't know proof before hand, I asked here in case someone knew / could find a proof. So, here is at least some proof.
https://www.youtube.com/watch?v=h0YHMOkKsy4 check out my comment to this video for additional details (if it's still there)
Main problem for this task is... cost function. You can't just take a volume, because it's not what asked. What asked is quality of champion on average. And for chosen tuple of (a,b,c,d) with properties a<=b<=c<=d, it's already complete mess! I don't know how to simplify it.
Well, Hilbert's third problem concerns whether, given any two polyhedra of equal volume, there is a general method to cut the first one into a finite number of pieces and rearrange into the second, and the answer was proven to be no. That you have found a cut-and-glue method for specific polyhedra does not change this answer.
Also, the usage of Cavalieri's principle is outside the scope of this problem; it is not considered a cut-and-glue method in the context of Hilbert's third problem.
I'm looking for some help with creating a formula for the following situation:
Suppose that a script exists which can alter an input value based on a user's request (and the input is 1 byte, storing values between 0 - 255).
The User can alter the following parameters, and the script will alter (or choose not to alter) the input on each frame relative to its initial starting value, which will always be 128 for this example:
------------------------------------------
1. lowerSubtractionLimit - the maximum amount that could be subtracted from the input
2. upperAdditionLimit - the maximum amount could be added to the input
3. probabilityOfRandomInput - the probability (between 0 - 1) that the input is changed at all from it's starting value.
4. currentValue - a constant, which in this case is always 128 (not specified by the user).
For simplicity sake, I'll define the following 2 additional variables, which will effectively replace the first 2:
1. lowestPossibleValue - equals 128 minus lowerSubtractionLimit. This represents the lowest possible input that could be selected to replace the current one (even if it's less than 0)
2. highestPossibleValue - equals 128 plus upperAdditionLimit. This represents the highest possible input that could be selected to replace the current one (even if it's greater than 255)
--------------------------------------------
After the user selects the values for the non-constant variables, the following logic is applied on each frame in a while loop:
while TRUE
1. A random number is selected, which has the same percent chance of being in a specified range as probabilityOfRandomInput has (ex. if this value is 0.23, then there is a 23% chance that the random number is in the specified range). If the number isn't in the specified range, then the input is unchanged (i.e. it remains 128), and we skip ahead to step 6. Otherwise, we go to step 2.
2. A SECOND random number is now selected, which is a random number in the range of abs(highestPossibleValue) + abs(lowestPossibleValue). This number is then added to the lowestPossibleValue to get the final value that we want to set the input to. We'll call this "newValue".
3. If newValue is less than 0, then it is set to 0.
4. If newValue is greater than 255, then it is set to 255.
5. We now set the input equal to newValue.
6. FRAME ADVANCE HERE
END
------------------------------------------------------------------------------------------------------------------------
Final Question:
In terms of lowestPossibleValue, highestPossibleValue and probabilityOfRandomInput, what is the formula to calculate the expected average value of the input after running the above logic for many frames?
Here's one that I came across recently. It's simple but tricky.
Suppose you have a two-dimensional affine transformation T defined by six parameters as:
x' = a + bx + cy
y' = d + ex + fy
From this transformation derive another transformation U such that U flips the y axis, that is, if we let (x,y)* = (x,-y), we should have for every point:
U(x,y)=T(x,y)*
How do the coefficients of U relate to those of T?
If T(x,y) = (x', y'), and U(x, y) = (x', -y'), then -y' = -d - ex - fy, which seems to suggest that the answer is simply "you negate d, e, and f." But you mentioned it was tricky, so I fear I am almost certainly missing something?
Build a man a fire, warm him for a day,
Set a man on fire, warm him for the rest of his life.
Hmm, apparently not even I know what's going on, since I phrased the question wrong xD.
Anyway, here's the context. I was coding a renderer for fractal flames, and when I got a sample picture, I realized that normal renderers were plotting the picture with the y axis flipped. I had no idea why, because I had remembered to flip the y axis when saving the image.
Eventually I realized that it was all because my renderer was serializing the coefficients of the affine transformation differently than traditional ones, and that was causing the flip. For example, when regular software showed a positive coefficient somewhere, it would save a negative one in the file anyway. So I had to figure out how to change them so that I got a flipped transform. The answer, in this question's notation, is to actually negate d, e, and c.
My reasoning was: the point (0,0) was sent to (a,d), now it goes to (a,-d), so d must flip. (1,0) was sent to (a+b,d+e), now (a+b,-d-e), so e must also flip. But the y axis was also flipped, so if before we had (0,1) sent to (a+c,d+f), we now need (0,-1) to go to (a+c,-d-f). For this to happen, we need to keep f the same, and flip c.
But now I realize that the transformation U we need is actually defined by U(x,-y)=T(x,y)*. Does anyone know why it's this transformation we must consider instead of the other definition?
Your question appears to be about change of basis, where you are taking a transformation T over an old basis and needing to represent it as a transformation U over the new basis. (Here, "old basis" is the original basis you intended and "new basis" is the basis that the program is using which is not the same as yours).
If P is the change-of-basis transformation that represents the new basis in terms of the old basis (P goes from new→old), then:
U=P-1TP
This is because, given a point in the new basis, you first have to return it to the old basis (P), then transform that in the old basis (T), then bring it back to the new basis (P-1).
Here, P(x,y)=(x,-y) and P-1(x,y)=(x,-y), so:
U(x,y) = P-1TP(x,y) = P-1T(x,-y) = P-1( a+bx-cy , d+ex-fy ) = ( a+bx-cy , -d-ex+fy ),
which is as you said: c, d and e have their signs flipped.
Note that the above is the same as writing U(x,y) = (T((x,y)*))* ( if we were to use the notation * )
A combinatorial problem from Riddler. I'm paraphrasing the problem here:
----
Consider sequences consisting of digits 1,2,3, ..., n (think of it like a number in base n+1). The sequence must have the following properties (order of digits is from left to right):
* The first digit is always 1 and the last digit is always n. 1 and n cannot show up anywhere else.
* If a digit (other than the last digit) has a value of k, the following digit must be k+1, unless one chooses instead to perform a revert (see below).
* A revert is a descent from a digit k to the following digit i, where 2≤i≤k-1, subject to the rule that, once a revert is performed to a digit i, any future revert must be to a number greater than i. That is, the numbers reverted to (if any) must form a strictly increasing sequence from left to right.
For example, if n=8, possible sequences include (adding underscores to show reverts):
12345678 (no reverts)
1234567_234567_34567_4567_567_678 (maximum length, going up to 7 each time reverting to 2,3,4,5,6)
12345_234_34567_5678 (reverted numbers are 2,3,5; note that this is strictly increasing)
The question is: For n=8, how many different possible sequences of this kind are there? How about for any integer n≥3 in general?
Let f(n) be the number of sequences starting at 1 and running until n. Note that after a revert to i the total options for the sequence starting at i and going to n is the same as starting at 1 and running to n-i+1, so this gives us a recurrence. Let f(i, n) = f(n-i+1)
Our trivial base cases are f(1) = 1, f(2) = 1, f(3) = 1. From f(4) when we reach 3 for the first time we can choose to revert to 2 or not, so f(4) = 1 + f(2,4) = 1 + f(3) = 2. For f(5) we can make that choice at 3 or 4, additionally, at 4 we have the choice to revert to either 2 or 3: f(5) = 1 + 2f(2,5) + f(3,5) + = 1 + 2f(4) + f(3) = 1 + 4 + 1 = 6. Enumerating to double check 12345, 1234_345, 1234_2345, 1234_234_345, 123_2345, 123_234_345. Checks out. It seems that, in general, we have f(n) = 1 + n-3 f(2, n) + n-4 f(3, n) + ... + f(n-2, n) = 1 + sum_i=0^n-4 (n-3-i) f(2+i, n) = 1 + sum_i=1^n-3 (n-(i+2)) f(n-i)
Because our indexing seems to be mostly involving offsets of 2 (because the stubborn start and end not participating like other values). Let's call g(n) = f(n+2). Now we have the relatively simpler, g(n) = 1 + sum_i=1^n-1 (n-i) g(n-i) = 1 + sum_i=1^n-1 i g(i)
If we take successive differences we have g(n) - g(n-1) = (n-1) g(n-1). Rearranging a bit we have: g(n) = n g(n-1), but that's the recurrence for a factorial function! (Which is what we have, the next few values of f(n) are 24, 120, 720, ...)
Build a man a fire, warm him for a day,
Set a man on fire, warm him for the rest of his life.
A short note regarding the previous post: I did similarly to OmnipotentEntity, but it turns out that Riddler has an explanation which doesn't even use recurrences.
----------------------------
I was thinking of something very recently.
On the left is the classic 3-4-5 triangle, which has an area of 6. Notice the sides are in arithmetic progression.
Now, is 3-4-5 the largest (by area) triangle with sides in arithmetic progression having one side of length 3? The answer is no. There is the 3-5-7 triangle on the right whose area is 15sqrt(3)/4 ≈ 6.495.
Question: What is the largest (by area) triangle with sides in arithmetic progression having one side of length 3?
Note that the other two sides can be any real numbers (do not need to be integers).
This looks like a job for Hero's Formula!
sqrt(s(s-a)(s-b)(s-c)), we want the stationary point, which will obviously need to be a maximum, substituting in s = 1/2 (a+b+c), and a=3, b=3+k, and c=3+2k, and simplifying a little we get.
3/4 sqrt(-k4-4k3+6k2+36k+27)
Recognizing that we just want the stationary point of this, and the form of the derivative of the sqrt(f(x)) is C f'(x)/sqrt(f(x)) for some constant, the zeros will be isolated in the numerator. Thus will be the same as the zeroes of -4(k3 + 3k2 - 3k - 9), which has three zeroes at -3, and plus/minus sqrt(3).
The area of the triangle 3, 3+sqrt(3), 3+2sqrt(3) is 3/2 sqrt(9 + 6sqrt(3)) = 6.6055...
There must be a better way to do this, because using Hero's formula directly is always heinous and way more complicated than it needs to be just as a rule of thumb, though the result itself is interesting.
Build a man a fire, warm him for a day,
Set a man on fire, warm him for the rest of his life.
I have a bag of tiles. The tiles can be either blue or yellow, but you do not know how many are blue and how many are yellow. You draw 12 tiles without replacement: 4 blue and 8 yellow.
Is the next tile you draw more likely to be blue or yellow?
Does the answer depend on the number of tiles originally in the bag?
What is the chance that the next tile you draw is blue?
Build a man a fire, warm him for a day,
Set a man on fire, warm him for the rest of his life.
Joined: 12/28/2013
Posts: 396
Location: Rio de Janeiro, Brasil
I think this requires a bit extra context? It feels like it's necessary to make some assumptions about the probability distribution of the colours in the bag.
Edit: If we assume the amount of yellow tiles follows an uniform distribution, it seems the probability of the next tile being each color doesn't depend of the total of tiles. Didn't think of a strictly formal proof but it looks interesting.
One simple case is drawing only a single tile. If it is yellow, the next tile is also yellow with probability 2/3.
Regarding the triangle problem I posted:
Obviously 3 is the smallest side; otherwise we can scale up the triangle to get one with larger area.
Hero's Formula is the most direct way of finding the other two sides, although it is more convenient to use a=3, b=r, c=2r-3 in the formula A=(1/4)sqrt((a+b+c)(-a+b+c)(a-b+c)(a+b-c)):
A^2 = (1/16)(3r)(3r-6)(r)(6-r) = (-9/16)r^2(r-2)(r-6) = (-9/16)(r^4-8r^3+12r^2)
So solve 4r^3-24r^2+24r=0 ----> r=0 (reject) or r^2-6r+6=0 ---> r = (6±sqrt(36-24))/2 = 3±sqrt(3) (reject negative)
So r=3+sqrt(3) and you can check it gives the maximum.
Now actually if you take 3, r, 2r-3 triangle with angle θ in-between the sides 3 and r (see below), using the Cosine Law and rearranging gives the polar equation r=4-2cos(θ) which is the limaçon shown below. The triangle 3,3+sqrt(3),3+2sqrt(3) gives the red triangle below, and you can see it coincides with the point on the curve with the greatest y-value. (The points on this curve represent the third vertex of a 3,r,2r-3 triangle, if the base is on (0,0) to (3,0). So the one with the greatest y-value will have the largest area.)
----
As for the bag of tiles question, I don't have an answer yet (I'm still trying to be sure of what is assumed. A priori, is each tile blue or yellow with equal probability, independent of the others?
So the initial probability that a blue or yellow tile is drawn is an unknown. Assuming the bag initially has 50% blue and 50% red tiles is not justified.
As you may have observed, the only assumption you need, once you condition upon the likelihood of the observation of the initial draw, is that the probability of any tile being drawn (independent of its color) is uniform.
Build a man a fire, warm him for a day,
Set a man on fire, warm him for the rest of his life.
So, if I'm getting this right, are you saying that we make no assumptions about the initial probability distribution (this is before anything is drawn at all)? Not even that the initial probability distribution is a binomial distribution?