A problem, communicated to me by co-blogger Steve Miller, which he's too busy to post: two players, A and B, start out with a and b dollars, respectively, where a and b are natural numbers. They flip a fair coin, and every time it comes up heads, A gives 1 dollar to B; for every tails, B gives 1 dollar to A. The game ends when one of the players has zero dollars (and the other one has a+b). What's the probability that player A wins?
I was able to guess the answer right away (Steve told this to me on the phone), but this is probably more luck than anything else, as these intuitions can often be misleading. In any case, an answer is worthless without a proof, which Steve tells me is not entirely trivial. Furthermore, we don't know what happens if the coin, instead of being fair, has bias p. Anyone out there know?
18 comments:
The expected wealth of the first player at every point in the future = a. When the game ends, the first player has either a+b dollars or zero. So the chance that the first player wins is a/(a+b).
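A quick Monte Carlo sanity check of this (a minimal sketch; the function name and parameters are illustrative):

```python
import random

def simulate(a, b, trials=100_000):
    """Estimate the probability that A wins, starting from a vs. b dollars."""
    wins = 0
    for _ in range(trials):
        wealth = a
        while 0 < wealth < a + b:
            # Fair coin: A's wealth moves up or down by one dollar.
            wealth += 1 if random.random() < 0.5 else -1
        wins += wealth == a + b
    return wins / trials

print(simulate(3, 7))  # should come out near 3/10
```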
For general p, I believe a little options-pricing theory tells us that the probability is:
{sum(from i=0 to i=a-1) [((1-p)/p)^i]} / {sum(from i=0 to i=a+b-1) [((1-p)/p)^i]}
which for p=1/2 reduces to a/(a+b), as it should.
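In code (a minimal sketch of the formula above, taking p to be the probability that A gains a dollar on a given flip; the names are illustrative):

```python
def win_probability(a, b, p):
    """Brad's formula with r = (1-p)/p: sum_{i<a} r^i / sum_{i<a+b} r^i."""
    r = (1 - p) / p
    num = sum(r**i for i in range(a))
    den = sum(r**i for i in range(a + b))
    return num / den

print(win_probability(3, 7, 0.5))  # 0.3, i.e. a/(a+b)
print(win_probability(3, 7, 0.6))  # roughly 0.716: a bias toward A helps a lot
```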
Thanks, Brad -- that's a slick solution (I still have to work out the one for general p).
Unless I'm missing something...
Why not construct a (finite) Markov chain with a+b+1 states labeled 0,1,...,a,a+1,...,a+b, where the state is A's wealth at each step? All transitions toward state a+b occur with probability p and all transitions toward zero with probability 1-p. The probability that A wins is the probability of reaching state a+b before reaching state 0. Now, define u_i to be the probability of reaching a+b before reaching 0 when starting at state i, for each 0 <= i <= a+b, and obtain a set of linear equations in the u_i. You are interested in u_a.
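Concretely (a minimal sketch of that linear system, assuming numpy; the function and variable names are illustrative):

```python
import numpy as np

def hitting_probabilities(a, b, p):
    """Solve u_i = p*u_{i+1} + (1-p)*u_{i-1} with u_0 = 0 and u_{a+b} = 1."""
    n = a + b
    M = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    M[0, 0] = 1.0   # state 0 is absorbing: u_0 = 0
    M[n, n] = 1.0
    rhs[n] = 1.0    # state a+b is absorbing: u_{a+b} = 1
    for i in range(1, n):
        M[i, i - 1] = -(1 - p)
        M[i, i] = 1.0
        M[i, i + 1] = -p
    return np.linalg.solve(M, rhs)

u = hitting_probabilities(3, 7, 0.5)
print(u[3])  # 0.3 = a/(a+b); for other p it matches Brad's formula
```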
DAHbKA.
Danchik, how's it going? :)
Great idea! I think it can even be taken a step further. We can define the Markov chain on 1+a+b states exactly as you say, and make the states 0 and (a+b) absorbing (once you hit one of them, you stay there). This defines a homogeneous ergodic Markov chain, which has a unique stationary distribution, given by the top eigenvector of the transition kernel. The stationary distribution should only have positive values at the two extreme states 0 and (a+b) -- giving exactly the probability of ending up in one of them. I haven't worked this out (why spoil the fun for the readers) -- but does this sound right?
Sounds good!
In fact, this is a much better way to think of it. After giving a similar question as homework in Random Signals (in fact stolen from another course :)), I had some mental barrier against reducing the original problem to one with absorbing states.
For given a+b, the probability of winding up in the top absorbing state depends on the value of a, so you need more information than just the Markov transition matrix can give you. (1, 0, ... ,0) and (0, ... ,0, 1) are both eigenvectors corresponding to the unit eigenvalue. There's no information about the relative probabilities of winning there.
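This is easy to see numerically (a small sketch, assuming numpy; the chain size here is arbitrary):

```python
import numpy as np

n = 4  # states 0..4, with 0 and 4 absorbing
P = np.zeros((n + 1, n + 1))
P[0, 0] = P[n, n] = 1.0
for i in range(1, n):
    P[i, i - 1] = P[i, i + 1] = 0.5

eigenvalues = np.linalg.eigvals(P.T)
# Two eigenvalues equal 1, so the unit eigenspace is two-dimensional:
# there is no unique stationary distribution to read the answer from.
print(np.sum(np.isclose(eigenvalues, 1.0)))  # 2
```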
Take a look at: http://delong.typepad.com/sdj/2007/01/a_small_matter_.html
Well, I feel stupid. It's not an ergodic chain, since the two extreme states are absorbing. Hence, no unique stationary distribution, hence my analysis breaks down.
Thanks for the careful analysis, Brad -- looks like you've nailed this one!
One more afterthought on Brad's solution -- I'm guessing it generalizes nicely to inhomogeneous (that is, time-dependent) p. Right, Brad?
And finally, I must remark on the wonders of blogging -- how else could one get such high-quality feedback so quickly?
Yes. If the probability p is a function of t, then the value function is dependent on the time t as well as the state s and you get equations like:
V(s, t) = p(t)V(s+1, t+1) + (1-p(t))V(s-1, t+1)
...
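A rough sketch of how one might compute this numerically (backward induction truncated at a horizon T; the horizon, terminal guess, and names here are all choices of mine, not part of Brad's comment):

```python
def win_probability_inhomogeneous(a, b, p_of_t, T=10_000):
    """Backward induction on V(s,t) = p(t)V(s+1,t+1) + (1-p(t))V(s-1,t+1)."""
    n = a + b
    # Terminal guess at the horizon; as long as p(t) stays away from 0 and 1,
    # the walk is absorbed before T with probability tending to 1, so the
    # choice of terminal value washes out for large T.
    V = [s / n for s in range(n + 1)]
    for t in range(T - 1, -1, -1):
        p = p_of_t(t)
        V = [0.0] + [p * V[s + 1] + (1 - p) * V[s - 1] for s in range(1, n)] + [1.0]
    return V[a]

# A constant p recovers the time-homogeneous answer:
print(win_probability_inhomogeneous(3, 7, lambda t: 0.5))  # ~0.3
```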
Ergodic arguments indeed fail.
But what about the method with linear equations? It should work, shouldn't it?
DAHbKA.
brad said... "At the start, player A has $a dollars. Subsequent coin flips do not change the expected value of A's wealth, so A's expected wealth is $a at every point in the future. When the game ends, A has either 0 or $a+b dollars. For her expected wealth to be $a, the probability that she wins--has $a+b dollars--must be equal to a/(a+b)."
It is far from clear that this is a valid piece of reasoning. Subsequent coin flips certainly do change the expected value of A's wealth. In a game that continues forever, or stops at an arbitrary time T, A's expected wealth at any time t is whatever she has left at that point in time.
But this is not true in a game with a stopping rule such as the one specified. Consider the alternative "game" where A starts with $1 and flipping continues until she has $0, and then stops. (There is no player B.) At t=0, A's expected wealth after the first flip (at t=1) is $1. At t=0 her expected wealth after the second flip (t=2), contingent on the game having gotten that far, is $2. But that contingency occurs with probability 1/2, so we could say that her "expected wealth" at t=2 is $1.
But A will end up with $0 with probability 1. Her expected payoff is $0, yet her "expected wealth" at every point in time is $1?
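The tension can be seen in a simulation (a minimal sketch; names are illustrative): at any fixed time t the average wealth stays near $1, even though every individual run eventually ends at $0.

```python
import random

def expected_wealth(t, trials=100_000):
    """Average A's wealth at time t in the one-player game (start at $1, absorb at $0)."""
    total = 0
    for _ in range(trials):
        w = 1
        for _ in range(t):
            if w == 0:
                break
            w += 1 if random.random() < 0.5 else -1
        total += w
    return total / trials

for t in (1, 2, 10, 100):
    print(t, expected_wealth(t))  # hovers around $1 at every fixed t
```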
You might be interested in this article by Doron Zeilberger.
Excellent posts. Much better than my semi-elegant proof. One of my favorite comments is "when all you have is a hammer, eventually all problems look like nails". This problem can easily be done by proving the following chain:
(1) if a = b, then the probability is 1/2;
(2) more generally, if a+b = 2^k, then the probability of winning is a/(a+b); for example, if a=6 and b=2, then the probability of going from 6 to 8 before going from 6 to 4 is 1/2, and there is a 100% chance of winning at 8 and a 50% chance of winning at 4;
(3) if a+b = 2^k + 1 and b = 2, then the probability is a/(a+b), and this generalizes to any a with b = 2;
(4) if b = 1, the probability is a/(a+b) (a head has the first person win with probability 1, and a tail has the first person winning with probability (a-1)/(a+1) by induction);
(5) by induction one can show that if the probability of winning when a = k and b = b is k/(b+k), then the probability of winning when a = k-1 and b = b+1 is (k-1)/(b+k).
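Step (2)'s decomposition can be checked with exact rational arithmetic (a small sketch; the helper function is illustrative):

```python
from fractions import Fraction

def p_win(a, b):
    # Fair-coin answer being verified: A wins with probability a/(a+b).
    return Fraction(a, a + b)

# Step (2)'s example, checked exactly: from $6 with boundaries $4 and $8,
# the interval is symmetric, so A hits 8 before 4 with probability 1/2;
# winning is certain at 8 and has probability p_win(4, 4) = 1/2 at 4.
p_up = Fraction(1, 2)
assert p_up * 1 + (1 - p_up) * p_win(4, 4) == p_win(6, 2)
print(p_win(6, 2))  # 3/4
```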
Nice!
Thanks