This is a follow-up to my earlier post on the total variation distance. As I already mentioned, my visits to Stanford and Berkeley were immensely useful, not least because of the opportunity to meet experts in a field to which I aspire to contribute. In that earlier post, I gave some characterizations and properties of TV, fully aware that these were unlikely to be original observations. Sure enough, the relation

has been known for quite some time; see the book-in-progress by Aldous and Fill, or the one by Pollard.
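The relation

$$\|\mu_1\otimes\mu_2 - \nu_1\otimes\nu_2\|_{TV} \le \|\mu_1-\nu_1\|_{TV} + \|\mu_2-\nu_2\|_{TV} - \|\mu_1-\nu_1\|_{TV}\,\|\mu_2-\nu_2\|_{TV}$$

also seems to be folklore knowledge; I have not seen a proof anywhere and give a simple (non-probabilistic) one here (Lemma 2.6). Amir Dembo suggested that I re-derive this in a probability-theoretic way, via coupling. Here it is.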
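Recall that if $\mu$ and $\nu$ are probability measures on $\Omega$, then

$$\|\mu-\nu\|_{TV} = \inf \mathbf{P}(X \neq Y),$$

where the infimum is taken over all distributions on $\Omega\times\Omega$ having marginals $\mu$ and $\nu$, respectively, and the random variables $X$ and $Y$ are distributed according to $\mu$ and $\nu$. Any such joint measure on $\Omega\times\Omega$ is called a coupling, and one achieving the infimum is called a maximal coupling.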
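For distributions on a finite set, a maximal coupling can be written down explicitly, which makes the characterization above easy to check by hand or by machine. The Python sketch below is only an illustration: the function names `tv` and `maximal_coupling`, the example distributions, and the particular construction (shared mass on the diagonal, leftover mass spread as a product) are assumptions of this example and do not come from the earlier post. Exact rational arithmetic is used so that equalities can be tested without rounding.

```python
from fractions import Fraction as F


def tv(p, q):
    """Total variation distance between two pmfs on {0, ..., n-1} (values in [0, 1])."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2


def maximal_coupling(p, q):
    """Return a joint pmf (n x n matrix) with marginals p and q
    such that P(X != Y) equals tv(p, q)."""
    n = len(p)
    overlap = [min(a, b) for a, b in zip(p, q)]  # mass the two pmfs share
    d = 1 - sum(overlap)                         # leftover mass, equal to tv(p, q)
    joint = [[F(0)] * n for _ in range(n)]
    for i in range(n):
        joint[i][i] = overlap[i]                 # put shared mass on the diagonal
    if d > 0:
        # Spread the excess of p over the excess of q. The two excesses have
        # disjoint supports, so this adds nothing to the diagonal.
        for i in range(n):
            for j in range(n):
                joint[i][j] += (p[i] - overlap[i]) * (q[j] - overlap[j]) / d
    return joint


# Hypothetical example distributions on a three-point space.
p = [F(1, 2), F(1, 3), F(1, 6)]
q = [F(1, 4), F(1, 4), F(1, 2)]
joint = maximal_coupling(p, q)
prob_differ = 1 - sum(joint[i][i] for i in range(len(p)))

print(tv(p, q))      # 1/3
print(prob_differ)   # 1/3
```

The rows of `joint` sum to `p` and the columns sum to `q`, so it is a coupling, and the probability that the two coordinates differ matches the total variation distance, so it attains the infimum.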
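Applying this to our situation, let us define the random variables $X_1\sim\mu_1$, $X_2\sim\mu_2$, $Y_1\sim\nu_1$, $Y_2\sim\nu_2$. Let $\pi_1$ be a maximal coupling of $\mu_1$ and $\nu_1$, and define similarly $\pi_2$ for $\mu_2$ and $\nu_2$. Notice that $\pi_1\otimes\pi_2$ is a (not necessarily maximal) coupling of $\mu_1\otimes\mu_2$ and $\nu_1\otimes\nu_2$. Then

$$\begin{aligned}
\|\mu_1\otimes\mu_2 - \nu_1\otimes\nu_2\|_{TV}
&\le \mathbf{P}\big((X_1,X_2)\neq(Y_1,Y_2)\big) \\
&= 1 - \mathbf{P}(X_1=Y_1)\,\mathbf{P}(X_2=Y_2) \\
&= 1 - \big(1-\|\mu_1-\nu_1\|_{TV}\big)\big(1-\|\mu_2-\nu_2\|_{TV}\big) \\
&= \|\mu_1-\nu_1\|_{TV} + \|\mu_2-\nu_2\|_{TV} - \|\mu_1-\nu_1\|_{TV}\,\|\mu_2-\nu_2\|_{TV}.
\end{aligned}$$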
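The coupling argument can be sanity-checked numerically on small finite spaces. The following Python sketch is my own illustration under stated assumptions: the helper names (`tv`, `product_measure`, `maximal_coupling`, `mismatch_probability_of_product`) and the four example distributions are hypothetical, and the two maximal couplings are combined independently. It compares the total variation distance between the product measures with the mismatch probability of the product coupling.

```python
from fractions import Fraction as F
from itertools import product


def tv(p, q):
    """Total variation distance between pmfs given as dicts over the same finite set."""
    return sum(abs(p[x] - q[x]) for x in p) / 2


def product_measure(p1, p2):
    """Product pmf on pairs (a, b)."""
    return {(a, b): p1[a] * p2[b] for a, b in product(p1, p2)}


def maximal_coupling(p, q):
    """Joint pmf on pairs (x, y) with marginals p, q and P(X != Y) = tv(p, q)."""
    m = {x: min(p[x], q[x]) for x in p}          # shared mass
    d = 1 - sum(m.values())                      # leftover mass, equal to tv(p, q)
    joint = {(x, y): F(0) for x in p for y in p}
    for x in p:
        joint[(x, x)] += m[x]
    if d > 0:
        for x in p:
            for y in p:
                joint[(x, y)] += (p[x] - m[x]) * (q[y] - m[y]) / d
    return joint


def mismatch_probability_of_product(pi1, pi2):
    """P((X1, X2) != (Y1, Y2)) when (X1, Y1) ~ pi1 and (X2, Y2) ~ pi2 are independent."""
    agree = sum(pi1[(x1, y1)] * pi2[(x2, y2)]
                for (x1, y1) in pi1 for (x2, y2) in pi2
                if x1 == y1 and x2 == y2)
    return 1 - agree


# Hypothetical example distributions on two-point spaces.
mu1, nu1 = {0: F(1, 2), 1: F(1, 2)}, {0: F(1, 4), 1: F(3, 4)}
mu2, nu2 = {0: F(1, 3), 1: F(2, 3)}, {0: F(2, 3), 1: F(1, 3)}

a, b = tv(mu1, nu1), tv(mu2, nu2)
lhs = tv(product_measure(mu1, mu2), product_measure(nu1, nu2))
bound = a + b - a * b
coupled = mismatch_probability_of_product(maximal_coupling(mu1, nu1),
                                          maximal_coupling(mu2, nu2))

print(lhs, bound, coupled)  # 1/3 1/2 1/2
assert lhs <= coupled == bound
```

In this example the inequality is strict, which is consistent with the product of two maximal couplings not necessarily being maximal.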