be a finite set and consider two probability measures (distributions) on
,
and
. One can define many notions of "distance" between
and
, but a particularly important one (at least in my work) is the variational distance. By definition,
-- that is, we maximize over all subsets of
the difference of the measures assigned by the two distributions. It is a well-known fact, routinely left as an exercize for the reader, that
where
is just the
norm of the difference
, viewed as a vector in
.It is also well-known that
,where the infimum is taken over all the distributions on
, having marginals
and
, respectively, and the random variables are
and
are distributed
and
.Let us define the (unnormalized) measure
as the pointwise minimum of
and
. A somewhat lesser-known fact is that
(I have not seen this mentioned anywhere, but can't imagine that I'm the first one observing this). It easily follows that if
and
are minorized by some probability measure
-- i.e., there is an
for which
holds pointwise, we have
.Finally, here is a relation that has a chance of being novel. Consider two finite sets
, with probability measures
on
and
on
. We will write
for the product measure on
(and similarly for
). Then
.To the best of my knowledge, this "tensorization" result was first proved here, but I'd be very grateful if anyone would bring an earlier reference to my attention.
None of these are difficult to prove, but if pressed, how would you do it? I have a simple technique, based on a linear programming principle, that yields all of these pretty much effortlessly (see Lemma 2.6 in the linked paper for the idea). Any other simple proofs out there?
No comments:
Post a Comment