Sunday, May 17, 2009

concentration of exchangeable processes

Having thoroughly embarrassed myself in the previous post, let me shoot for something half-way redeeming.

Let X_1, X_2, ... be a sequence of {0,1}-valued random variables. We say that the process {X_i} is exchangeable if every finite-dimensional joint distribution is invariant under the permutation of indices. The de Finetti theorem characterizes the exchangeable processes as mixtures (possibly uncountable) of Bernoulli processes. Note that this is only true for infinite sequences -- there are well-known examples of finite exchangeable sequences that cannot be extended even by 1 (see Diaconis for a fascinating discussion).

Now exchangeable processes are in general not concentrated. Indeed, the measure P on {0,1}^n with P(00...0)=P(11...1)=1/2 (i.e., a mixture of two degenerate Bernoulli processes) is exchangeable but not concentrated.

However, I claim the following:

1. if P is a mixture of Bernoulli processes whose biases are bounded within
0<=a<=b<=1, and f:{0,1}^n -> R is 1-Lipschitz w.r.t. the normalized Hamming metric,
we have
P(f - Ef > t) <= exp(-2n(t-d)^2) where t < d =b-a and the bias of a Bernoulli process is P(X=1). Note that this recovers the ordinary McDiarmid bound, for which b=a, trivially.

2. if P only assigns positive probability to sequences with at most k ones,
we have
P(f - Ef > t) <= exp(-2n(t/2k)^2) ) for any f:{0,1}^n -> R that is 1-Lipschitz w.r.t. the normalized Hamming metric.

The proofs are standard and apparently not worthy of writing up -- but let me know if you try it and get stuck. Have these appeared in literature?

Monday, May 4, 2009

Tighter Hamming ball volume?

Let C(n,k) be the binomial coefficient "n choose k". We want to bound the sum of these, with k ranging from 0 to d. This is equivalent to bounding the volume of the n-dimensional Hamming ball of radius d. Denote this number by B(n,d).

A well-known bound (appearing, for example, in the Sauer-Shelah-VC lemma) is

B(n,d) <= (en/d)^d .

We are pretty sure that it can be sharpened, for example by writing a (1/2) in front of the RHS. (We actually think it can be sharpened by better than a constant.)

Thus has to be known. Anyone have a reference? Thanks in advance!