## Wednesday, February 7, 2007

### Sampling Lipschitz functions

Fix a positive integer n and let F_n be the set of all functions
f:{0,1}^n -> [0,n]
that are 1-Lipschitz with respect to the Hamming metric. As a reminder, for two sequences x and y in {0,1}^n, their Hamming distance d(x,y) simply counts the number of indices in which they differ. The Lipschitz condition just means that
|f(x) - f(y)| <= d(x,y)
for all x and y in {0,1}^n. It's obvious that F_n is a compact, convex polytope in a finite dimensional space (that space being R^(2^n) ). How would you sample a function uniformly at random from F_n?

I have a method that I'm pretty sure is correct, but I'd love to generate some discussion...

Leo said...

I probably spoke too quickly. If I'm lucky, my method samples a function uniformly from the extreme points of F_n, and I don't see an easy way of adapting it to a sampler from F_n...

Anonymous said...

Markov Chain Monte Carlo?

DAHbKA.

Leo said...

Well, having thought about this a bit more, I'm starting to think maybe this is a hard problem. We know that any compact convex polytope V in R^n is the convex hull of its (finitely many) extreme points; furthermore, any point in V can be expressed as a convex combination of some (n+1) extreme points [can't find a reference for the latter -- is this Minkowski's theorem?].

This gives a natural way of sampling points in polytopes -- IF you can list all the extreme points. But what if you can't? How would you set up an MCMC sampler?

As an aside, we'd like the sampling algorithm to be polynomial in the polytope's dimension. In the case of my F_n example, that would be 2^n, which is ok. Any ideas?

Maxim said...

[can't find a reference for the latter -- is this Minkowski's theorem?]

This is Caratheodory's theorem.

Leo said...

Thanks, Maxim, indeed it: is. Any thoughts on the sampling question?

Anonymous said...

"But what if you can't? How would you set up an MCMC sampler?"

What you say is that MCMC suits when the state space is countable and it is not the case here (for unbounded n)?

That's funny, since MCMC may be easily untilized for sampling from bounded continuous domains and from countable ones, but apparently not from intermediate cases, unless I'm missing something.
Hints? I have no idea how to use the Lipshits condition.
DAHbKA.

Leo said...

No no, I know one can do MCMC on uncountable state spaces. My point was this: I can efficiently sample from a convex body if I know all of its extreme points. But what if I don't?

Here is a simpler question. Let B_n be the Euclidean unit ball in R^n. How do you sample uniformly from B_n? Ok, here's an obvious way: generate points in [-1,1]^n and reject them if they fall outside of B_n (verifying the latter is trivial). But there's a problem with that approach: the volume of B_n becomes exponentially small compared to the volume of [-1,1]^n, so for large n, you have to wait a very long time to get a point in B_n. Is there an efficient sampler for B_n? How would one set up an MCMC to do this? I would love for someone to educate me on this, as I'm pretty clueless...

Daniel said...

I've written a long comment and lost it. Anyway, I think I spoke some bullshit in the previous comment, but now I think I know how to solve the Bn problem.
Your first solution is Crude Monte Carlo and it is clearly unefficient since Bn is a rare event and it will take an exponentially long time to sample from it. The MCMC, IMHO, is very similar to that constructed for Knapsack problem in Jerrums work
The state space is all the vectors in Bn. The transitions are as follows:
1. Let X be the current state.
2. Choose u.a.t i and flip i-th bit of X. Let Y denote the resulting vector.
3. It Y is in Bn - accept it as the next state. Reject otherwise.

The chain is ergodic and therefore converges to its statiionary distribution which I think is uniform. (At lest in Knapsack it is).
The problem is bounding the mixing time till convergence. In the case of Knapsack it can be shown that the time is polynomian in n (Prop. 12.1).
P.S I have some simulation and sampling related problems in my thesis, which can be discussed here if you don't mind.
DAHbKA.

Leo said...

I don't mind at all -- I'd love to learn more about how MCMC algorithms work in practice (as opposed to just that "they do" in theory)! Please post!

Daniel said...

Great! Here it comes.
The Cross Entropy Method is a recent heuristic for solving hard combinatorial optimization problems (such as traveling salesman, MaxCut etc.) and estimation of rare events probabilities. Without going into details it operates as follows:
Let S(w) be some function defined over a large (discrete) set W. The goal is to find w\in W that maximizes S. Let f(w;v) be some parametric density defined on W, where v is the vector of parameters and v is the mean of the density. The goal is to find an optimal (in some sence) vector v. The algorithm proceeds as follows:
1. Let v0 be some initial vector of parameters. We sample w_1,w_2,...,w_N from f(w,v0) and evaluate there performances
S_(w_1),...,S(w_N). Let w^1,..,w^k denote the set of the elite (better valued) samples.
2. Update the parameter v as the maximum likelihood estimator based on the elite samples w^1,...,w^k. Namely,
p_new = \sum_{i=1}^k {w^i}/k
3. Repeat unless some stopping criterion is met.

I'd like to investigate the convergence properties of the method (if such properites exist). As for the beginning, if we define the process X_n such that
X_n ~ p(*|v_n) can we say X_n is a submartingale (or supermartingale if dealing with a minimization problem)?

Leo said...

Даньчик, I didn't quite follow your MCMC algorithm for sampling from B_n. Can you explain it in more detail? Many thanks!! -L

Daniel said...

Oops! Are we talking about Bn a Hamming ball or a general ball in n-dimensional real space? That is, what is an element in Bn? Is it a binary vector of length n or a vector of reals between -1 and 1?

Leo said...

I was talking about the Euclidean ball -- B_n is the set of all points in R^n whose L2 norm is at most 1.

[Sampling from the Hamming unit ball is easy indeed :)
]

Daniel said...

In this case the Hit and Run sampler may be useful. Some details here
(slide 16)

Anonymous said...

The best technique I'm presently aware of for sampling from the interior of a convex body is described in this very readable paper:

http://research.microsoft.com/users/lovasz/vol4-focs-tr.pdf

--Carl Feynman

Leo said...

Many thanks to Carl Feynman (any relation to Richard Feynman?) -- I'm reading the paper by Lovasz and Vempala and it's quite illuminating.

Anonymous said...

Here's how to sample from the unit L2 ball in R^n. The distance of the point from the origin and the direction of the point from the origin are independent, so we can generate them independently. The radius can be sampled by choosing uniformly from the interval [0, 1] and raising to the nth power. The direction can be generated by constructing a vector whose elements are drawn i.i.d from a standard normal distribution. Rescale the resulting vector to have the desired radius and you're done.

--Carl Feynman

Anonymous said...

Richard Feynman is my father.

--Carl Feynman

Leo said...

I'm humbled by the simplicity of that L2 ball sampling algorithm. It's immediately generalizable to sampling from ellipsoids -- just set the covariance of the Gaussian to the quadratic form defining the ellipsoid, right?