Saturday, January 27, 2007

A neat problem

communicated to me by co-blogger Steve Miller, which he's too busy to post. Two players, A and B, start out with $a and $b dollars, respectively -- where a and b are natural numbers. They flip a fair coin, and every time it comes up heads, A gives 1 dollar to B; for every tails, B gives 1 dollar to A. The game ends when one of the players has zero dollars (and the other one has a+b). What's the probability that player A wins?

I was able to guess the answer right away (Steve told this to me on the phone), but this is probably more luck than anything else, as these intuitions can often be misleading. In any case, an answer is worthless without a proof, which Steve tells me is not entirely trivial. Furthermore, we don't know what happens if the coin, instead of being fair, has bias p. Anyone out there know?

A harder analysis exercise

... but absolutely essential for anyone wishing to deepen his understanding of the real line:

Let C be the standard Cantor set, a subset of the interval [0,1]. One can easily verify that this set is
  1. uncountable
  2. closed [i.e., all Cauchy sequences in C converge to points in C]
  3. totally disconnected [i.e., contains no contiguous interval (a,b) ]
  4. has Lebesgue measure 0 [i.e., its "total length" adds up to 0]

If these aren't obvious, it's definitely worthwhile to verify them; it's not hard. Note that for some time mathematicians were wondering if any set with properties (1) and (4) even exists.

Put E = [0,1] \ C; that is, E is the complement of C in the interval [0,1]. Then E is an open set in [0,1]. Now any open subset of the real line is a countable union of disjoint open segments; again, if this is news to you, do take the time to convince yourself. [A glossary of topological terms may be found here, but if you're seeing these terms for the first time, it might be too early to attempt this problem.]

Thus one might reason as follows: "Every open segment comprising E corresponds to two points in C. But the segments of E are a countable collection, while the points of C are uncountable - we have a contradiction." Resolve the apparent contradiction. You will be enlightened, or your money back.

Solutions/discussion welcome in comments.

Wednesday, January 24, 2007

Maybe I should've

done this for my postdoc search?

[thanks for the note, Steve!]

An easy analysis exercise (+philosophy)

I'm (unfortunately) not teaching a class on analysis, but if I were, I'd assign problems like this (as a warm-up, of course :)

Construct a family of measurable functions whose pointwise supremum is nonmeasurable. [Answers + discussion welcome in comments.]

This very simple exercise illustrates one of the key intuitions about measurability: the only way to obtain non-measurable objects from measurable ones is by some "uncountable process" -- uncountable unions, uncountable suprema, etc. Of course, the very construction of non-measurable sets depends on the axiom of choice, which is equivalent to the well-ordering principle, which manifests itself as Hausdorff's Maximality Principle or Zorn's Lemma; these all enable (uncountably) transfinite induction.

This post was prompted by my reading up on empirical process theory (just when I thought I was starting to get a solid grasp of it, I've discovered whole new oceans) -- mainly stuff by David Pollard and Michel Talagrand. Since uniform convergence involves taking suprema over function families, great care must be taken to ensure measurability (or outer measures must be used, but these create problems of their own). I hope to post on these topics in greater depth when I have time.

Saturday, January 20, 2007

A FLAC exercise

This one will almost certainly be assigned. Please do not post answers here.

For any finite state automaton A, deterministic or not, size(A) will denote its number of states.

Construct a family (sequence) of NFA's, {N_k}, such that size(N_k) is growing polynomially in k, while size(D_k) is growing exponentially in k, where D_k is the minimal deterministic automaton equivalent to N_k.

Hint: you may use the fact that the sum of the first k primes has polynomial growth, while the product of the first k primes has exponential growth.

Monday, January 15, 2007

Transgressing the Boundaries: Toward a Non-Hegemonic Feminist Mathematical Theory

The title is an homage to Alan Sokal's witty demonstration of the vapidity of the whole postmodernism industry. I've always considered myself rather fortunate, as a mathematician, to engage in research safely outside the reach of the PC commissars. Yet every now and again the hydra thrusts one of its many heads into our pristine art; this time, under the heading of "social justice". I am not going to attempt to define justice here (I am not even sure that I can at all), but if there's one thing I'm sure of, it's that mathematics has nothing to do whatsoever with justice, or for that matter, with any aspect of the physical world. Sure -- physicists and engineers have used mathematical tools with great success to build scientific theories and neat gadgets. But math lives in its own separate Platonic world and we mere mortals can only hope for an occasional peek inside (much as no one "invented" electricity, there are no mathematical "inventions", only discoveries).

Which brings us to this pearl of wisdom (via Alexandre Borovik):

The content of mathematics is typically thought of as neutral. That is, to most, mathematics is considered a domain that is devoid of ethical-moral implications. One can use mathematics for whatever purposes one wishes, but the mathematics itself is not good or bad, it just is. If I am right and mathematics classrooms frequently teach powerlessness, then the notion that mathematics is essentially neutral needs to be revisited.

This reminds me of a Russian joke. A teacher in elementary school is explaining long division on the board. Suddenly, a KGB officer walks into the classroom and listens in for a while. During break, he walks over and exchanges a few quiet words with the teacher. When class resumes, the teacher announces, "Class, we've received directives from above: from now on, division will be done like this..."

I've been maintaining for years -- I believe it's my original observation -- that some of humanity's greatest atrocities were committed when people mixed the rational and the ethical spheres of reasoning. The rational/scientific sphere deals with measurable quantities and exact facts. It is a tool, completely devoid of ethics and morality: it can help you cure cancer or build a bomb -- your choice. The ethical/moral/religious sphere deals with questions such as, "What is the right thing to do?" When religion encroaches on matters of science, we get the Inquisition. When rationality overrides morality, we get Nazi experiments on humans.

It is a common philosophical mistake to endow inanimate objects with moral characteristics. A gun is not inherently good or evil -- it can be used to commit murder or save lives. It's a tool, like a hammer or a ruler. Only conscious sentient beings can be judged on a moral scale and labeled as good or evil.

The author of that piece -- Kurt Stemhagen -- makes two fundamental mistakes. First, he conflates mathematical constructions with their real-world applications. Second, he attempts to impose a moral rubric on an inherently a-moral realm of reasoning. (Teichmüller was a Nazi mathematician who actively persecuted Jews. Should we boycott the study of Teichmüller spaces?)

Friday, January 12, 2007

Visualizing Measure Concentration

In the course of giving my talk at Weizmann, I made the usual point about trying to visualize measure concentration:

A word of caution for those attempting to visualize "measure concentration": imagining a narrowly peaked, one-dimensional distribution is tempting but wrong. This is a genuinely high-dimensional phenomenon.

I gave the example of the uniform distribution on the binary hypercube, {0,1}^n, which certainly exhibits the concentration phenomenon (via Chernoff bounds, for example) -- yet it's as flat as they come; there are no peaks here! Someone from the audience pointed out that if one looks at the distribution of a Lipschitz random variable f:{0,1}^n -> R, then it will indeed be peaked. This threw me off a bit, as it was correct and seemed to trivialize my point, which I knew had some non-trivial content.

Having thought about this off-line I've recovered that non-trivial point. For any measure on {0,1}^n there is a function f:{0,1}^n -> R whose induced random variable will be very peaked. It's also not hard to concoct very clumpy distributions which will induce peaked random variables for a wide class of functions f. The remarkable thing is that uniform distributions produce peaked Lipschitz random variables. Conversely, if one allows badly non-mixing measures on {0,1}^n, it's easy to construct non-peaked Lipschitz rv's. The philosophical take-home message is that while it's trivial to create "orderly" random variable in specific instances, it's quite remarkable to observe that a rich collection of random variables will acually be quite well-behaved under rather general conditions.

Sunday, January 7, 2007

A FLAC exercise

This exercise may or may not be assigned in the course I'm TAing this spring, but it's a good one to work out. It definitely had me confused at first, and I was in good company.

A teacher gives you a bunch of strings over some finite alphabet and labels each string as "positive" or "negative"; we'll refer to these labeled strings as the sample S. A deterministic finite-state automaton (DFA) is consistent with the sample S if it accepts the positive strings in S but not the negative ones. Let A0 be any DFA that is consistent with S and let A1 be the automaton obtained by minimizing A0 (by this point in the course you'll have learned efficient minimization procedures). Is A1 the smallest DFA that's consistent with S?

Please don't post the answer in the comments!

NLP revisited

I almost certainly didn't do the field of Natural Language Processing justice. Fortunately, I've discovered some language-related blogs. The first two have more of a computational flavor. The last one, of a more philological bend -- called "Language Log" -- is now a permanent fixture in my "Other Interesting Stuff" section.

Wednesday, January 3, 2007

Miscellanea

If you're trying to physically locate me over the next two weeks, try room #3 in Ziskind building of the Weizmann Institute, in Rehovot, Israel -- where I'm being very kindly hosted by Gideon Schechtman. Blogging will be light due to the tight schedule (giving 3 talks: Jeruslem, Weizmann, Technion), but I'll try to report on interesting seminars that I heard (like the one today).

A quick note about the comments: they're unmoderated and no registration is required; I'm hoping that no moderation on my part will ever be necessary. It's fine occasionally to lapse into politics, but I was hoping to see more math discussions!.. Finally -- I am plain curious as to who's reading this barely-a-week-old blog and leaving comments. If you'd leave a name when commenting I'd be much obliged!

Monday, January 1, 2007

On Dictionaries and Language

This is prompted by the Weekly Standard piece on Samuel Johnson. The author, Jack Lynch, credits Johnson (among other things) for taking the trouble to define such "obvious words" as take and get -- which lexicographers before him had neglected. The segue into philosophy of language is inevitable: do words have "primitive", "irreducible" definitions? Or are dictionary entries necessarily circular, since we can only define words in terms of other words? Since much ink has already been spilled on these matters (cf. Wittgenstein, Chomsky and David Lewis), let us frame the question more operationally. Can a computer program, in principle, reason about the world (on par with humans) given only text as input and output (for learning)? Or must it necessarily have sensory perception in order to become sentient?

My training at Bell Labs, where I was most heavily influenced by Daniel Lee, leads me to conjecture that a large enough corpus of text, with sufficiently clever statistical processing, is enough to train a machine that would pass the Turing test. Supporting evidence: congenitally blind people are able to reason about colors perfectly well, without having any qualitative experience of the phenomenon. The linguist J.R. Firth would seem to agree with the possibility of "stand-alone" semantics: "You shall know a word by the company it keeps".

Taking five minutes to ponder the mysteries of language reminds me why I left Natural Language Processing, after a few brief forays. The problem is just too hard, and our current tools too primitive. I realized that until we develop more powerful mathematical tools, NLP research will be plagued by the sad fact that ad-hoc heuristic hacks tend to outperform elegant, clean, principled models. Of course, I am quite out of it as far as recent developments -- and would be happy if someone would set me straight. I know that elegant, principled models exist for document classification and text translation. I also know that if the state of the art is to be judged by Google's automatic translator, then there is, ahem, much room for improvement.

I prefer to be a producer of formal theorems and a consumer of NLP products (and judging by my list of rejected NLP paper submissions, my preference is in line with that of the community). Of course, no one can stop me from dabbling in language as a hobby, which I regularly do. Want the etymology of an obscure (but known!) Indo-European or Semitic root? Want a tip picking out a good dictionary? You've come to the right place. Regarding the latter: I can size up a dictionary in a matter of minutes, and my intuition has yet to mislead me. Always look up slang and, yes, vulgarities -- any lexicographer who pretends that certain words don't exist isn't worthy of the title. (Russian joke: "Мама, что такое жопа?" -- "Такого слова нет, сынок." -- "Странно -- жопа есть, а слова нет?..") Our final conclusion is that Russians have a joke for most occasions.