## Sunday, December 30, 2007

Here's a delicious paradox, due to Sylvain Cappell. Consider the class of the computable real numbers -- i.e., those reals for which there is a Turing machine that is guaranteed to produce the nth decimal digit in finite time. This is a well-defined and countable subset of the reals; let's denote it by T.

Being countable, the elements of T may be enumerated and arranged in an array of digits where the (i,j)th entry is the jth digit in the expansion of the ith element of T. You can probably guess what's coming next -- we're going to apply Cantor diagonalization to this list and obtain a new real number r, not in T.

It seems we have a nasty paradox on our hands. On the one hand, Cantor diagonalization is a well-defined, constructive, algorithmic process -- so constructing r out of T should be a cinch. On the other hand, by assumption, T is an exhaustive list of all the computable numbers -- so r should not be computable. So which is it -- is r computable or not?
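For concreteness, here's the diagonal step in code -- a sketch only, since the whole difficulty hides in the assumption that someone hands us the enumeration of T as a list of digit-producing programs:

```python
# A sketch of the diagonal construction. The catch: it assumes we are
# handed `digits`, a list of functions where digits[i](j) returns the
# jth decimal digit of the ith computable number. Producing that list
# is exactly where computability breaks down -- this code does not
# resolve the paradox, only restates it.

def diagonal_digit(digits, n):
    """nth digit of r: change the nth digit of the nth number."""
    d = digits[n](n)
    return (d + 1) % 10  # any digit != d works (ignoring the 0.999... = 1 subtlety)

# Toy "enumeration" of three computable numbers:
# 0.000..., 0.111..., 0.123123...
toy = [lambda j: 0, lambda j: 1, lambda j: [1, 2, 3][j % 3]]
r = [diagonal_digit(toy, n) for n in range(3)]
print(r)  # → [1, 2, 4]
```

Nothing in this snippet is problematic; the paradox lives entirely in whether `toy` could be replaced by a genuine enumeration of all of T.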

Discussion is welcome in the comments. I'll close by saying that I would've loved to use this question on my automata theory undergrads.

## Thursday, November 22, 2007

### I suppose I'm flattered

that a professional philosopher would quote an entry from this blog and deem it worthy of rebuttal. I'm too busy/tired to read the piece, much less comment on it. A cursory scan, however, suggests that

1. I stand by my original claims
2. Stemhagen is kinder (i.e., less nasty) to me than I was to him

This blog has been kind of on hold for a while. I know I promised people research tips, but I'm stuck on a problem and feel myself unworthy of dispensing research advice. Especially with people like Tao in the blogosphere (he has plenty of high quality advice for mathematicians at all stages of their careers).

So... if anybody is itching to re-start that math-education-ethics-platonism discussion, now's your chance. I'll check back in a day or so.

Update: A co-blogger had a stronger stomach than me, and read enough to find the following nugget:
"Some argue that mathematics class ought to more explicitly consider those on the socio-economic margins and our social system's role in this marginalization."
I'm suddenly reminded why I was nasty to begin with.

## Wednesday, November 7, 2007

### Misc.rand.various.sundry

This is cool.

This is not.

As for my last two papers (uploaded last night) -- you decide.

## Wednesday, October 24, 2007

I seem to have resolved, affirmatively, an open question I'd posed in Sec. 6.5.3 of my thesis -- namely, whether for (just about) any mixing matrix there's a measure achieving those mixing coefficients. I haven't written up a proof yet, but looks like it'll make a cute little result. Here's a very short writeup (no proof); let me know if you want to hear more.

Got an interesting conjecture regarding mixing coefficients of random fields. I can see this one having far-reaching implications but unfortunately see no proof...

[Update: that random field mixing conjecture is wrong. Maximal graph degree is too crude a measure of connectivity -- seems like some sort of path counting is necessary...]

## Sunday, October 21, 2007

### Bemused metablogging

So Scott calls Al Gore "the Churchill of climate change" -- this despite the fact that Gore's science is so shoddy that a British High Court ruled that his film can't be shown to schoolchildren. I would love for this to be a bad case of a malfunctioning irony meter (mine), but I fear the worst. (Was Churchill also a shameless hypocrite?)

Then sneaky old Aaronson goes ahead and links to something like this, thus ensuring that readers like me will continue to return and wade through the standard liberal tripe for the occasional (well, ok -- frequent) gem.

[Update: Seems I've been sloppy, both in facts and in rhetoric. The above was meant as a friendly jab (didja notice the not-so-veiled compliment?) though I can see how it might be taken for an ad hominem attack. Also, as Kenny and others point out, my claim about the film not being allowed to be shown isn't accurate. See here for more information.]

## Saturday, October 13, 2007

### How to Win a Nobel Peace Prize

Glad to see the prize regain its old prestige. Can't we separate the prizes in category #1 from #2 and #3?

Update: C'mon, people. This is an article bashing Al Gore. And the whole Peace Prize sham. Isn't anyone going to get pissed off enough to react? Or did I scare all those types away long ago?

## Wednesday, October 10, 2007

### More math psychology

Did I really let almost a month elapse without posting? Must be because my life is so wild and exciting. Um, yeah.

Here's a question that's been occupying me for a while -- non-mathematicians' perception of probability. What does a man on the street mean when he says a coin has a 50% chance of landing on heads?

The most likely interpretation is frequentist: if you flip the coin a whole lot of times, you'll see heads about half the time, on average. (How many times is a lot? What does on average mean here? Read my thesis -- at least the section on the Law of Large Numbers.)
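If you want to see the frequentist reading in action, a few lines of simulation suffice (my own toy sketch, nothing from the thesis):

```python
import random

random.seed(0)  # for reproducibility

def empirical_heads(n, p=0.5):
    """Fraction of heads observed in n independent flips of a p-coin."""
    return sum(random.random() < p for _ in range(n)) / n

# The Law of Large Numbers in action: the empirical frequency of heads
# settles near 0.5 as the number of flips grows.
for n in (10, 1000, 100000):
    print(n, empirical_heads(n))
```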

But what about one-time events? What does it mean that there's a 30% chance of rain tomorrow? (Tomorrow will only happen once, so any talk of averages is meaningless.) Heck, what does the meteorologist mean by that probability?

Cosma helped me resolve the latter quite satisfactorily in private correspondence (so satisfactorily, in fact, that I feel dumb ever having asked the question). But I turn to the readers:

1. can you make rigorous mathematical sense out of the meteorologist's 30% chance of rain prediction?

2. can you ask your non-mathematician friends what that prediction means to them?

## Tuesday, September 11, 2007

### Is There Anything Good About Men?

I know I'm behind on posting about the things I promised I'd post about; rest assured that I'm behind on work stuff as well.

So in the meantime, read this fascinating piece by Roy F. Baumeister. There's too much incisive analysis in there to give a brief sound bite; you'll just have to read the whole thing. Discussion welcome in the comments.

I like to tell myself that if people didn't send me such pointers I'd be 50% more productive...

## Saturday, September 1, 2007

### Psychology of Mathematical Reasoning

Recently, I've found myself needing to explain what it is that mathematicians do. Sometimes I say, "we add really big numbers". You'd think people would laugh (or at least give an incredulous look) -- but how many times have you heard a layman casually comment how "math people" deal with "numbers"?

Actually, number theory is a nice vehicle for giving layfolk a taste of what math is about. Everybody knows about naturals and primes (and if they don't, and you're a mathematician who's been put on the spot, it's something you can explain in under a minute). So I tell people, look: there are obviously infinitely many naturals (for any number there's always a bigger one) and there are also infinitely many primes -- but this latter fact is less obvious and requires proof.

Here is where I've run into unexpected trouble. People have no problem accepting the infinitude of the naturals; what they have trouble appreciating is that it's not obvious that the primes are infinite. "C'mon -- there are infinitely many numbers, so of course there are infinitely many primes!" I've heard this response from more than one person. "Now wait a minute," I protest, "the primes are a subset of the naturals, so a priori they have every right to be a smaller -- even finite -- subset." "OK, gimme an example of a finite subset of the naturals." I'm happy to provide the example {1,2,3,4,5}. "Yeah, but you've constructed it as a finite set, so it doesn't count" is the sort of reply I get.

What seems to be happening is that to an untrained intuition, any subset of the naturals defined by a property without explicit bounds appears to be obviously infinite. Has anyone else encountered this phenomenon? Alexandre? Can my mathematician readers try this out on some non-math friends (no need to obtain signed consent forms) and let me know what you find?
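Part of what makes the primes' infinitude non-obvious is that they really do thin out, even though they never stop. A quick count (my own throwaway sketch) makes the point:

```python
def sieve(n):
    """Primes up to n via the sieve of Eratosthenes."""
    is_prime = [False, False] + [True] * (n - 1)
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
    return [k for k, b in enumerate(is_prime) if b]

# The primes keep appearing, but their density drops -- which is
# exactly why their infinitude deserves a proof rather than a shrug.
for n in (100, 10000, 1000000):
    primes = sieve(n)
    print(n, len(primes), len(primes) / n)
```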

Finally, does anyone have a simple example of a "nontrivially finite" subset of the naturals? That is, a set defined by a (simple!) property P that makes no reference to explicit bounds, yet P is provably finite?

## Friday, August 24, 2007

### Extreme-point fallacy

First, some administrative notes. Thanks to the readers for the feedback. I got requests for research advice and explanations of techniques from my thesis; I promise to post on both topics shortly.

Do take a look at Daniel's problem in this comment thread. It combines the features of being theoretically interesting and challenging while also being of great practical importance. If anyone has thoughts or literature pointers, please do share them with me or Daniel!

And here's a little teaser to see who's awake. Let D be a subset of R^n of the nicest possible kind: a finitely generated compact convex polytope. Thus, D is nothing more than the convex hull of finitely many points. Let f and g be two convex functions mapping D to R. A mathematician (let's call him Hermann) would like to prove that
f(x) <= g(x)
for all x in D. Now Hermann reasons as follows. He knows (from reading Papadimitriou and Steiglitz, for example) that an affine function achieves its extreme values at the extreme points of a convex domain. From this he concludes that a convex function achieves its maximum on the extreme points of a convex domain (can you also make this deduction? Hint: the graph of a convex function lies above its tangent lines.). "Aha!" he exclaims. "All I need to do is check that f(x) <= g(x) on the extreme points of D. But there are finitely many of these -- they're just the corners of the polytope!"

You'll agree that verifying f(x)<=g(x) on the finitely many corners of D is, in general, a much simpler task than doing this for all of D. Is there something wrong with Hermann's reasoning, however?

[This is in no way meant to imply that any mathematician named Hermann ever claimed anything of this sort!]
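For readers who'd rather experiment than reason it out, here's a numerical probe of Hermann's argument -- the choice of f and g below is mine, and both are convex on the simplest possible polytope, D = [0, 1]:

```python
# A numerical probe of Hermann's reasoning (my own toy example --
# both functions below are convex on D = [0, 1]).
f = lambda x: x          # convex (in fact affine)
g = lambda x: x * x      # convex

# Hermann's check passes on the extreme points of D...
for corner in (0.0, 1.0):
    assert f(corner) <= g(corner)

# ...so does that settle f <= g on all of D? Check the interior:
x = 0.5
print(f(x), g(x), f(x) <= g(x))  # 0.5 0.25 False
```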

## Friday, August 17, 2007

### Blog update & open thread

After months of cluelessness, I finally figured out how to make recent comments appear in that tab you now see on the right; many thanks to the anonymous commenter! No more digging through old posts to see who's left a comment (I've occasionally discovered comments on several week old posts.) Any idea how to make that "recent comments" list longer than 5 (blogger doesn't seem to give me that option)?

Following Daniel's suggestion, I'll make this an open thread. Post your problems (open or solved), tell some jokes, start flamewars -- it's all good! [I've never had to remove a comment or "moderate" in general, but don't push me...]

A CMU undergrad (CS) asks for some generic research advice and I promise to devote a post to this shortly.

## Wednesday, August 15, 2007

### Trip report + misc.

Back in Israel; it's good to be home. Now that the travels are over, I can look back and say "it wasn't that bad" -- but that's not what I would've told you when I was within a hair of getting arrested for refusing to give up my toothpaste at an airport security screening.

My submission got a "distinguished contribution" award at MLG (in lieu of "best paper" since these are extended abstracts). The analysis and probability workshop at Texas A&M was very intense, reminding me yet again how little of my so-called field I know.

"Meaty" content has been sparse here but I have a good excuse. I'm trying to switch gears from a problem-solving to a paper-writing mode. This requires less creativity and more discipline, but is absolutely indispensable -- both for one's career and for keeping oneself honest.

There's no shortage of high-quality, educational and entertaining material on the web (check out the links on the right), thus I feel no pressure to "deliver" to any readership. Speaking of which, if you're a regular reader of this blog, how about leaving a comment? Just so I know the blog has regular (or any) readers. [Not that a lack of readers has ever discouraged me from writing.] Also, feel free to use the comments to make suggestions regarding what you'd be interested in seeing here. Well enough chit-chat. Back to work.

## Wednesday, August 1, 2007

I'm writing from Florence, Italy, where I'm attending the Mining and Learning with Graphs workshop. The paper I'm presenting is the Universal Kernel one. On the off-chance you're reading this and attending the workshop, say hi!

Then it's off to Texas for Concentration Week, namely: "Probability Inequalities with Applications to High Dimensional Phenomena". My talk: Obtaining measure concentration from Markov contraction. Slides will be online soon. [I'm intentionally blurring the distinction between Leonid and Aryeh (they're really the same name); hopefully, this won't cause confusion. Rule of thumb: if we're speaking English, use Leo. If we're speaking Hebrew, Aryeh. If we're speaking Russian, you know what to call me.]

## Tuesday, July 24, 2007

The multiplicative Chernoff bound states that if X is the average of n independent, 0-1 random variables X_i, each with mean p, then

P(|X-p| > tp) < 2exp(-npt^2 / 3).

In other words, the empirical average of independent binary random variables is exponentially concentrated about their true mean. The probability that we're off by a multiplicative factor of t is subgaussian in t. There is a catch, however: rare events are difficult to estimate. Thus, if p is tiny, the bound becomes quite loose. Indeed, if you're flipping a coin whose probability of heads is 10^-6, you can easily see a thousand flips without a single head.

But what about the following "trick" to beat Chernoff? Since X_i=1 is a rare event, let us define Y_i = 1-X_i. Now Y, the average of Y_1,...,Y_n, has expected value p'=1-p, which is large if p is small. So instead of obtaining a loose bound for X in terms of p, we obtain a good bound for Y in terms of p' -- yet X and Y are related in a very simple way. Does this trick really work or have I cheated somewhere?
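For those who want a numerical head start, here's a sketch (my own illustration, with concrete numbers plugged into the bound above):

```python
import math

# Plugging numbers into the multiplicative bound above:
# it controls P(|avg - mean| > t * mean).
def chernoff_bound(n, mean, t):
    return 2 * math.exp(-n * mean * t * t / 3)

n, p = 1000, 1e-6
t = 0.5  # we'd like to know p to within 50% relative error

# Bound on P(|X - p| > t*p): vacuous, since p is tiny.
print(chernoff_bound(n, p, t))  # ~2.0, useless

# The "trick": Y = 1 - X has mean p' = 1 - p. The same t now bounds
# P(|Y - p'| > t*p') -- but |Y - p'| = |X - p|, so this concerns
# deviations of size t*(1-p) ~ 0.5, not t*p = 5e-7. The bound is
# tiny, but it answers a question about a much larger deviation.
print(chernoff_bound(n, 1 - p, t))
```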

## Tuesday, July 17, 2007

to Weizmann and slowly settling in. Wrapping up some old projects and excited to start some new ones. Stop by Ziskind room 302 and say hello. FYI: my name is אריה in Hebrew.

## Friday, June 29, 2007

### On the Origin of Languages

Re-reading the intriguing book by Merritt Ruhlen: On the Origin of Languages: Studies in Linguistic Taxonomy. The basic premise is that, contrary to widespread belief, the tools of comparative linguistics can be applied beyond the Indo-European family to reconstruct the roots of a world proto-language. Though it sounds far-fetched at first, Ruhlen makes a compelling case for his methodology, and gives some world etymologies that I can't resist reproducing below. It gives me particular satisfaction to see these ancient roots manifest themselves in the languages I read, write and speak: Russian, English, Hebrew, Latin. The asterisk indicates an unattested form.

KAMA 'hold (in the hand)'. Ruhlen gives the Proto-Afro-Asiatic root *km, from which we get the Arabic kamasa 'seize, grasp'. Could the Hebrew חמש 'five' be related (as in, the five fingers of the hand)? The Indo-European cognate is *gemo, which appears in Russian as жму 'I press'.

My favorite: MANA 'to stay (in a place)'. Proto-Afro-Asiatic: *mn, which manifests in Hebrew as אמן. This root means 'true, enduring', and is borrowed by many Indo-European languages as amen. Of course, the I-E family has this root occurring natively as well -- as the Proto-IE *men. The Latin manere will be more familiar to English speakers as remain.

Another favorite: MENA 'to think (about)'. Hebrew: מנה 'to count'; English: mind, mental; Russian: мнить.

I'll close with the colorful PUTI 'vulva'. Hebrew speakers will immediately recognize this as פות, via the Proto-Afro-Asiatic *pwt 'hole, anus, vulva'. Speakers of Romance languages might be pleased to learn that when they curse a woman as puta(-na), they're invoking an ancient root, dating back tens of thousands of years!

A final note, and an appeal to my more professional linguist readers. Ruhlen writes: "the Indo-European family has been established beyond doubt," and this has been my belief ever since I began to amateurishly dabble in comparative linguistics. However, I recently had an argument with a computational linguist/computer scientist/mathematician who claims that the IE-family is "merely" a hypothesis, and a rather controversial one at that. Does anyone know of a reputable linguist who doubts the common origin of the so-called Indo-European languages, and questions the basic structure of reconstructed Proto-IE?

Update: More world etymologies are available online.

Update II: To be fair and balanced, I'm linking to a harsh critique of Ruhlen and his methods, with a hat tip to Cosma.

## Thursday, June 14, 2007

### Shameless self-promotion

Some good news on the publications front. My paper with Kavita Ramanan has been accepted (with minor revisions) to the Annals of Probability. If you just want the main idea (actually, a much simpler proof of a more general form of the main result), this is the paper to read.

Another recent acceptance is my Universal Regular kernel extended abstract, to appear in MLG'07. It's a short 4-page writeup, and if you can resolve the issue of computing K_n, I guarantee you fame and fortune.

I'm hanging out at the FCRC conference (Mehryar is presenting our Rational Kernels paper with Corinna at COLT). So if you're around, find me and say hi, and most definitely come to Mehryar's talk on Thurs. at 2:40. I'll try and blog about the conference a bit later, but we always have Scott to count on.

## Monday, June 11, 2007

### Thesis online

After abusing our department coordinator's patience beyond all common decency, I've stopped making revisions on my thesis. I know, I know -- a week from now I'll casually glance at it and see something I'll want to change. But major OCD notwithstanding, enough is enough. I'm putting it online for public perusal; don't all rush in to download it all at once now. As always, questions and comments are more than welcome.

## Saturday, June 9, 2007

### Idea for Sci-Fi story

How can you tell if the reality you experience is "real" or is just a giant computer simulation? Philosophers realized long ago that of course you cannot; Hofstadter and Deutsch make the a posteriori obvious point that the question itself is meaningless. Any physical process may be viewed as a computation and therefore a "simulation".

But what if we allow the possibility of simulator malfunctions? Address errors, memory leaks, unknown-error-must-shut-down type things. What would it feel like to be in a simulation that suddenly displayed such artifacts? Parts of your universe are working fine as before, but you might locally observe very strange discontinuities and irregularities.

There are sophisticated theories of spacetime defects that I lack the mathematical apparatus to understand (any quantum gravitists want to help out?). Might any of these defects be explicable as computer bugs in the universal simulator? Could one at least get a decent sci-fi story out of this? I'm sure this vein has already been explored -- can anyone point me to a good story? Anyone up to the task of writing one?

## Tuesday, May 29, 2007

### Misc. update + possible flamewar

Posting has been and will be sparse over the next month or so as I transition to my new location (starting a postdoc at Weizmann).

What I'm working on: a tantalizing decoupling conjecture and a concentration bound for adaptive Markov processes (with Anthony Brockwell). Ask me about these.

Thoughts on leaving Pittsburgh: it's a shame I only met some of the people so late. Seems like in my last months at CMU I made a whole slew of new friends and colleagues -- anywhere from fellow mountain bikers (and I've got fresh scars to prove that) to fellow mathematicians, philosophers, and shmoozers. Where have you guys been for the past 5 years? A better question is where have I been: stuck in my office. I'm not sure there's necessarily a moral here (if I'd done more shmoozing and less work I'd probably still be stuck in that office) -- but it's always sad to see what one has been missing out on.

And now for the flamewar: Cosma and I were discussing The Bell Curve over a beer (or two... or three...). Now smearing Herrnstein and Murray's book as pseudo-scientific racist drivel is a favorite pastime of the Left (and not having read it isn't much of a deterrent). Cosma points out that conservatives can also pile on. In 2005, Murray wrote a lengthy and copiously documented rebuttal (well, more like a synopsis of the debate that their book had been generating for 11 years). Two must-read books for all equality-across-all-groups ideologues are Steven Pinker's The Blank Slate and Nicholas Wade's Before the Dawn.

There is no doubt that free scientific inquiry is severely curtailed on certain topics. Just try getting a grant to do climate research if you dare question anthropogenic global warming. The Larry Summers affair illustrates that even "mild, speculative, off-the-record remarks about innate differences between men and women" can get a university president fired. Yet differences between ethnic groups and the sexes do exist as a matter of verifiable empirical fact (please take the time to read Pinker and Wade before calling me names).

Once again, I'm only too glad that the "controversy" generated by math is of the easily dismissed crackpot type, not the type that costs one his career.

## Monday, May 21, 2007

### Serious attempts at P?=NP ?

Here's a question I'm hoping my readers will help out with. It seems that every week someone comes out with a "proof" that P=NP, and only slightly less frequently that P!=NP. Most of these are written by amateurs who don't even understand the problem.

Have there been any attempts by reputable mathematicians to resolve the issue? Lindemann produced several flawed proofs of Fermat's Last Theorem a century before Wiles got it right, and he was certainly no amateur. Does anyone know of any credible proof attempts, with subtle, nontrivial mistakes?

## Wednesday, May 16, 2007

### Student projects

The semester is over, the grades should be in by now, and most of my students from FLAC are graduating. One of the components of the course was a research project, during which the instructors mentor the students on an individual basis. This has been quite a rewarding experience, since I managed to get four of my students interested in some deep and fascinating problems in automata theory, with connections to my own work. I note that all the students I mentored did a fine job and a couple of them taught me new things. But in this post, I'd like to showcase the projects with the strongest connection to automata theory and machine learning.

Jeremiah Blocki is that brave soul who took on the notorious problem #3. First, here is that long-overdue writeup where I define the universal regular kernel. In his paper, Jeremiah gives closed-form expressions for K_n(x,y) for short x and y, as well as proving some simple lower bounds on the complexity of K_n.

Vinay Chaudhary became interested in my work with Corinna Cortes and Mehryar Mohri on learning piecewise testable languages with the subsequence kernel. Aside from understanding our somewhat involved word-algebraic proof of linear separability, Vinay had to learn a whole bunch of subtle machine-learning notions, essentially on his own. Having gained a command of margins, support vectors, embeddings and kernels, he embarked on an investigation of the empirical margin of piecewise testable languages. Vinay produced an excellent piece of research, with some tantalizing leads.

Matthew Danish considered a problem that I'd attempted many moons ago and put aside -- namely, one of characterizing the languages that one obtains by taking finite Boolean combinations of modular n-grams. (A modular n-gram is the set of all strings in which a given contiguous substring occurs k mod n times.) Matt also had to master abstruse concepts such as syntactic monoids and morphisms, and produced a solid paper.

Jonah Sherman decided to aim high and attempt the star-height problem. I remember how mesmerised I was by this problem when I first encountered it some four years ago. When Jonah had asked me about its importance, I replied that if we can't answer such natural questions about regular languages then we are in dire need of better tools! That was good enough for him, and he dove into some rather dense semigroup theory, even rediscovering the rather nontrivial result that all languages recognized by finite commutative monoids have star height of at most one. Impressive work done by a college junior, check it out!

## Thursday, May 10, 2007

### Joined the club

This morning I defended my thesis and have been promoted from mere Leo to Dr. Leo, thus upgrading this blog to the status of being run exclusively by PhDs. The quality of the posts should improve accordingly -- stay tuned!

## Tuesday, May 8, 2007

### Riesz norms

Let V be a vector space endowed with a norm ||.||. The norm is called absolute if || |x| || = ||x|| for all x in V, where |.| is applied componentwise, and monotone if ||x|| <= ||y|| whenever |x| <= |y| componentwise.

Norms having these properties are also called Riesz norms; the two conditions are equivalent for finite-dimensional spaces (see Horn and Johnson).

What about infinite-dimensional spaces? Does anyone have an example of a normed sequence space where the norm satisfies one of the conditions but not the other? What about a function space?

For sequence spaces I think I have a proof that the two conditions are equivalent (by a handwavy appeal to Lebesgue's Dominated Convergence theorem). For function spaces, I suspect there's a counterexample. But this is all random 4am musing -- so please catch me if I'm disseminating lies!

## Thursday, May 3, 2007

### Total variation revisited

This is a follow-up on my earlier post on the total variation distance. As I already mentioned, my visits to Stanford and Berkeley were immensely useful, not least because of the opportunity to meet with the experts in a field to which I aspire to contribute. In that earlier post on total variation, I gave some characterizations and properties of TV, fully aware of the low likelihood that these are original observations. Sure enough, the relation

||P - Q||_TV = inf P(X != Y)

(with the infimum over couplings (X, Y) of P and Q) has been known for quite some time; see the book-in-progress by Aldous and Fill, or the one by Pollard.

The relation

||P (x) P' - Q (x) Q'||_TV <= ||P - Q||_TV + ||P' - Q'||_TV

(where (x) denotes the product measure) also seems to be folklore knowledge; I have not seen a proof anywhere and give a simple (non-probabilistic) one here (Lemma 2.6). Amir Dembo suggested that I re-derive this in a probability-theoretic way, via coupling. Here it is.

Recall that if P and Q are probability measures on a set S, then
||P - Q||_TV = inf P(X != Y),
where the infimum is taken over all the distributions on S x S having marginals P and Q, resp., and the random variables X and Y are distributed P and Q. Any such joint measure on S x S is called a coupling, and one achieving the infimum is called a maximal coupling.

Applying this to our situation, let (X, Y) be a maximal coupling of P and Q, and define (X', Y') similarly for P' and Q'. Notice that ((X, X'), (Y, Y')) is a (not necessarily maximal) coupling of P (x) P' and Q (x) Q'. Then
||P (x) P' - Q (x) Q'||_TV <= P((X, X') != (Y, Y')) <= P(X != Y) + P(X' != Y') = ||P - Q||_TV + ||P' - Q'||_TV.

## Thursday, April 26, 2007

### עם ישראל חי

A major mazal tov to my co-bloggers Aaron and Steve on the birth of their sons. Other friends of mine with recent births are Anat & Oren and Tamar & Maxim. In the touchy-feely, lovey-dovey spirit of a baby post, can you guys link to some ridiculously cute baby pictures?

## Sunday, April 22, 2007

### Breakdown of discrete intuition

Most of the classical inequalities -- Jensen, Hölder, Cauchy-Schwarz -- work for integrals just as well as for sums. In fact, it's best to think of these as integral inequalities with respect to a positive Borel measure (which includes the Lebesgue and the counting measures as special cases, and subsumes both sums and integrals). This can keep our intuition from being excessively taxed; I, for one, prefer to reason about sums and carry that intuition over to integrals. Are there instances where that intuition can fail?

Indeed there are, as Michael Steele shows us in Exercise 8.3 of his book. In part (a), we're asked to show that a certain inequality holds for all nonnegative sequences (a_k) (the solution is given in the book, but readers are invited to attempt this one cold). In part (b), we're admonished against the careless conclusion that the analogous inequality holds for (integrable) functions f -- does anyone have an example where the latter fails?

A heuristic to guide one's intuition is the following principle of Hardy, Littlewood, and Pólya, called "homogeneity in Σ." That is, one considers Σ as a formal symbol and checks the order to which an expression is "homogeneous in Σ." In this case, the lhs is homogeneous of order 2 while the rhs is homogeneous of order 3. According to the heuristic, this incompatibility destroys the integral analog. Anyone have more such examples where the integral analog fails? Could there be a general theorem lurking around?
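Here's a simpler instance of the same phenomenon (my own example, not Steele's), which the same heuristic catches: sum(a_k^2) <= (sum a_k)^2 holds for every nonnegative sequence, but the integral analog fails -- and indeed the two sides have mismatched orders of homogeneity:

```python
# Discrete: sum(a^2) <= (sum(a))^2 holds for any nonnegative sequence
# (lhs is homogeneous of order 1 in the sum, rhs of order 2).
a = [0.3, 1.2, 0.01, 2.5]
assert sum(x * x for x in a) <= sum(a) ** 2

# Integral analog fails: take f = 2 on [0, 1/4] and 0 elsewhere.
int_f = 2 * 0.25    # integral of f   = 1/2
int_f2 = 4 * 0.25   # integral of f^2 = 1
print(int_f2, int_f ** 2)  # 1.0 0.25 -- the inequality is reversed
```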

## Tuesday, April 17, 2007

### I wasn't going to

touch the tragic events at Virginia Tech with a ten-foot pole. People much better informed and more articulate than I are all over this; go to Instapundit for updates and spot-on commentary. However, being in academia, I feel an obligation to occasionally shout out against the insanity that has become ivory-tower conventional wisdom.

Just yesterday someone was in my office telling me how it's impossible to get guns in Japan, and how even the criminal gangs have to manufacture special-purpose bats. So much for that tripe.

I have no desire to turn this into a gun control flamewar; if you're going to comment to the effect that more gun control could've prevented this tragedy, at least entertain me by explaining how it's Bush's fault (and bonus points if you manage to link this to the Israel lobby).

I'm writing about the general powerlessness and impotence fostered by a paternalistic government and eagerly accepted by the cud-chewing masses. Read this article by a Virginia Tech grad student. Money quote:
I am licensed to carry a concealed handgun in the commonwealth of Virginia, and do so on a regular basis. However, because I am a Virginia Tech student, I am prohibited from carrying at school because of Virginia Tech's student policy, which makes possession of a handgun an expellable offense
I am in the same boat. I have a concealed carry permit, have had extensive safety and tactical training, and am even an instructor. Not that this is of any use on campus, where firearms are limited to the omnipresent and omnipotent police. Every time a student gets mugged or assaulted (every month or so, on average), campus police send out a report, with the helpful advice "If confronted by an assailant, don't resist. If he wants your wallet, purse or backpack, give it up." Not a word about carrying even a sub-lethal weapon such as pepper spray. Just give the bad guy what he wants and hope for the best.

Tragedies like this can be prevented if ordinary citizens are allowed to defend themselves instead of being infantilized. A Virginia Tech professor told that student whose article I linked to, "I would feel safer if you had your gun." In a similar vein, my Rabbi has specifically allowed me to carry my pistol in his house, even on Shabbat (when ordinarily carrying items like keys and a wallet is prohibited). As long as idiots like this continue to be the voice of campus officials, we can expect more attacks on soft targets such as schools -- whether isolated acts of insane individuals, or planned terrorist attacks.

[Update May 11, 2007] At least I wasn't kicked out of my school for expressing these views...

## Monday, April 16, 2007

### Fear and (self) loathing in Pittsburgh

... and I'm not even talking about my thesis-writing, which is somewhat behind. I am talking about J. Michael Steele's book, The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities. As someone who's made inequalities his bread and butter over the past couple of years, I figured this would be mandatory reading. Even with Steele's enlightening (and wryly humorous) explanations, this is definitely not light reading. Many of the challenges -- even with the hints -- are pretty darn hard, and I'm terrified to think how I'd approach these without the hints. Then again, there's a reason why some of these inequalities have names like Cauchy, Schwarz, and Hilbert attached to them... In the words of my officemate, "I'll become a better person after working through this book." Probably true for most of us.

## Sunday, April 15, 2007

### A clique bound on coloring numbers?

We assigned the problem of showing that 3COLOR is NP-complete, meaning that unless P=NP, there is no polynomial time algorithm to determine whether a given graph is 3-colorable. The classical reduction is from 3-SAT, via some clever gadgets.

One student had the idea that clique numbers could be used to resolve graph coloring questions. This is partly right: if a graph contains a K4 (a 4-clique), it is certainly not 3-colorable. What about the other direction?

Well, let's put it this way: testing whether a graph contains a K4 is a polynomial-time procedure (the brute-force search is O(n^4)). So either we've just proved P=NP, or... there exist graphs that don't contain a K4, yet are not 3-colorable. Can anyone exhibit one?
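For the curious, the standard example is the Grötzsch graph (the Mycielskian of the 5-cycle): it is triangle-free -- so certainly K4-free -- yet needs 4 colors. Here is a brute-force Python check (my own sketch, not part of the assignment):

```python
from itertools import combinations, product

# Grötzsch graph = Mycielskian of the 5-cycle: 11 vertices, 20 edges,
# triangle-free (hence K4-free), yet not 3-colorable.
n = 11  # vertices 0..4: outer 5-cycle, 5..9: "mirror" vertices, 10: apex
edges = set()
for i in range(5):
    edges.add(frozenset((i, (i + 1) % 5)))        # cycle edge
    edges.add(frozenset((5 + i, (i + 1) % 5)))    # mirror i ~ neighbors of i
    edges.add(frozenset((5 + i, (i - 1) % 5)))
    edges.add(frozenset((10, 5 + i)))             # apex ~ every mirror vertex
E = [tuple(e) for e in edges]

def has_k4():
    # check every 4-subset of vertices for all six required edges
    return any(all(frozenset(p) in edges for p in combinations(q, 2))
               for q in combinations(range(n), 4))

def is_3_colorable():
    # exhaustive search over all 3^11 colorings
    return any(all(c[u] != c[v] for u, v in E)
               for c in product(range(3), repeat=n))

print(has_k4(), is_3_colorable())  # False False
```

So the clique bound fails in the other direction: no K4, and still no 3-coloring.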

## Wednesday, April 11, 2007

### A neat analysis problem

I'm in thesis-writing crunch-mode, and though there's lots I could blog about, I simply don't have time. Take a look at this problem posted by Guy Gur-Ari, a student at the Hebrew University in Jerusalem:
Let f:[0,1]->R be a real function such that f has a limit at each point.
Does f have at least one continuity point?
It's definitely a good one to work out (try it before you go to his blog for the solution), and check out the rest of his blog while you're at it!

Update: on the topic of analysis, I ordered and am skimming Counterexamples in Analysis -- a must-read for any student of mathematics, though to my relief I seem to already be familiar with most of these. Here's a good one: give an example of a nonconstant function f:R->R that is periodic but has no smallest period. Anyone?

## Monday, April 2, 2007

### חג כשר ושמח

No need to be alarmed -- just a standard Passover greeting ("a kosher and happy holiday"). Blogging will continue with absolute regularity.

## Sunday, April 1, 2007

### Contraction bounds spectral gap?

Let A be a column-stochastic matrix. Compute its (complex) eigenvalues, take absolute values, sort in decreasing order, and let lambda2 be the second value on the list (the first is always 1, by Perron's theorem). Lambda2 is (closely related to) the spectral gap of the Markov kernel corresponding to A.

Let theta be max TV(x,y), where x and y range over the columns of A and TV(x,y) is 1/2 the L1 distance of the two vectors x and y. Theta is also called the (Doeblin) contraction coefficient of the Markov operator induced by A.

It's easy to see that lambda2 and theta are numbers between 0 and 1. Here is a simple example where lambda2=0 while theta=1:

0 0 0
0 1 1
1 0 0

It seems that we always have lambda2 <= theta, but I don't have a proof of this. I also can't imagine that I'm the first to observe this -- can someone point me to a reference? Maybe suggest a proof?
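For anyone who wants to experiment, here is a Python sketch (using NumPy; the helper names are mine) that computes lambda2 and theta for the matrix above and spot-checks the inequality on random column-stochastic matrices:

```python
import numpy as np

def lambda2(A):
    # second-largest eigenvalue modulus of a column-stochastic matrix
    mags = sorted(np.abs(np.linalg.eigvals(A)), reverse=True)
    return mags[1]

def theta(A):
    # Doeblin contraction coefficient: max total-variation distance
    # between pairs of columns (TV = half the L1 distance)
    n = A.shape[1]
    return max(0.5 * np.abs(A[:, i] - A[:, j]).sum()
               for i in range(n) for j in range(i + 1, n))

A = np.array([[0., 0., 0.],
              [0., 1., 1.],
              [1., 0., 0.]])
print(lambda2(A), theta(A))  # lambda2 = 0, theta = 1, as claimed

# spot-check lambda2 <= theta on random column-stochastic matrices
rng = np.random.default_rng(0)
for _ in range(100):
    B = rng.random((4, 4))
    B /= B.sum(axis=0)          # normalize each column to sum to 1
    assert lambda2(B) <= theta(B) + 1e-10
```

On the 3x3 example this gives lambda2 = 0 and theta = 1, matching the claim above.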

Update: thanks to David Aldous and Cristopher Moore for telling me how to prove this (well-known) fact. I won't elaborate, since nobody else seems to care, but if anyone does, let me know and I'll post a proof. I'd be thrilled to see an elementary proof (i.e., one not relying on Perron-Frobenius), so consider that a challenge!

## Saturday, March 31, 2007

### A challenge to the AI-deniers

Forget global warming and intelligent design. The real debate is whether or not (strong) AI is possible. Now this isn't much of a debate for myself or most people I encounter in my academic circles. We read Dennett and Hofstadter while still in diapers, and use "dualist" as some sort of slur. Of course, occasional dissent does creep in. At a recent open house for newly admitted PhD students, I was talking to several colleagues and was surprised when one -- a respected machine learning theorist -- was willing to stick his neck out against AI. Without quite committing to dualism, his argument (if I recall correctly) was along the lines of Penrose's; if I may paraphrase, consciousness is just too freaky to be explained by classical mechanics, and so must be swept under the quantum gravity rug. My friend Gahl (I surmise from our conversation) is basically being indoctrinated to reject strong AI in her freshman class.

So, I thought I'd use this space to give a strong AI opponent (or several) an opportunity to defend their views. Tell us why machines will never think or be conscious. Is it the missing ghost? Something magical about neural tissue? A mis-application of Gödel's theorem?

Have I caricatured your stance? Here's your chance to set the record straight!

## Monday, March 26, 2007

### Is math tautological?

Scott throws the following teaser:

One example is the surprisingly common view that "all mathematical propositions are tautologies," and therefore can’t convey any new information
and of course I can't help but take the bait. As you surely recall from this discussion, I'm a firm Platonist:
Pythagoras's theorem is a statement about objects that have no width, mass, or time duration. It is not a statement about depressions in sand, sticks, or strings. [...] The fact that in a right triangle, the sum of the squares of the legs equals the square of the hypotenuse was true long before Pythagoras or even planet Earth was around; that it was discovered by some humans (long before Pythagoras, actually) has no bearing on its validity.
However, I had also dug myself into a bit of a hole:
Yes, the boundary between "discovery" and "invention" is indeed blurry; I am not sure I can give a meaningful answer to whether chess was invented or discovered.
And now, thanks to Scott, I think I can dig myself out of that hole. We are going to define two realms: E (for Euclid) and B (for Borges). E contains all the mathematical "tautologies". Thus, if you seed E with the definition of a group, E will also contain all the facts about groups, including the theorems we've discovered, ones we've yet to discover, ones we'll never discover, and ones that are true but unprovable. B is a much more boring set -- it is the collection of all possible statements, true and false, about anything. It includes a statement and proof of Pythagoras's theorem (and its negation with a false proof), a description of the game of chess (and its infinite variations), as well as lots of pure gibberish.

Now I can make a meaningful distinction between invention and discovery. We discover elements of E, but invent elements of B. We discover mathematical truths, but invent proof techniques. The game of chess belongs squarely in B, and thus is an invention.

And what bearing does this have on Scott's comment? Well, E consists of self-contained truths, or tautologies. We can only access a tautology via a proof. The heart of math isn't making true statements, it's finding clever proofs!

### Paul Cohen ז"ל + language, human and otherwise

The discussion following Scott's post on the passing of Paul Cohen has segued into learnability of automata and languages, a topic I take a keen interest in. I've left some comments. Check it out!

## Thursday, March 22, 2007

### Geometry question resolved + debriefing

First, I should note that the margin question came up in my work with Corinna Cortes and Mehryar Mohri; see our ALT'06 and (to appear) COLT'07 papers.

Next, a shameful admission: I seem to have forgotten how logical contrapositives work. As Avrim reminds me, the perceptron algorithm will terminate in O(1/gamma^2) steps if the data are separable by a margin of at least gamma. Let me quote Avrim:
what you show is that any separator must have w_n that is exponential in w_1, and that furthermore, flipping the sign of w_1 makes the separator inconsistent. That implies that for any consistent separator, there is an exponentially small change in its angle that makes it inconsistent, which means it has an exponentially small margin.
Finally, a bit of philosophy of math. One should resist the temptation to run numerical simulations, taking the time to think things through with pen and paper first.

## Wednesday, March 21, 2007

### Progress on the margin problem

Avrim Blum alerts me to something I should've remembered from his class -- one can force a perceptron to make exponentially many mistakes (i.e., achieve an exponentially small margin) with a properly concocted decision list. In fact, digging up my solution to his '04 final exam problem, here is such an example. Assuming my readers think in MATLAB (much as Scott thinks in BASIC), consider the following function:

function [X, Y] = badmargin(n)
X = eye(n);
for k = 2:2:n
    X(1:1+n-k, k:n) = X(1:1+n-k, k:n) + eye(n-k+1);
end
X = [[X; zeros(1,n)] ones(n+1,1)];
Y = (-ones(n+1,1)).^([2:n+2]');
end

Here's the output [X Y] for n=8:
1 1 0 1 0 1 0 1 1 1
0 1 1 0 1 0 1 0 1 -1
0 0 1 1 0 1 0 1 1 1
0 0 0 1 1 0 1 0 1 -1
0 0 0 0 1 1 0 1 1 1
0 0 0 0 0 1 1 0 1 -1
0 0 0 0 0 0 1 1 1 1
0 0 0 0 0 0 0 1 1 -1
0 0 0 0 0 0 0 0 1 1

(the vectors x in {0,1}^9 are treated as rows, and the labels y = +/-1 are appended at the end).
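For readers who don't think in MATLAB, here is a NumPy port of badmargin (my own translation; it should behave identically to the snippet above):

```python
import numpy as np

def badmargin(n):
    # NumPy translation of the MATLAB function above
    X = np.eye(n)
    for k in range(2, n + 1, 2):                    # k = 2, 4, ..., like 2:2:n
        X[:n - k + 1, k - 1:] += np.eye(n - k + 1)  # add a shifted diagonal
    X = np.hstack([np.vstack([X, np.zeros((1, n))]),
                   np.ones((n + 1, 1))])            # append zero row, ones column
    Y = (-1.0) ** np.arange(2, n + 3)               # labels alternate +1, -1, ...
    return X, Y

X, Y = badmargin(8)
print(np.column_stack([X, Y]).astype(int))
```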

Using badmargin(n), for n=2 to 20, to generate labeled data sets and running SVM on these, we get the following plot:

which is highly suggestive of exponential decay. It remains to actually prove that this explicit construction achieves exponentially small max-margin. Any takers?

## Tuesday, March 20, 2007

### Elementary Maurey's theorem + ode to interaction

Maurey's theorem (rather, one of them) states that if P is the Haar (i.e., normalized counting) measure on S_n (the group of all permutations on n elements) and d is the normalized Hamming metric on this group, then
P(f-Ef > t) <= exp(-nt^2 / 8),
where f:S_n -> R is 1-Lipschitz w.r.t. d. See Schechtman's paper for a proof; it is done via the notion of metric space length.

In our conversation at Berkeley, Jim Pitman suggested a simple way to parametrize uniformly random permutations as an independent process. At each step i, 1<=i<=n, you uniformly pick an integer between 1 and i as the location in which to insert the next element. Thus we can recover Maurey's theorem in a simpler way, and with better constants, via McDiarmid's inequality:
P(f-Ef > t) <= exp(-2nt^2),
where P and f are as above. At first I couldn't believe it would be this simple, but Gideon tells me that I'm right.
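Here is the insertion process in Python (a sketch of Pitman's parametrization as I understand it), with an empirical uniformity check on S_3:

```python
import random
from collections import Counter

def random_perm(n, rng):
    # Pitman's parametrization: at step i (1 <= i <= n), insert the element
    # i at a uniformly chosen position among the first i slots. The n
    # choices are independent, which is what makes McDiarmid applicable.
    perm = []
    for i in range(1, n + 1):
        perm.insert(rng.randrange(i), i)
    return tuple(perm)

# sanity check: the 6 permutations of {1,2,3} come out (near-)uniformly
rng = random.Random(0)
counts = Counter(random_perm(3, rng) for _ in range(60000))
print(sorted(counts.items()))
```

Since the 1 * 2 * ... * n sequences of insertion choices biject with the n! permutations, the output is exactly uniform.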

The greater moral of this story is that I should talk to people more. I can't overstate the value of this blog as a research tool -- look no further than the high quality discussions here, here, here, and here (and this isn't counting the email exchanges that some posts generate). But you can't count on the experts to always stumble on your blog post -- nothing beats talking to one in person!

## Monday, March 12, 2007

### Miscellanea

Slow blogging this week as I'm in California, giving talks at Stanford and Berkeley (yeah, it's the same talk; they're both probability seminars and this is essentially my thesis talk). Swing by if you can!

## Tuesday, March 6, 2007

### Geometry of DFAs

As before, define A_n to be the set of all deterministic finite-state automata (DFAs) on n states over some fixed alphabet Sigma. As usual, we require that every DFA start in state 1, meaning that the initial state is always q_1. Define the following metric on A_n: for M_1, M_2 in A_n, let M_1 Δ M_2 be their symmetric-difference DFA -- meaning that M_1 Δ M_2 accepts precisely the strings that are in the language of either M_1 or M_2 but not in both. Constructing such an automaton is easy, as is minimizing it. So, assuming M_1 Δ M_2 is in minimized form, define d(M_1, M_2) = log |M_1 Δ M_2|, where |M| is the number of states in the DFA M. It is straightforward to verify that d is a valid (pseudo-)metric on A_n.

This notion came up in my conversation with Jason Franklin, who was looking for a natural metric on automata, for security-related reasons. I suggested the metric above, and I'm wondering if it's appeared anywhere in literature. It's a natural measure of the complexity of the language on which two DFAs disagree.
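To make this concrete, here is a minimal Python sketch of the symmetric-difference construction via the standard product automaton (the DFA encoding and function names are mine; minimization, e.g. by Hopcroft's algorithm, is omitted):

```python
from itertools import product

# A DFA is encoded as (states, delta, start, accepting), where delta is a
# dict mapping (state, symbol) -> state.

def symdiff_dfa(A, B, alphabet):
    """Product construction: accept exactly the strings in L(A) XOR L(B)."""
    (QA, dA, sA, FA), (QB, dB, sB, FB) = A, B
    Q = list(product(QA, QB))
    d = {((p, q), c): (dA[p, c], dB[q, c]) for (p, q) in Q for c in alphabet}
    F = {(p, q) for (p, q) in Q if (p in FA) != (q in FB)}  # XOR of acceptance
    return Q, d, (sA, sB), F

def accepts(D, w):
    Q, d, s, F = D
    for c in w:
        s = d[s, c]
    return s in F

# toy example -- A: even number of a's; B: strings ending in 'a'
A = ([0, 1], {(0, 'a'): 1, (1, 'a'): 0, (0, 'b'): 0, (1, 'b'): 1}, 0, {0})
B = ([0, 1], {(0, 'a'): 1, (1, 'a'): 1, (0, 'b'): 0, (1, 'b'): 0}, 0, {1})
C = symdiff_dfa(A, B, 'ab')
print([w for w in ['', 'a', 'aa', 'ab', 'ba'] if accepts(C, w)])  # ['', 'a', 'ba']
```

The product automaton has at most |Q_A| * |Q_B| states; the metric is computed on its minimized form.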

Any finite metric space has a well-defined length; the definition may be found in Schechtman's paper. Let (X, d) be any finite metric space. For a_1 >= a_2 >= ... >= a_k > 0, define an (a_1, ..., a_k)-partition sequence of X to be a sequence

{X} = P_0, P_1, ..., P_k

of partitions of X, with P_k the partition into singletons, such that P_i refines P_{i-1} and whenever A is in P_{i-1} and B, C are in P_i with B, C contained in A, there is a bijection phi: B -> C such that d(x, phi(x)) <= a_i for all x in B.
The length of (X, d) is defined to be the infimum of sqrt(a_1^2 + ... + a_k^2) over all such partition sequences.

Bounding the length of A_n has immediate applications to the concentration of Lipschitz functions under the normalized counting measure on A_n, as explained here.

I am interested in computing (really, upper-bounding) the length of A_n. Any takers?