Wednesday, May 16, 2007

Student projects

The semester is over, the grades should be in by now, and most of my students from FLAC are graduating. One of the components of the course was a research project, during which the instructors mentor the students on an individual basis. This has been quite a rewarding experience, since I managed to get four of my students interested in some deep and fascinating problems in automata theory, with connections to my own work. I note that all the students I mentored did a fine job and a couple of them taught me new things. But in this post, I'd like to showcase the projects with the strongest connection to automata theory and machine learning.

Jeremiah Blocki is that brave soul who took on the notorious problem #3. First, here is that long-overdue writeup where I define the universal regular kernel. In his paper, Jeremiah gives closed-form expressions for K_n(x,y) for short x and y, as well as proving some simple lower bounds on the complexity of K_n.

Vinay Chaudhary became interested in my work with Corinna Cortes and Mehryar Mohri on learning piecewise testable languages with the subsequence kernel. Aside from understanding our somewhat involved word-algebraic proof of linear separability, Vinay had to learn a whole bunch of subtle machine-learning notions, essentially on his own. Having gained a command of margins, support vectors, embeddings and kernels, he embarked on an investigation of the empirical margin of piecewise testable languages. Vinay produced a excellent piece of research, with some tantalizing leads.

Matthew Danish considered a problem that I'd attempted many moons ago and put aside -- namely, one of characterizing the languages that one obtains by taking finite Boolean combinations of modular n-grams. (A modular n-gram is the set of all strings in which a given contiguous substring occurs k mod n times.) Matt also had to master abstruse concepts such as syntactic monoids and morphisms, and produced a solid paper.

Jonah Sherman decided to aim high and attempt the star-height problem. I remember how mesmerised I was by this problem when I first encountered it some four years ago. When Jonah had asked me about its importance, I replied that if we can't answer such natural questions about regular languages then we are in dire need of better tools! That was good enough for him, and he dove into some rather dense semigroup theory, even rediscovering the rather nontrivial result that all languages recognized by finite commutative monoids have star height of at most one. Impressive work done by a college junior, check it out!

6 comments:

Andy D said...

Leo, this is really, really impressive. Great job mentoring.

I've been wondering about how to increase undergrad access to high-quality automata puzzles as a gateway into TCS; my Theory of Comp courses were plenty rigorous but never really tapped into the good stuff, puzzle-wise, and never sold students on theory.

So, if you know of any good single sources for students without a mentor like you, I'd love to hear about them.

Aryeh said...

Thanks for your very kind words, Andy. That's a very good question you raise -- and much of my pedagogical approach has been to continually challenge the students with aesthetic, thought-provoking problems, that often have deep connections to other areas of math.

I don't know of any single single such source. As textbooks, I've used Lewis and Papadimitriou as well as Sipser and Kozen; each is excellent in its own way.

The best problems I gave my students (aside, of course, from the time-tested classics) either came from my own research or were "inspired" by mistakes I saw in the homework. Here is a sampling:

Minimal consistent DFA

NFA-DFA blowup

Concatenation vs. star

Conservation of dimension and randomness

Clique bound on coloring numbers

I promise to continue to post new problems as they come up (and they constantly do!) and I ask you and other readers to contribute other good ones.

BTW, I see you're in San Diego; I'll be there in mid-June for the COLT conference -- will you be around?

Aryeh said...

I see I've got two broken links there. The url's are:

http://absolutely-regular.blogspot.com/2007/02/concatenation-implies-star.html

http://absolutely-regular.blogspot.com/2007/02/conservation-of-dimension-and.html

Andy D said...

Leo,
Alas, I can't make it to COLT, but I'll be in town thru the 13th. Let me know if you're around UCSD and/or want to hang out.

I will keep checking your page for puzzles. Thanks for the reading list; I do enjoy the exercise section in Kozen.

I wrote a blog entry a while ago about my love for automata puzzles/theorems. Most of them will probably be old news to you, but maybe you'd enjoy it:

http://andysresearch.blogspot.com/2006/10/in-praise-of-automata.html

Anonymous said...
This comment has been removed by a blog administrator.
pharmacy said...

It is great to hear that your student did a great job with it.