Saturday, January 2, 2010

A cheating Quine?

So I'm teaching a course on automata theory -- again. (A brief personal update: I've started a faculty position at BGU, got married and had a baby -- not in that order.)

I want to teach the recursion theorem this week. The theorem states that no matter what Turing machine one is designing, one can always assume that it has access to its own description.

To me, this always seemed painfully obvious. Once you accept that Turing machines and programs (in MATLAB, say -- to be concrete) are equivalent, the argument comes down to writing a program that can print its own code.

At first, writing a program that prints itself might seem impossible. After all, any sort of program that says PRINT X will be longer than the string X (because it contains X and the PRINT instruction) -- and so X can't be the program's whole description!

Indeed, such a simple strategy for a self-printing program is doomed to failure. However, who says the program must quote itself within itself verbatim? Maybe it can encode a description of itself in some compressed form, and execute a routine that decompresses and prints that description. Indeed, many such programs exist -- they are known as quines (after the great logician and philosopher Quine).

But it seems to me that these quines, while clever, are working too hard. Consider the following simple MATLAB function:

function quinecheat
fid = fopen('quinecheat.m','r');
str = char(fread(fid))';
% remove double line-skips:
str = strrep(str,[char(13) char(10)],char(10));
fclose(fid);
fprintf('%s\n',str);
return

If you save the code above as the MATLAB file 'quinecheat.m' and call quinecheat from the MATLAB command window, you will get a printout of the code.

On the one hand, you can do this in just about any programming language -- and any Turing machine T can assume it's being simulated on some universal Turing machine U and move U's tape head to the beginning of T's description. Also, I believe that this trivial "proof" of the recursion theorem retains the theorem's full power. For example, here is a simple proof that the halting problem is undecidable. Suppose to the contrary that some matlab function H inputs other matlab programs and outputs 1 if the input program halts (and 0 otherwise). Now consider the matlab function D which obtains its own description [D] and calls H([D]). If H([D])=1, D goes into an infinite loop; otherwise, D halts. We've reached our contradiction!

And yet I can't help but feel that I'm cheating somewhere. Is my program quinecheat a valid example of a self-printing program? Is the technique I am suggesting a valid alternative proof of the recursion theorem?

9 comments:

Aryeh said...

So here is a more "legit" quine function -- but it uses the matlab 'eval' command, so is it admissible?


function quine
qstr = char([100 105 115 112 40 91 39 81 123 39 32 110 117 109 50 115 116 114 40 106 41 32 39 125 32 61 32 39 39 39 32 81 123 106 125 32 39 39 39 59 39 93 41]);
Q{1} = 'function quine';
Q{2} = 'qstr = char([100 105 115 112 40 91 39 81 123 39 32 110 117 109 50 115 116 114 40 106 41 32 39 125 32 61 32 39 39 39 32 81 123 106 125 32 39 39 39 59 39 93 41]);';
Q{3} = 'disp(Q{1})';
Q{4} = 'disp(Q{2})';
Q{5} = 'for j=1:11';
Q{6} = ' eval(qstr)';
Q{7} = 'end';
Q{8} = 'for j=3:11';
Q{9} = ' disp(Q{j})';
Q{10} = 'end';
Q{11} = 'return';
disp(Q{1})
disp(Q{2})
for j=1:11
eval(qstr)
end
for j=3:11
disp(Q{j})
end
return

Unknown said...

Leo - you could simply have the turing machine or whatever have some sort of convention for the (fixed) starting "instruction pointer" (or equivalent) address and a notion of end-of-program marker and have the program thereby easily traverse its own definition, assuming of course that the machine is supported by a "stored program" memory architecture; that last assumption trivially holds on Turing-style machines.

=Rumi

Aryeh said...

I agree that there are all sorts of ways of making the recursion theorem easy. I think what I have in the post body is possibly THE simplest -- I just don't know if it's 100% legit.

Consider, on the other hand, the following "fully honest" MATLAB quine:

function quine
QSTR = char([81 123 36 125 32 61 32 39 42 39 59]);
Q{1} = 'function quine';
Q{2} = 'QSTR = char([81 123 36 125 32 61 32 39 42 39 59]);';
Q{3} = 'disp(Q{1})';
Q{4} = 'disp(Q{2})';
Q{5} = 'for j=1:14';
Q{6} = ' qjstr = QSTR;';
Q{7} = ' qjstr = strrep(qjstr,char(36),num2str(j));';
Q{8} = ' qjstr = strrep(qjstr,char(42),Q{j});';
Q{9} = ' disp(qjstr)';
Q{10} = 'end';
Q{11} = 'for j=3:14';
Q{12} = ' disp(Q{j})';
Q{13} = 'end';
Q{14} = 'return';
disp(Q{1})
disp(Q{2})
for j=1:14
qjstr = QSTR;
qjstr = strrep(qjstr,char(36),num2str(j));
qjstr = strrep(qjstr,char(42),Q{j});
disp(qjstr)
end
for j=3:14
disp(Q{j})
end
return


It was quite a bit trickier to contrive than my "cheating" quine -- and I am wondering if that hard work was really necessary.

mizhi said...

Some consider programs that use file i/o to input their source code to be cheating quines. I don't think it's been rigorously defined out of quine-ness, so it may just be a matter of opinion.

mizhi said...
This comment has been removed by the author.
mizhi said...

By the way, how's fatherhood treating you so far? I don't remember much about the first few weeks except for a blur of diapers, feedings, and sleep deprivation.

Aryeh said...

Thank G-d, it's a blessing... and yes, sleep has become a thing of the past.

Joram said...

I've found the following MATLAB quine in another blog:

a=['ns}z2e1kGe1116k6111gE16;:6ek7;:61gg3E1'];
disp(['a=[''',a,'''];',10,[a-10,']]);']]);

http://blog.garritys.org/2009/10/quines.html

Unknown said...

"On the one hand, you can do this in just about any programming language -- and any Turing machine T can assume it's being simulated on some universal Turing machine U and move U's tape head to the beginning of T's description."

This is generally not possible. When U simulates T, it provides a simulated tape for T (using a portion of U's tape), and translates all T's legal actions, modifying the simulated tape accordingly. The description of T is kept in a different part of U's tape, which T has no way to access.

In practice, programs in most languages could access their own source code if they know where it is (and some languages provide built-in ways to find it), but if a language has a compiler, people expect compiled executables to run correctly even if the source code is stored elsewhere (or even deleted).

Basically, a program's ability to access its own source code directly depends on quirks of the language and environment, and assuming it in proving general results about computability will weaken those results unnecessarily.