An Algorithmic and Social Introduction to Computer Science (CSC-105 2000S)

Do you consider the Turing test a reasonable way to evaluate the success of an AI program? Why or why not?


Without a doubt the Turing test provides an indication of the success of an AI program. However, in my opinion, a successful program does not constitute an "intelligent" one. In attempting to determine whether a program possesses or is capable of exhibiting "intelligence," the Turing test is neither the best test nor the only one.

I do not believe that the Turing test would provide consistent results. Different people naturally think differently. However, in theory, I suppose that if the AI program is good enough, then it would be able to convince anyone. But does that mean that they will be convinced that it has a mind similar to their own?

With all that I have said, I believe that the Turing test must be a reasonable way of evaluating AI programs. If it weren't, it probably wouldn't be in use. Of course, I think it is not the best test, but I'm guessing that there have not been any other ideas for a better test that would make the Turing test obscure or obsolete.


I think that the Turing test covers what most people seem to think of AI as: "machines that think/act 'like people.'" We might disagree with this conception of AI, but at least it's a nice clean way of defining and evaluating programs that claim to be AI. In other words, while we might have philosophical (what is "like people"? what is it to "think"? etc.), moral (could AI eventually make humans obsolete? what are the implications of being "human-centric" on this issue? etc.), and scientific (perhaps we think that computers will never be able to do this, or that things computers can already do are 'intelligent,' or wonder how specifically the test's parameters could be fulfilled: "convince what human? for how long?" etc.) issues with the assumptions that ground the Turing Test, it seems to be one of the few concrete ways that have been proposed so far to evaluate the success of AI programs. We could argue forever about what intelligence is, but if we think AI is possible, we at least need some base definition and parameters for success and failure.


Well, I do plan on going back and rereading the part about the Turing test in our book, as I remember it being somewhat different from what you had originally described in class (though you did mention that there had been several variations from the original), but from what I remember right now, it's hard to say. On the one hand, since I don't believe it's possible to define a program as "intelligent," given all the problems with that term, I guess I have a problem with the mere premise. However, if I were to disregard that and focus on testing for this presumed intelligence, I think it's fair to say that the Turing test is "reasonable." I guess I would be fairly impressed if, from our interactions, through a wall or whatever barrier, a computer could convince me it was human, especially because that doctor program we played with would clearly fail. A program that could overcome all the problems doctor had would be undoubtedly impressive, though I don't know that I'd call it intelligent...


I asked a computer science/math major last night what the Turing Test was, and he told me that, basically, if the computer can convince a human in some form of interaction or communication that it is another person and not a machine, then the computer (neural network system or genetic programming) would be said to be "intelligent." I understand that this was a part of the readings, but I never really was sure of anything highlighted as the Turing Test from the readings. At any rate, if this definition is accurate, I would say that it has two flaws: 1. There are a number of "intelligences" which really have qualities a computer can't measure or understand because of their abstractness, so I think that it would be nearly impossible for me to be fooled by any neural network into believing that the computer thinks "just like a human." 2. I don't think that you can reliably test the intelligence of a program with subjective interactions with the computer; we need more standard definitions of what intelligent computers, or thinking computers, are actually composed of, and therefore of the test's parameters. In a general sense, I think the Turing Test does give us some indication of a computer's relative intelligence, or rather, ability to interact intelligently with human life, but I am not sure it can be labeled a benchmark in the achievement of AI's contemporary fruition.


Dewdney describes the universal Turing machine as having a fixed program permanently embedded in its finite control unit, which mimics the action of an arbitrary Turing machine by reading that machine's program on one tape and simulating its behavior on another. I don't know that this kind of mimicry and imitation is really indicative of intelligence. To some extent, certainly, but when I think about an intelligent machine, I think of a machine that has somewhat more independence in thought and action than merely being able to copy given behaviors. While the Turing test may be a "reasonable" way to evaluate the success of an AI program, it seems like a rather simplistic and narrow way in which to do so.
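The idea of one fixed program that imitates any Turing machine from its description can be sketched in a few lines. This is only an illustration under stated assumptions, not Dewdney's construction: the function name `run_turing_machine`, the rule-table format, and the example machine `FLIP_BITS` are all hypothetical, and the simulator is an ordinary Python interpreter loop rather than a Turing machine itself.

```python
from collections import defaultdict

def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Simulate a one-tape Turing machine given as a rule table.

    rules maps (state, symbol) -> (new_symbol, move, new_state),
    where move is -1 (left) or +1 (right). The machine halts when
    no rule applies to the current (state, symbol) pair.
    """
    # The tape is unbounded in both directions; unseen cells read as blank.
    cells = defaultdict(lambda: blank, enumerate(tape))
    head = 0
    for _ in range(max_steps):
        key = (state, cells[head])
        if key not in rules:
            break  # no applicable rule: halt
        new_symbol, move, state = rules[key]
        cells[head] = new_symbol
        head += move
    # Return the final state and the non-blank portion of the tape.
    used = [i for i, s in cells.items() if s != blank]
    if not used:
        return state, ""
    return state, "".join(cells[i] for i in range(min(used), max(used) + 1))

# Hypothetical example machine: flip every bit, halt at the first blank.
FLIP_BITS = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
}

print(run_turing_machine(FLIP_BITS, "1011"))
```

The simulator never changes; only the rule table passed to it does. That is the sense in which the "mimicry" here is mechanical: the fixed loop copies whatever behavior the table describes.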


Before analysis of the Turing test can begin, a definition of the Turing test is required. Turing describes the test as follows:

It is played with three people, a man (A), a woman (B), and an interrogator (C) who may be of either sex. The interrogator stays in a room apart from the other two. The object of the game for the interrogator is to determine which of the two is the man and which is the woman. He knows them by labels X and Y, and at the end of the game he says either "X is A and Y is B" or "X is B and Y is A." The interrogator is allowed to put questions to A and B . . . We now ask the question, "What will happen when a machine takes the part of A in this game?" Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman? These questions replace our original "Can machines think?"

The original Turing test is quite simple. If the computer can mimic human behavior then the computer is intelligent. Many programs are able to do this for a short while. However, are these programs intelligent? Probably not. Replication of human behavior and intelligence need not be correlated. First, not all human behavior is intelligent. Second, replication is not intelligence; a parrot is able to speak, and speaking replicates human behavior. Is a parrot an intelligent being? What happens if you ask the parrot a question? The answer is probably not particularly responsive to the question. Furthermore, (most) programs are limited to the parameters determined by the programmers; thus, programs are dependent on human knowledge because programs derive knowledge from human knowledge. There might be a jump in logic here. First, can knowledge be dependent, and, second, can a knowledgeable entity create another knowledgeable entity? Third, to what degree must the computer replicate human behavior? The ELIZA program is able to ask quasi-intelligent questions (sometimes responsive questions); however, ELIZA is unable to sustain human-like behavior in an extended dialogue.
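To make the point about ELIZA concrete, here is a minimal sketch of keyword-driven reflection in the general style of such programs. The rules below are hypothetical examples and are not Weizenbaum's original script; the names `RULES`, `DEFAULT_REPLY`, and `respond` are assumptions made for illustration.

```python
import re

# Hypothetical keyword rules; not Weizenbaum's original script.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."

def respond(utterance):
    """Return a canned reply built from the first rule that matches."""
    # Drop surrounding whitespace and trailing punctuation before matching.
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Echo the captured fragment back inside the template.
            return template.format(match.group(1))
    # Nothing matched: fall back to a content-free prompt.
    return DEFAULT_REPLY

print(respond("I am tired of exams."))
print(respond("What is the capital of France?"))
```

A sketch like this can seem responsive when a keyword happens to match, but any question outside its rules gets the same fallback, which is one way an extended dialogue exposes the program.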

Furthermore, I came across a set of criteria for intelligence. Jean Lassegue sets out the criteria for intelligence as follows:

If we assume that the test is feasible (referring to the Turing test), the continuum of organisms on the biological scale would then be enriched with a new kind of organism which could be specified by two characteristics: first, this new organism would be completely independent of its human creator and behave in its own particular way; secondly, its organic form itself would not depend on any particular material components but only on a purely logical structure.

If a program were able to meet both of the criteria that Lassegue sets out, then, in my opinion, the program would be intelligent.


The Turing Test, as I understand it, is to have a human distinguish between another human and a computer (both are in another room or something) by speaking with them through the keyboard. Honestly, I think this is a reasonable test for AI, at least as a preliminary evaluation of the system. Time is a big factor though: I think that how long it can go without the person knowing which one is the computer would be important, because I think eventually the examiner would find out (it just depends on what questions he/she asks!). I also like the "modified Turing Test setup" as shown on which involves a human OR a computer responding to the examiner's questions (instead of a human AND a computer).

While AI has made great strides in the past fifty years, and I think that the Turing test could be a vital tool in measuring its continuing success, I think Turing himself expected too much. He predicted that "in about fifty years' time [by the year 2000] it will be possible to program computers ... to make them play the imitation game [or Turing Test] so well that an average interrogator will have no more than 70 per cent. chance of making the correct identification after five minutes of questioning." (Turing 1950, p.442). Then again, Turing also predicted that "at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted." (Turing 1950, p.452) And we all know this is not the case. Regardless of what Turing predicted, though, I think the Turing Test is a very valuable tool in measuring the success of AI in the years to come.
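Turing's quoted threshold can be written as a simple check. This is a sketch under assumptions: the function name and the session counts are hypothetical, and it treats "no more than 70 per cent. chance" as the observed fraction of correct identifications over a set of five-minute sessions.

```python
def meets_turing_1950_threshold(correct_identifications, total_sessions,
                                threshold_percent=70):
    """Return True if interrogators were right in no more than
    threshold_percent of the sessions (integer arithmetic only)."""
    if total_sessions <= 0:
        raise ValueError("total_sessions must be positive")
    if not 0 <= correct_identifications <= total_sessions:
        raise ValueError("correct_identifications out of range")
    # correct / total <= threshold / 100, rearranged to avoid division.
    return correct_identifications * 100 <= total_sessions * threshold_percent

# Hypothetical numbers: interrogators right in 14 of 20 sessions (70%).
print(meets_turing_1950_threshold(14, 20))
# Hypothetical numbers: interrogators right in 19 of 20 sessions (95%).
print(meets_turing_1950_threshold(19, 20))
```

Note how modest the bar is: a program meets it even if interrogators identify it correctly seven times out of ten.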


I suppose another way of thinking of this question goes something like: if a computer can convince someone it is human, does that make it intelligent? I played with the Eliza/Doctor program longer than just about anyone in the class, but because I thought it was stupid. I did take it seriously until its answers became less intelligent. It may not have convinced me there was a human on the other side. F&M claim that it has convinced people in the past, but I suspect those people knew it was a computer program and qualified their responses accordingly.

However, the point doesn't lie with the program or with Weizenbaum's programming. The point is that I can play a video game that acts like a human but isn't really being intelligent. It is very easy to mimic human behavior, even in a program, and I think that a program (or machine or whatever) has to be more than just a mimic to be intelligent. It has to go beyond that. My definition of intelligence, the ability to learn and adapt to a new situation and come up with entirely new ways of solving a problem, is almost caught by neural networks, but not quite. Neural networks appear to be along the road to where I think AI will eventually be. But Turing's test is too easy to get past, too easy for a cunning programmer to mimic behavior without actually making the computer intelligent. I think the test has to be a little more complicated. Could I derive a proper test? Maybe I could, maybe I couldn't, but I think Turing needs a more definitive test than just "act human."


I don't think the Turing test is a good one, because I think a machine needs to do more than seem human-like. While I agree with the point F&M made about raising the bar of what defines intelligence every time a computer reaches that point, I think there is a difference between imitating a human and thinking like one. There have been times I have called friends and thought their answering machine was a person. I would not say that machine was intelligent; it just resembled a human enough to confuse me. To me, intelligence needs to show signs of a thought process, independent learning and innovation, creativity, and personality. I don't know if we ever will come up with that in a computer. Maybe AI is just another metaphor for the mind, to be replaced with a new technology and laughed at in a hundred years.


If the success of an AI program is based on the establishment of formal intelligence, I have to think that the Turing test is not a completely reasonable evaluation. This is mainly based on the fact that programs can have tricks that are more conducive to passing the Turing test, so that they can score well without really being "intelligent." One point that I have come across that I'll introduce in discussion tomorrow is the idea that intelligent and thoughtful conversation must have the capacity for arguing. This standard would disqualify programs like ELIZA from being contemplated as "intelligent." These points will definitely be explored further tomorrow.

Disclaimer: Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.

Source text last modified Wed Feb 16 08:16:09 2000.

This page generated on Fri Apr 21 09:44:21 2000 by Siteweaver.
