Introducing Psychometric AI v2 4.6.02 -- not for circulation

Selmer Bringsjord & Bettina Schimanski & …?
Department of Cognitive Science, Department of Computer Science, RPI, Troy NY 12180
Selmer would like to express his deep gratitude to ETS, because without its support of the eWriter Project, the new form of AI proposed and described herein might well never have occurred to him.

Roots of this R&D…

Seeking to Impact a # of Fields
This work weaves together relevant parts of:
• Artificial Intelligence: Build machine agents to “crack” and create tests.
• Psychology: Use experimental methods to uncover the nature of the human reasoning used to solve test items.
• Philosophy: Address fundamental “big” questions, e.g., What is intelligence? Would a machine able to excel on certain tests be brilliant?…
• Education: Discover the nature of tests used to make decisions about how students are taught what, when.
• Linguistics: Reduce reasoning in natural language to computation.
Many applications!

The Primacy of Psychology of Reasoning

There is consensus among the relevant luminaries in AI, theorem proving, psychology of reasoning, and cognitive modeling that machine reasoning stands to the best of human reasoning as a rodent stands to the likes of Kurt Gödel. In the summer before Herb Simon died, in a presentation at CMU, he essentially acknowledged this fact -- and set out to change the situation by building a machine reasoner with the power of first-rate human reasoners (e.g., professional logicians). Unfortunately, Simon passed away.

Now, the only way to fight toward his dream (which of course many others before him expressed) is to affirm the primacy of psychology of reasoning. Otherwise we will end up building systems that are anemic. The fact is that first-rate human reasoners use techniques that haven't found their way into machine systems. E.g., humans use extremely complicated, temporally extended mental images and associated emotions to reason. No machine, no theorem prover, no cognitive architecture, uses such a thing.

The situation is different from chess -- radically so. In chess, we knew that brute force could eventually beat humans. In reasoning, brute force shows no signs of exceeding human reasoning. Therefore, unlike the case of chess, in reasoning we are going to have to stay with the attempt to understand and replicate in machine terms what the best human reasoners do.

We submit that a machine able to prove that the key in an LR/RC problem is the key, and that the other options are incorrect, is an excellent point to aim for, perhaps the best that there is. As a starting place, we can turn to simpler tests.

Multi-Agent Reasoning, modeled in Mental Metalogic, is the key to reaching Simon’s Dream! A pilot experiment shows that groups of reasoners instantly surmount the errors known to plague individual reasoners!

Come Wed 2.27.02 12n SA3205 “Chess is Too Easy”

What is Psychometric AI?

An Answer to: What is AI? A New Kind of AI
• Assume the ‘A’ part isn’t the problem: we know what an artifact is.
• Psychometric AI offers a simple but radical answer:
• Psychometric AI is the field devoted to building information-processing entities (some of which will be robots) capable of at least solid performance on all established, validated tests of intelligence and mental ability, a class of tests that includes IQ tests, tests of reasoning, of creativity, mechanical ability, and so on.
• Don’t confuse this with: “Some human is intelligent…”
• Psychologists don’t agree on what human intelligence is.
• Two notorious conferences. See The g Factor.
• But we can agree that one great success story of psychology is testing, and prediction on the basis of it. (The Big Test)
AI is the field devoted to building intelligent artificial agents, i.e., agents capable of solid performance on intelligence tests. Therefore…

Some of the tests…

Intelligence Tests: Narrow vs. Broad Thurstone’s view of intelligence Spearman’s view of intelligence

Let’s look @ RPM (AI-based replication of Carpenter et al.) (Sample 1)

RPM Sample 2

RPM Sample 3

Artificial Agent to Crack RPM
---------------- PROOF ----------------
1 [] a33!=a31.
3 [] -R3(x)| -T(x)|x=y| -R3(y)| -T(y).
16 [] R3(a31).
24 [] T(a31).
30 [] R3(a33).
31 [] T(a33).
122 [hyper,31,3,16,24,30,flip.1] a33=a31.
124 [binary,122.1,1.1] $F.
------------ end of proof -------------
----------- times (seconds) -----------
user CPU time 0.62 (0 hr, 0 min, 0 sec)

Artificial Agent to Crack RPM
---------------- PROOF ----------------
1 [] a33!=a31.
7 [] -R3(x)| -StripedBar(x)|x=y| -R3(y)| -StripedBar(y).
16 [] R3(a31).
25 [] StripedBar(a31).
30 [] R3(a33).
32 [] StripedBar(a33).
128 [hyper,32,7,16,25,30,flip.1] a33=a31.
130 [binary,128.1,1.1] $F.
------------ end of proof -------------
----------- times (seconds) -----------
user CPU time 0.17 (0 hr, 0 min, 0 sec)
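Both refutations above turn on the same kind of clause: within row 3, at most one cell may carry a given feature (T in one proof, StripedBar in the other), so two distinct cells that both carry it yield a contradiction. A minimal Python sketch of that check follows. It is an illustration only, not the prover's method: the function name `uniqueness_violations` and the dictionary layout are hypothetical, and treating differently named cells as distinct is an assumption standing in for clause 1 (a33!=a31).

```python
from itertools import combinations


def uniqueness_violations(row, has_feature):
    """Find pairs of distinct cells in `row` that share a feature.

    row:         list of cell names in one row, e.g. ["a31", "a32", "a33"]
    has_feature: dict mapping a feature name to the set of cells that have it
    Assumption: different names denote different cells (cf. a33!=a31).
    Returns a list of (feature, cell_a, cell_b) witnesses; an empty list
    means the "at most one cell per row has this feature" rule holds.
    """
    violations = []
    for feature, cells in sorted(has_feature.items()):
        holders = [c for c in row if c in cells]
        for a, b in combinations(holders, 2):
            violations.append((feature, a, b))
    return violations


# Facts taken from the two proofs: a31 and a33 both have T and StripedBar.
row3 = ["a31", "a32", "a33"]
facts = {"T": {"a31", "a33"}, "StripedBar": {"a31", "a33"}}
print(uniqueness_violations(row3, facts))
```

Each reported triple corresponds to one of the contradictions the prover derives; a set of facts with no violations would not be refuted by these clauses.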

Artificial Agent to Crack RPM
=========== start of search ===========
given clause #1: (wt=2) 10 [] R1(a11).
given clause #2: (wt=2) 11 [] R1(a12).
given clause #3: (wt=2) 12 [] R1(a13).
...
given clause #4: (wt=2) 13 [] R2(a21).
given clause #278: (wt=16) 287 [para_into,64.3.1,3.3.1] R2(x)| -R3(a23)| -EmptyBar(y)| -R3(x)| -EmptyBar(x)| -T(a23)| -R3(y)| -T(y).
given clause #279: (wt=16) 288 [para_into,65.3.1,8.3.1] R2(x)| -R3(a23)| -StripedBar(y)| -R3(x)| -StripedBar(x)| -EmptyBar(a23)| -R3(y)| -EmptyBar(y).
Search stopped by max_seconds option.
============ end of search ============
Correct!

Possible Objection “If one were offered a machine purported to be intelligent, what would be an appropriate method of evaluating this claim? The most obvious approach might be to give the machine an IQ test … However, [good performance on tasks seen in IQ tests would not] be completely satisfactory because the machine would have to be specially prepared for any specific task that it was asked to perform. The task could not be described to the machine in a normal conversation (verbal or written) if the specific nature of the task was not already programmed into the machine. Such considerations led many people to believe that the ability to communicate freely using some form of natural language is an essential attribute of an intelligent entity.” (Fischler & Firschein 1990, p. 12)

WAIS: A Broad Intelligence Test…

Cube Assembly Basic Setup Problem: Solution:

Harder Cube Assembly Basic Setup Problem: Solution: The robot in Selmer’s lab that will be able to excel on the WAIS and other tests. We don’t yet have a name for our artificial master of tests. MIT has COG. What should the name be? Suggestions are welcome! Send to selmer@rpi.edu.

Picture Completion

Picture Completion
Currently untouchable by AI -- but we shall see.

And ETS’ tests…

The “Lobster”
Lobsters usually develop one smaller, cutter claw and one larger, crusher claw. To show that exercise determines which claw becomes the crusher, researchers placed young lobsters in tanks and repeatedly prompted them to grab a probe with one claw – in each case always the same, randomly selected claw. In most of the lobsters the grabbing claw became the crusher. But in a second, similar experiment, when lobsters were prompted to use both claws equally for grabbing, most matured with two cutter claws, even though each claw was exercised as much as the grabbing claws had been in the first experiment.
Which of the following is best supported by the information above?
A Young lobsters usually exercise one claw more than the other.
B Most lobsters raised in captivity will not develop a crusher claw.
C Exercise is not a determining factor in the development of crusher claws in lobsters.
D Cutter claws are more effective for grabbing than are crusher claws.
E Young lobsters that do not exercise either claw will nevertheless usually develop one crusher and one cutter claw.

Same Approach Used
---------------- PROOF ----------------
1 [] -Lobster(x)|Cutter(r(x)).
3 [] -Lobster(x)| -Exercise(r(x))| -Exercise(l(x))|Cutter(l(x)).
4 [] -Lobster(x)| -Cutter(r(x))| -Cutter(l(x)).
5 [] Lobster($c1).
6 [] Exercise(r($c1)).
7 [] Exercise(l($c1)).
9 [hyper,5,1] Cutter(r($c1)).
10 [hyper,7,3,5,6] Cutter(l($c1)).
11 [hyper,10,4,5,9] $F.
------------ end of proof -------------
----------- times (seconds) -----------
user CPU time 0.38 (0 hr, 0 min, 0 sec)
Therefore option A is correct!
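The refutation above can be replayed by hand with a very small propositional resolution loop, once every clause is instantiated with the single constant $c1 (written c1 below). The following Python sketch is offered under stated assumptions: it uses plain binary resolution run to saturation rather than the prover's hyperresolution, and the names `negate`, `resolvents`, and `refutes` are hypothetical helpers introduced only for illustration.

```python
from itertools import combinations


def negate(lit):
    """Flip the sign of a literal written as 'P' or '-P'."""
    return lit[1:] if lit.startswith("-") else "-" + lit


def resolvents(c1, c2):
    """All non-tautological binary resolvents of two ground clauses."""
    out = set()
    for lit in c1:
        if negate(lit) in c2:
            r = (c1 - {lit}) | (c2 - {negate(lit)})
            if not any(negate(x) in r for x in r):
                out.add(frozenset(r))
    return out


def refutes(clauses):
    """Saturate under resolution; True iff the empty clause is derived."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for a, b in combinations(known, 2):
            for r in resolvents(a, b):
                if not r:
                    return True  # empty clause: the set is unsatisfiable
                new.add(r)
        if new <= known:
            return False  # nothing new can be derived
        known |= new


# Ground instances (x := c1) of clauses 1, 3, 4, 5, 6, 7 from the proof.
lobster = [
    {"-Lobster(c1)", "Cutter(r(c1))"},
    {"-Lobster(c1)", "-Exercise(r(c1))", "-Exercise(l(c1))", "Cutter(l(c1))"},
    {"-Lobster(c1)", "-Cutter(r(c1))", "-Cutter(l(c1))"},
    {"Lobster(c1)"},
    {"Exercise(r(c1))"},
    {"Exercise(l(c1))"},
]
print(refutes(lobster))
```

Dropping the third clause (the one numbered 4 in the proof) leaves a satisfiable set, so `refutes` then returns False; this is one quick way to see which clause carries the contradiction.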

Underlying Math …explained by hand as it’s a bit intricate…

More Careful Look
in : $\forall x\,(L(x) \rightarrow ((C(l(x)) \land R(r(x))) \lor (C(r(x)) \land R(l(x)))))$
in : $\forall x\,(L(x) \rightarrow ((C(l(x)) \land R(r(x)) \land L(r(x),l(x))) \lor (C(r(x)) \land R(l(x)) \land L(l(x),r(x)))))$

Comments on RC Items…
Many critics of Emily Brontë’s novel Wuthering Heights see its second part as a counterpoint that comments on, if it does not reverse, the first part, where a “romantic” reading receives more confirmation. Seeing the two parts as a whole is encouraged by the novel’s sophisticated structure, revealed in its complex use of narrators and time shifts. Granted that the presence of these elements need not argue an authorial awareness of novelistic construction comparable to that of Henry James, their presence does encourage attempts to unify the novel’s heterogeneous parts. However, any interpretation that seeks to unify all of the novel’s diverse elements is bound to be somewhat unconvincing. This is not because such an interpretation necessarily stiffens into a thesis (although rigidity in an interpretation of this or of any novel is always a danger), but because Wuthering Heights has recalcitrant elements of undeniable power that, ultimately, resist inclusion in an all-encompassing interpretation. In this respect, Wuthering Heights shares a feature of Hamlet.

“Wuthering Heights”…
Many critics of Emily Brontë’s novel Wuthering Heights see its second part as a counterpoint that comments on, if it does not reverse, the first part, where a “romantic” reading receives more confirmation. Seeing the two parts as a whole is encouraged by the novel’s sophisticated structure, revealed in its complex use of narrators and time shifts. Granted that the presence of these elements need not argue an authorial awareness of novelistic construction comparable to that of Henry James, their presence does encourage attempts to unify the novel’s heterogeneous parts. However, any interpretation that seeks to unify all of the novel’s diverse elements is bound to be somewhat unconvincing. This is not because such an interpretation necessarily stiffens into a thesis (although rigidity in an interpretation of this or of any novel is always a danger), but because Wuthering Heights has recalcitrant elements of undeniable power that, ultimately, resist inclusion in an all-encompassing interpretation. In this respect, Wuthering Heights shares a feature of Hamlet.
According to the passage, which of the following is a true statement about the first and second parts of Wuthering Heights?
The second part has received more attention from critics. . . .

Additional Objections…

PAI is too idiosyncratic!
• Actually, PAI can be viewed as a generalization of the Turing Test-based answer to “What is AI?”
• AI is the field devoted to building artificial agents capable of passing the Turing Test. (As affirmed in a number of texts.)
• PAI has the major advantage of requiring high performance on many different tests, all, unlike TT, grounded in psychology.

But your applications are only in Testing!
• No. An agent able to perform well on all these tests can do everything and then some.
• If we believe that psychology has, through tests, isolated, in gem-like fashion, what’s most important in cognition, then powerful agents in PAI will be powerful agents, period.

Psychometric AI in Context…

A Classic “Cognitive System” Setup (Under Development)
[Diagram labels: Test Item; “percept”; Cognitive System; “action”: choice of correct option, and ruling out of others, and… actions that involve physical manipulation of objects and locomotion.]

Fits forthcoming Superminds book by Bringsjord & Zenzen… • “Weak” AI based on testing going back to Turing is implied for the practice of AI.

Fits “Complete” CogSci…

Perception and Action
[Diagram labels: Low-level; High-level; Perception; subdeclarative computation; Environment; Action; Cognitive System]

Cognitive Modeling
[Diagram labels: Low-level; High-level; Perception; subdeclarative computation; Short Term Memory; Environment; ACT-R; Long Term Memory; Perception & Action; Action; Cognitive System]

Reasoning (Bringsjord)
[Diagram labels: Low-level; High-level; Perception; subdeclarative computation; Short Term Memory; Syntactic Reasoning; Mental Metalogic; Environment; ACT-R; Long Term Memory; Semantic Reasoning; Perception & Action; Action; Cognitive System]

Cognitive Human Factors: Engineering the Interface b/t Cognitive Systems and their Environments
[Diagram labels: Low-level; High-level; Perception; subdeclarative computation; Short Term Memory; Syntactic Reasoning; Mental Metalogic; Environment; ACT-R; Long Term Memory; Semantic Reasoning; Perception & Action; Action; Cognitive System]

Large Variation in Difficulty

Evans’ ANALOGY Program
