Taxonomy/Classification

Odud on 2002-09-24T14:01:15

I've been looking for Perl scripts/modules that will help with developing a taxonomy/classification system. Basically I want to be able to ask n questions and then based on the answer select a further m questions to ask etc. etc. until the end of the tree is reached.

The application I have in mind is identification of typefaces - but I don't think that is particularly relevant.

For UK readers with long memories there was a program for the BBC Model B that "understood" animals and would do a reasonable job of "guessing" what animal you were thinking of based on simple questions about number of legs/wings/horns etc. It also could be configured to add new animals - provided that you could give a distinguishing question. At the moment this is the sort of complexity I need.

Searching CPAN (keywords: taxonomy, classification) hasn't yielded anything that looks relevant - for the first time ever in my experience! - is this a wheel that hasn't been invented yet?

Can anybody point me in the right direction?

Update: The answer is probably Clips with a Perl front end to make the question/answer part easy.


Animals program

jdavidb on 2002-09-24T14:12:15

I remember the exact same program for the Apple II in BASIC. I never understood how it worked. (Hey, I was in fourth grade! :) )

In fact, I thought of that program before you even wrote that paragraph.

I always wanted to have a program that worked like that but had the ability to "forget." I tried to make that work in a similar (but simpler) BASIC program in eighth grade. Too bad I didn't have Perl and Tie::Hash::Cannabinol then. :)

Re:Animals program

Odud on 2002-09-24T16:28:57

Found this while Googling. It's Prolog and doesn't learn.

Re:Animals program

wferraz on 2006-05-09T13:22:00

Here is the original Animals Integer Basic program source code:

0 GOTO 1000
1 REM *************************
2 REM * *
3 REM * ANIMALS: *
4 REM * COPYRIGHT 1978 BY *
5 REM * APPLE COMPUTER INC. *
6 REM * *
7 REM *************************
100 CUR = 1 : REM HERE IS WHERE PROGRAM IS PLAYED
110 PRINT OP$
120 PRINT RF$;CUR
130 INPUT NEW$ : IF NEW$(1,1) = "!" THEN 500 : INPUT RTPTR : INPUT WRNGPTR
140 REM NOW HAVE QUESTION TO ASK. RTPTR POINTS AT RECORD TO GO TO IF HE ANSWERS YES, WRNGPTR IF HE SAYS NO.
150 PRINT CL$
160 PRINT : PRINT NEW$; : INPUT "?",A$ : PREV = CUR : IF NOTLEN(A$) THEN 160 : A$ = A$(1,1) : IF A$ # "Y" AND A$ # "N" THEN 160
170 IF A$ = "Y" THEN CUR = RTPTR : IF A$ = "N" THEN CUR = WRNGPTR : GOTO 110
500 REM HAVE REACHED AN ANIMAL ENTRY. IS IT THE RIGHT ONE?
510 PRINT CL$ : AN$ = "AEIOU" : PRINT "IS THE ANIMAL YOU'RE THINKING OF A"; : FOR VWL = 1 TO 5
520 IF AN$(VWL,VWL) = NEW$(2,2) THEN PRINT "N"; : NEXTVWL : PRINT : PRINT NEW$(2); : INPUT "?",AN$ : IF NOTLEN(AN$) THEN 510
530 AN$ = AN$(1,1) : IF AN$ # "Y" AND AN$ # "N" THEN 510 : IF AN$ = "Y" THEN 800
540 REM I DON'T KNOW WHAT ANIMAL IT IS, ASK HIM
550 PRINT : PRINT : PRINT "ALL RIGHT, I GIVE UP. WHAT ANIMAL WERE" : PRINT "YOU THINKING OF?";
560 ANIM$ = "!" : INPUT ANIM$(2)
570 PRINT : PRINT "PLEASE TYPE A QUESTION THAT WOULD" : PRINT "DISTINGUISH BETWEEN A ";NEW$(2) : PRINT "AND A ";ANIM$(2);
580 INPUT ":",Q$ : IF Q$(LEN(Q$)) = "?" THEN Q$ = Q$(1,LEN(Q$)-1)
590 PRINT : PRINT "WHAT WOULD THE CORRECT ANSWER FOR A" : PRINT ANIM$(2);" BE?"; : INPUT AN$ : AN$ = AN$(1,1) : IF AN$ # "Y" AND AN$ # "N" THEN 590
600 PRINT OP$
610 PRINT RF$;0
620 REM UPDATE HEADER RECORD.
630 INPUT CNT,LAST : PRINT WF$;0
640 CNT = CNT+1 : LAST = LAST+2 : PRINT CNT;",";LAST : PRINT RF$;PREV
650 INPUT NEW$ : INPUT OLDR : INPUT OLDW : PRINT WF$;PREV
660 PRINT NEW$ : IF A$ = "N" THEN 670 : PRINT LAST-1 : PRINT OLDW : GOTO 680
670 PRINT OLDR : PRINT LAST-1
680 PRINT WF$;LAST-1
690 PRINT Q$ : IF AN$ = "Y" THEN 700 : PRINT CUR : PRINT LAST : GOTO 710
700 PRINT LAST : PRINT CUR
710 PRINT WF$;LAST
720 PRINT ANIM$ : PRINT CL$
730 PRINT "I NOW KNOW ";CNT;" ANIMALS!" : GOTO 830
800 REM I GOT IT RIGHT!!!
810 FOR L = 1 TO 10 : PRINT : NEXTL
820 PRINT "I GOT IT RIGHT!!!"
830 PRINT : INPUT "WANT TO PLAY AGAIN? ",A$ : IF LEN(A$) THEN IF A$(1,1) = "N" THEN END : GOTO 100
1000 REM ***************************
1001 REM * *
1002 REM * ANIMAL: *
1003 REM * PROGRAM THATS LEARNS *
1004 REM * ABOUT VARIOUS ANIMALS *
1005 REM * FROM THE USER. *
1006 REM * *
1007 REM * SYSTEM REQUIREMENTS: *
1008 REM * 16K MEMORY, DISK II *
1009 REM * *
1010 REM * R. WIGGINTON *
1011 REM * 06/01/78 *
1012 REM * *
1013 REM * UPDATED: 1/12/79 *
1014 REM ***************************
1020 PRINT "NOMON I,O,C"
1030 TEXT : CALL -936 : DIM NEW$(70),ANIM$(70),Q$(70),A$(70),RF$(20),WF$(20),AN$(10),CL$(20),OP$(40)
1040 RF$ = "READ ANIMALSFILE,R" : WF$ = "WRITE ANIMALSFILE,R" : CL$ = "CLOSE" : OP$ = "OPEN ANIMALSFILE,L80" : PRINT OP$
1041 REM THE ABOVE STRINGS WERE SET SO THAT WE ONLY NEED TO PRINT A STRING RATHER THAN TYPE THE WHOLE OPERATION. EXAMPLE:
1042 REM RATHER THAN PRINT D$;"READ ANIMALSFILE,R" WE PRINT RF$. NOTE THAT THE ",R" IS FOR RANDOM ACCESS INTO FILES.
1045 PRINT "* WARNING: YOUR DISKETTE MAY NOT BE" : PRINT "WRITE PROTECTED IN ORDER TO PLAY THIS" : PRINT "GAME!"
1050 VTAB 5 : PRINT " **** ANIMAL LEARNING GAME ****" : PRINT : PRINT "HAVE YOU EVER PLAYED BEFORE?"; : INPUT A$ : IF LEN(A$) THEN IF A$(1,1) = "Y" THEN 100
1060 PRINT : PRINT "INSTRUCTIONS:" : PRINT : PRINT "YOU WILL THINK OF AN ANIMAL, AND I WILL" : PRINT "TRY TO GUESS WHAT ANIMAL YOU ARE"
1070 PRINT "THINKING OF. I WILL DO THIS BY ASKING" : PRINT "YOU A SERIES OF QUESTIONS ABOUT YOUR"
1080 PRINT "ANIMAL, TO WHICH YOU RESPOND EITHER" : PRINT "YES OR NO. (ACTUALLY, A SIMPLE 'Y' OR" : PRINT "'N' RESPONSE IS SUFFICIENT)"
1090 PRINT "IF I DO NOT GUESS WHAT YOUR ANIMAL IS," : PRINT "I WILL ASK YOU A FEW QUESTIONS SO THAT"
1100 PRINT "NEXT TIME I WILL KNOW WHAT YOUR ANIMAL" : PRINT "IS." : PRINT
1110 PRINT "DO YOU WANT ME TO START LEARNING, OR" : PRINT "HAVE YOU TAUGHT ME ANIMALS BEFORE?"
1111 PRINT "TYPE 'NEWFILE' FOR ME TO FORGET WHAT I" : PRINT "HAVE LEARNED UP TO NOW OR IF NO ONE HAS"
1112 INPUT "TAUGHT ME ANYTHING YET: ",A$ : IF A$ # "NEWFILE" THEN 100
1120 PRINT WF$;0
1130 PRINT "2,4"
1140 PRINT WF$;1
1150 PRINT "DOES IT LIVE IN THE WATER" : PRINT 2 : PRINT 3
1160 PRINT WF$;2
1170 PRINT "!FROG"
1180 PRINT WF$;3
1190 PRINT "!MOOSE"
1200 GOTO 100
2000 REM -----------------------------
2010 REM PROGRAM EXPLANATION
2020 REM -----------------------------
2030 REM
2040 REM BASICALLY, THIS PROGRAM
2050 REM ILLUSTRATES THE USE OF A
2060 REM BINARY TREE STRUCTURE. SINCE
2070 REM THE ANSWER TO A QUESTION MUST
2080 REM EITHER BE A YES OR NO, WE
2090 REM WILL TRAVERSE THE TREE EITHER
2100 REM LEFT OR RIGHT. IF WE DO NOT
2110 REM GUESS THE ANIMAL CORRECTLY,
2120 REM WE SIMPLY GET THE NEW ANIMAL
2130 REM AND A QUESTION TO DISTINGUISH
2140 REM IT FROM OTHER ANIMALS, AND
2150 REM UPDATE THE TREE STRUCTURE.
2160 REM
2170 REM ORIGINAL PROGRAM IDEA:
2180 REM WHO KNOWS?
2190 REM
2200 REM COPYRIGHT 1978 APPLE COMPUTER INC.

notes:
1) small squares at lines 730, 820 and 1045 are chr$(07), the apple's char for a beep sound.
2) small squares at lines 1020 and 1040 are chr$(04), used to prefix DOS commands.
3) At ftp://ftp.apple.asimov.net/pub/apple_II/ there are a lot of resources for Apple II emulation.

Carpe Diem

Expert System

Dom2 on 2002-09-24T14:29:25

I call that sort of thing an "Expert System", but that doesn't show up any hits on CPAN either... Sorry.

-Dom

here's my version of "animal"

merlyn on 2002-09-24T15:11:21

#!/usr/bin/perl -w
use strict;

use Data::Dumper;

my $info = "dog";

{
  try($info);
  redo if (yes("play again?"));
}
print "Bye!\n";
print Dumper($info);

sub try {
  my $this = $_[0];
  if (ref $this) {
    return try($this->{yes($this->{Question}) ? 'Yes' : 'No' });
  }
  if (yes("Is it a $this")) {
    print "I got it!\n";
    return 1;
  };
  print "no!?  What was it then? ";
  chomp(my $new = <STDIN>);
  print "And a question that distinguishes a $this from a $new would be? ";
  chomp(my $question = <STDIN>);
  my $yes = yes("And for a $new, the answer would be...");
  $_[0] = {
           Question => $question,
           Yes => $yes ? $new : $this,
           No => $yes ? $this : $new,
          };
  return 0;
}

sub yes {
  print "@_ (yes/no)?";
  <STDIN> =~ /^y/i;
}

Re:here's my version of "animal"

Odud on 2002-09-24T16:53:06

Thanks for that - I suspect it may be a good start towards what I'm looking for. I like the way it builds the tree inside $info. Although I'll probably need multiple choices.
 
There's a Prolog program here that looks a promising start as well.