CATALOGUE

The following program allows you to build up a catalogue of books, records, or telphone numbers. It illustrates how a computer can be used to store information, sort it, and retrieve it on command.

A collection of records of information on a computer is called a "database"; in the following examples we will assume that the records consist of names and telephone numbers. The program allows you to associate a telephone number with each name. You can then find out someone’s telephone number by typing their name, or as many letters of their name as are needed to identify them uniquely. As an example, assume we are entering the telephone numbers of five people. The symbol '>' is used to prefix a name to be entered:

>RUN
COMMAND?>DEWAR J
?123 2234
COMMAND?>SMITH P S
?0223 314341
COMMAND?>NORTH Q
?119 2389 x191
COMMAND?>BOND J
?456 7789
COMMAND?>WEST A
?145 3456

We can then ask for an alphabetical list of the whole database, using the "*" command:

COMMAND?*
BOND J	456 7789
DEWAR J              123 2234
NORTH Q	119 2389 x191
SMITH P S	0223 314341
WEST A	145 3456

Alternatively, we can ask for the telephone number of any person in the database:

COMMAND? NORT
NORTH Q	119 2389 x191
COMMAND?S
SMITH P S	0223 314341

COMMAND?DEWER NOT FOUND

An attempt to enter a name already entered will give a warning message:

COMMAND?>BOND J
BOND J ALREADY EXISTS

Program Operation

The simplest way to store the records in the computer would be as a straightforward list. However, it would then be necessary to search through the whole list every time a new record was entered, or a record was being searched for. It would also be difficult to produce an alphabetical list of records, without first sorting all the records into order, which would be very time-consuming.

The Catalogue program therefore holds the records in a more sophisticated structure, called a 'tree'. Associated with every record are two 'pointers', which can be set to point to other records, or can be marked as pointing to nothing. As new records are entered, a tree is built up. Names lying earlier in the alphabet are inserted on the left-hand side of the tree, and names later in the alphabet on the right-hand side of the tree.

To see how this works in practice, a tree is built up with the five names given above. The first record goes at the top of the tree, and its two pointers are set to zero, indicated here by 'x', to indicate that there are no further records below it:

DEWAR
/   \
X   X

Suppose we then add SMITH. This is later in the alphabet than DEWAR, and so it is attached to the right-hand branch of the tree:

DEWAR
/    \
X     \
    SMITH
    /   \
    X   X

Next we add NORTH. First the name is compared with DEWAR, and since it is later in the alphabet we follow the right-hand branch. Then it is compared with SMITH. Since it is before SMITH in the alphabet we follow the left-hand branch. We have reached the end of the tree, and so NORTH is added there:

  DEWAR
 /     \
X       \
       SMITH
       /   \
      /    X
   NORTH 
   /   \
  X     X

Finally, we add BOND and WEST, and the tree looks like this:

  DEWAR
 /     \
X       \
       SMITH
       /   \
   NORTH   WEST
   /   \   /   \
  X     X X     X

When searching the tree for a name we follow the same procedure, until either the name is found, or the end of a branch is reached in which case the message ’NOT FOUND' is given.

The advantage gained by using a tree structure depends somewhat on the shape of the tree; with a perfectly balanced tree of 31 records, a maximum of five comparisons need to be made to find any record, as opposed to 31 with a simple list of records. As the number of records increases, the saving becomes even greater.

A tree can be printed in alphabetical order by using the following simple recursive procedure:

To print the tree starting at a certain name
a. print the tree pointed to by the left-hand pointer,
b. print the name, and print the tree pointed to by the right-hand pointer.

In the case of the tree above we obtain:

BOND, DEWAR, NORTH, SMITH, WEST

BBC Computer Version

5 REM ... Catalogue ...
10 DIM M%64,T%3:ZS=&3C00:F%=&1500:HIMEM=F%
20 !T%=0
30 REPEAT PRINT CHR$(12)"command"

Input command line. Commands: "
* - Print tree
> - Add name to tree.

40 INPUT $M%
50 IF?M%=ASC("*")PROCPRINT(!T%):GOTO 130 60 IF?M%=ASC(">")GOTO 100

Name on its own - search for it. If a close match is found print it (PROCPRINT), else say 'NOT FOUND'.

70 N%=M%:X%=T%:S%=1:E%=0:PROCSEARCH 
80 IF E% PROCPUT:GOTO 130
90 PRINT "NOT FOUND":GOTO 130

Add name to tree. If no match (not E) then all is well; otherwise the name already exists.

100 N%=M%+1:X%=T%:K%=0:S%=0:PROCSEARCH
110 IF NOT E% UNTIL FALSE
120 PRINT $N%;" ALREADY EXISTS"
130 K%=GET:UNTIL FALSE

PROCSEARCH - Search tree. Operation depends on value of S:
S%=0 - Add name $N% to tree and input telephone number when the end of a branch is reached.
S%=1 - Search tree for name, and return on a match of first few characters.

1000 REM SEARCH TREE 
1005 DEF PROCSEARCH 
1010 IF !X%=0 GOTO 1100

Follow down pointer at X%, and compare name $J on tree with $N. If they match, return. If $N>$J, move X% to right-hand pointer; then keep searching down tree.

1020 X%=!X%:J%=X%+8:PROCCOMP
1030 IF E% ENDPROC
1040 IF C% X%=X%+4
1050 GOTO 1010

End of branch. If S=l give up since the name is not found; otherwise add the name $N to the tree here input the telephone number.

1100 IF S% ENDPROC
1110 Y%=X%:!X%=F%:X%=!X%:!X%=0:X%!4=0
1120 X%=X%+8:$X%=$N%:XS=X%+LEN($X%)+1
1130 INPUT $X%:XS=X%+LEN($X%1+1
1140 IF(X% AND &FFFF)>Z% PRINT "NO ROOM":! Y%=0: K%=GET: ENDPROC
1150 F%=X%:ENDPROC

PROCCOMP - Compare strings J% and N% character by character, and set C% to 1 if $N%>$JS. If searching the tree, settle for $N% matching the first few characters of $JS; if adding a name to the tree, insist on an exact match.

2000 REM COMPARE $ J% AND SNAB
2010 DEF PROCCOMP I%=-1:REPEAT I%=I%+1
2020 UNTIL J%?I%<>N%?I% OR N%?I%=13
2030 C%=(J%?I%<N%?I%)
2040 IF S% El=(MS?I%=13):ENDPROC
2050 E%= ( $J%=$N% ): ENDPROC

PROCPRINT - Print tree from Q% downwards. Print tree on left-hand branch, print record at Q%, then print tree on right-hand branch.

4005 DEF PROCPRINT(Q%)
4010 IF Q%=0 ENDPROC
4020 PROCPRINT(!Q%)
4040 J%=Q%+8:PROCPUT:Q%=Q%+4
4050 PROCPRINT(!Q%):ENDPROC

PROCPUT - Print record at J%. Tabulate telephone number to column 20.

5000 REM PRINT RECORD
5010 DEF PROCPUT PRINT $J%;
5020 J%=J%+LEN($J%)+1:PRINT TAB(20);$J%:ENDPROC

Variables

E% - Equal flag; E%=1 if a match is found
F% - Next free memory location
J% - Name string on tree
M% - Command line string
N% - Name string
Q% - Local parameter for PROCPRINT
S% - Search flag: S%=1 means search tree; S%=0 means
T% - Pointer to top of tree
X% - Pointer to current position on tree
Y% - Pointer before adding name to tree
Z% - Top of available memory

Atom Version

This version of the catalogue program is substantially identical to the version for the BBC Computer. The only major difference is that the routine to print the tree has to save the value of X before re-entering itself recursively, since on the Atom procedures do not have local parameters.

5 REM ... CATALOGUE ...
10 DIM M(64),T(3),F(-1)
20 !T=0;Z=#3BFF
30 DO PRINT $12"command"

Input command line. Commands: "
* - Print tree
> - Add name to tree.

40 INPUT $M
50 IF?M=CH"*"X=!T;GOSUB p;GOTO n 
60 IF?M=CH">"GOTO a

Name on its own - search for it. If a close match is found print it (GOSUB q), else say 'NOT FOUND'.

70 N=M;X-Z;S=l;E=0;GOSUB s 
80 IF E GOSUB q;GOTO n
90 PRINT "NOT FOUND"';GOTO n

a - Add name to tree. If no match (not E) then all is well; otherwise the name already exists.

100aN=M+1;X=T;E=0;S=0;GOSUB s 
110 IF E:1 UNTIL 0
120 PRINT $N" ALREADY EXISTS" 
130nLINK#FFE3;UNTIL 0

s - Search tree. Operation depends on value of S:
S=0 - Add name $N to tree and input telephone number when the end of a branch is reached.
S=1 - Search tree for name, and return on a match of first few characters.

1000sREM SEARCH TREE
1010 IF !X=0 GOTO b

Follow down pointer at X, and compare name J on tree with $N. If they match, return. If $N>$J, move X to right-hand pointer; then keep searching down tree.

3020 X=!X;J=X+8;GOSUB c
1030 IF E RETURN
1040 IF C X=X+4
1050 GOTO s

b - End of branch.
If S=1 give up since the name is not found; otherwise add the name $N to the tree here, input the telephone number.

1100bIF S RETURN
1110 Y=X; !X=F; X=!X; !X=0; X!4 =0
1120 X=X+8; $X=$N;X=X+LENX+1
1130 INPUT$X; X=X+LENX+1
1140 IF X&#FFFF>Z PRINT "NO ROOM";
Y=0; LINK #FFE3; RETURN
1150 F=X; RETURN

c - Compare strings J and N character by character, and set C to 1 if $N>$J. If searching the tree, settle for $N matching the first few characters of $J; if adding a name to the tree, insist on an exact match.

2000cREM COMPARE $ J AND $N
2010 I=-1;DO I=I+1
2020 UNTIL J?I<>N?I OR N?I=13 2030 C=(J?I<N?I)
2040 IF S E=(N?I=13);RETURN
2050 E=($J=$N)gRETURN

p - Print tree from X downwards. First save X on stack. Print tree on left-hand branch, recover X and print record at X, then print tree on right-hand branch.

4000pREM PRINT TREE 4010 IF X=0 RETURN
4020 !F=X; F=F+2; X=!X; GOSUB p
4030 F=F-2;X=!F
4040 J=X+8;GOSUB q; X=X+4
4050 X=!X; GOTO p

q - Print record at J. Tabulate telephone number to column 20.

5000qREM PRINT RECORD5010 PRINT $J;DOPRINT " ";UNTIL COUNT=20 
5020 J=J+LENJ+1;PRINT $J'gRETURN

Variables:

E - Equal flag; E=l if a match is found
F - Next free memory location
J - Name string on tree
M - Command line string
N - Name string
S - Search flag: S=l means search tree; S=0 means add to tree
T - Pointer to top of tree
X - Pointer to current position on tree
Y - Pointer before adding name to tree
Z - Top of available memory

Further Suggestions

The program is not restricted to names and telephone numbers; in fact either field may be any sequence of up to 64 characters, and the whole of the first field is used for the alphabetical ordering.

A useful extension to the program would allow the database to be saved and loaded to and from tape. The values of !T% and F% (or !T and F in the Atom version) should be saved with the tree, since these give the address of the top of the tree and the next free location in memory respectively.