ANAGRAMS

Words are sequences of letters in exactly the same way that numbers are sequences of digits, but one striking difference between words and numbers is that whereas every sequence of digits is a valid number, not every sequence of letters is a valid word. Thus, although it is simple to write a program to generate random numbers, it is impossible to write a program to generate random words; without, of course, providing the computer with a dictionary in the first place.

The following program illustrates this by taking a word and finding every possible permutation of the letters in that word. Among these permutations will be all the anagrams of that word. For example, for the letters OPST the program will give:

WORD?OPST
THERE ARE 24 PERMUTATIONS:

OPST
OPTS
OSPT
OSTP
OTPS
OTSP
POST
POTS
PSOT
PSTO
PTOS
PTSO
SOPT
SOTP
SPOT
SPTO
STOP
STPO
TOPS
TOSP
TPOS
TPSO
TSOP
TSPO

If the original string is in alphabetical order, the program will produce the permutations in alphabetical order.


Program Operation

The permutation algorithm used by this program is one of the most efficient in producing a sequence of permutations in alphabetical order. It works as follows:

The first permutation is obviously the characters in alphabetical ordar; so, taking as an example the characters "ABCDEF", this is the first permutation. The last permutation is also obvious: it is sequence of characters in reverse order, ”FEDCBA".

To obtain the next permutation from any given sequence of letters, such as BCFEDA, we scan right-to-left looking a character that is smaller that the previous character. This will be called character 'I'. This character is then exchanged with the next higher character to its right, which will be called character 'J':

  I     J
  |     |
B C F E D A
   \   /
    \ /
     X
    / \
   /   \
B D F E C A

We then reverse the order of all the characters to the right of character I:

    I     J
    |     |
B D F E C A
     \   /
      \ /
       X
      / \
     /   \
B D A C E F

The result, "BDACEF", is the next permutation in alphabetical order.


BBC Computer Version

10 REM ... Anagrams 
40 INPUT"STRING"A$

Set up array of letter positions such that CC%(1)=1, CC%(2)=2 ... CC%(N)=N . Calculate number of ermutations, factorial N.

50 N=LEN(A$):DIM CC%(N)
60 F=1
100 FOR J=1TON:CC%(J)=J:F=F*J:NEXT
102 PRINT "THERE ARE" F "PERMUTATIONS" '

Display first permutation. Then permute word, and display.

105 PROCDISPLAY
110 FOR H=2TOF:PROCPERMUTE:PROCDISPLAY:NEXT 
120 END

PROCPERMUTE: permute word. Given any permutation in CC%(1) to CC%(N), a call to PROCPERMUTE will find the next permutation.

200 DEF PROCPERMUTE I=N-1: J=N
205 IF CC%(I)>=C%(I+1) I=I-1: GOTO 205
210 IF CC%(J)<=C%(I) J=J-1: GOTO 210
220 PROCSWAP
230 I=I+1:J=N:IF I=J ENDPROC
240 DO PROCSWAP: I=I+l: J=J-l:UNTIL I>=J
250 ENDPROC

PROCSWAP: Swap elements I and J.

300 DEF PROCSWAP T=CC%(I): CCS(I)=CCS(J)
310 CC%(J)=T: S=l-S: ENDPROC

PROCDISPLAY: Print permutation.

400 DEF PROCDISPLAY PRINT':FOR K=1TON
410 PRINT MID$(A$,CC%(K),CC%(K+1)); 
420 NEXT:ENDPROC

Variables:

A$  - Word
CC%(N) - Array being permuted; initially CC(N)=N
F   - Factorial N, number of permutations
H   - Current permutation number
I,J - Items to be interchanged to get new permutation
N   - Number of letters in word
S   - Sign of current permutation

Variables:

 


Atom Version

The Atom version is virtually identical to the BBC version given above, except that the '?' operator is used instead of MID$ to extract characters from the word in $A (or A$).

30 DIMA(64)
40 INPUT"WORD"$A

Set up array of letter positions such that CC%(1)=l, CC%(2)=2 ... CC%(N)=N. Calculate number of permutations, factorial N.

50 N=LENA;DIM CC(N)
60 A=A-1;F=1;S=1
100 FOR J=1TON;CCJ=J;F=F*J;NEXT
102 PRINT "THERE ARE" F " PERMUTATIONS"

Display first permutation. Then permute word, and display.

105 GOSUBd
110 FOR H=2TOF;GOSUBp;GOSUBd;NEXT
120 END

p - permute word. Given any permutation an CC%(1) to CC$(N), a call to subroutine p will find the next permutation.

200pI=N-1; J=N
205 IF CC(I)>=CC(I+1) I=I-1; GOTO 205 
210 IF CC(J)<=CC(I) J=J-1; GOTO 210
220 GOSUBs
230 I=I+1;J=N;IF I=J RETURN
240 DO GOSUBs; I=I+1; J=J-1;UNTIL I>=J
250 RETURN

s - Swap elements I and J.

300sT=CCI;CCI=CCJ;CCJ=T;S=l-S;RETURN

d - Print permutation.

400dPRINT ';FORK=1TON;PRINT $A?CCK;NEXT;RETURN

Variables:

$A - Word
CC(N) - Array being permuted; initially CC(N)=N
F - Factorial N, number of permutations
H - Current permutation number
I,J - Items to be interchanged to get new permutation 
N - Number of letters in word
S - Sign of current permutation

Further Suggestions

The present program will give LEE twice in a list of the anagrams of the word EEL. An improvement would be to include a test for repeated latters in the original word.

Although this program will find all the permutations of a sequence of letters very rapidly, it is a very inefficient way of discovering that, for example, ORCHESTRA is an anagram of CARTHORSE because there are no less than 362880 permutations to test, and a much simpler method could be devised to verify that two words share the same letters.