Outline of the talk

The Code Breakers

by Bill Velez and David Lovelock

A Cryptography Presentation for Math Awareness Week

April 22, 1997

Brief introductions of Bill and David. Mention that the subject of cryptography is this year's theme of the Southwest Regional Institute in the Mathematical Sciences.

We begin by stating that we want to develop a system whereby two individuals, say David and Bill, want to communicate with each other. The messages that are sent between these two individuals are sent over public channels, for example, telephone lines. So, eavesdroppers have access to these messages. We would like to be able to encode our messages so that eavesdroppers would not be able to understand our messages. So, these messages must be :written in code. But the other party must be able to decode the message, so the method of encoding and decoding must be agreed upon beforehand. So, we will assume that there is a secure means of communication. Question to the Audience: If there is a secure channel of communication between the two parties, then why is a secret code necessary?

It is a generally held belief that when attempting to break a cipher, the unauthorized party knows the method of encryption.

A first attempt at encoding is a simple substitution cipher. In this cipher, each letter is replaced by another letter, leaving spaces and punctuation unchanged. For example,

BPQA UMAAIOM QA AMKZMB
Question to the Audience: Can you decipher this message?

One way to decipher a long document which has been enciphered with a simple substitution cipher is to perform a frequency count, that is, go through the code counting the number of A's, B's, etc. The most common letter in the English language is E, followed by T, A, I, O, and N. This suggests that we replace the most frequent letter in the enciphered message by E, the next most frequent by T, and so on.

We have an 800 page document that has been enciphered by a simple substitution code. We want to count all A's, B's, etc. How do we do this? Not by hand, if we can avoid it. Fortunately we have the text of the document stored as a file on a computer, so we write a short program to do all the counting work for us. When we count the frequency of each of the letters in this enciphered message in this way, we find

A 130847    B  41879    C  13701    D  38434    E   2069
F  33367    G    303    H 120845    I  22553    J  33225
K  69495    L 178742    M  32105    N  31076    O  89599
P 107177    Q   1422    R  13264    S  56051    T  46575
U 101231    V 114489    W  25102    X   1392    Y  85096
Z  89435    

This suggests that
L should be replaced by E
A should be replaced by T
H should be replaced by A
V should be replaced by O
P should be replaced by I
U should be replaced by N

Part of the 800 page message is

DOLAOLY P ZOHSS ABYU VBA AV IL AOL OLYV VM TF VDU SPML, VY
DOLAOLY AOHA ZAHAPVU DPSS IL OLSK IF HUFIVKF LSZL, AOLZL
WHNLZ TBZA ZOVD.  AV ILNPU TF SPML DPAO AOL ILNPUUPUN VM TF
SPML, P YLJVYK AOHA P DHZ IVYU
If we make the suggested replacements, using a program such as CRYPTAID, we see that
DOLAOLY P ZOHSS ABYU VBA AV IL AOL OLYV VM TF VDU SPML, VY
  et e  i   a   t  n o t to  e t e  e o o     o n  i e, o
DOLAOLY AOHA ZAHAPVU DPSS IL OLSK IF HUFIVKF LSZL, AOLZL
  et e  t at  tation  i    e  e      an  o   e  e, t e e
WHNLZ TBZA ZOVD.  AV ILNPU TF SPML DPAO AOL ILNPUUPUN VM TF
 a e     t   o .  to  e in     i e  it  t e  e innin  o    
SPML, P YLJVYK AOHA P DHZ IVYU
 i e, i  e o   t at i  a   o n
Can you finish decoding the message? The actual text which had been encoded was David Copperfield, by Charles Dickens. The message is part of the first paragraph of Chapter 1.

If the message is short then frequency analysis is not as powerful as the previous example suggests. However, pattern analysis is. To explain pattern analysis, consider the word LOVELOCK. It has a pattern to it. The first and fifth letters are the same as are the second and sixth. All other letters are different. Thus, LOVELOCK has the pattern ABCDABEF. There are not very many words in the English language with this pattern. Two are DECADENT and TOMATOES. Can you think of any more? Thus, we might be able to solve a shorter message by looking at the pattern of the letters. For example, consider the message

IXEVZUYEYZKSY NGBK INGTMKJ XKIKTZRE JAK ZU ZNK OTZXUJAIZOUT
UL NKGBE JAZE IUSVAZOTM GTJ GRYU ZNK AYK UL GHYZXGIZ OJKGY
LXUS TASHKX ZNKUXE. ZNK OTZKXTKZ NGY GRYU TKIKYYOZGZKJ ZNK
OTZXUJAIZOUT UL ZNKYK TKC OJKGY.
Here the word OTZXUJAIZOUT appears twice, and has the pattern ABCDEFGHCAEB. There are very few English words with this pattern. Again using CRYPTAID we find that "introduction" has this pattern, so we fill it in, and find
IXEVZUYEYZKSY NGBK INGTMKJ XKIKTZRE JAK ZU ZNK OTZXUJAIZOUT
cr  to   t         c  n  d r c nt   du  to t   introduction
UL NKGBE JAZE IUSVAZOTM GTJ GRYU ZNK AYK UL GHYZXGIZ OJKGY
o        dut  co  utin   nd    o t   u   o     tr ct id   
LXUS TASHKX ZNKUXE. ZNK OTZKXTKZ NGY GRYU TKIKYYOZGZKJ ZNK
 ro  nu   r t  or . t   int rn t        o n c   it t d t   
OTZXUJAIZOUT UL ZNKYK TKC OJKGY.
introduction o  t     n   id   .
The word IUSVAZOTM has no repeated letters, and there are very few nine-letter English words with no repeated letters. There are even fewer when we realize that the English word must be constructed from "co utin ". CRYPTAID suggests "computing", and so we now have
IXEVZUYEYZKSY NGBK INGTMKJ XKIKTZRE JAK ZU ZNK OTZXUJAIZOUT
cr pto   t m       c  ng d r c nt   du  to t   introduction
UL NKGBE JAZE IUSVAZOTM GTJ GRYU ZNK AYK UL GHYZXGIZ OJKGY
o        dut  computing  nd    o t   u   o     tr ct id
LXUS TASHKX ZNKUXE. ZNK OTZKXTKZ NGY GRYU TKIKYYOZGZKJ ZNK
 rom num  r t  or . t   int rn t        o n c   it t d t
OTZXUJAIZOUT UL ZNKYK TKC OJKGY.
introduction o  t     n   id   .
Can you finish decoding the message?

There are many other ways of encoding, including, deleting spaces and punctuation, and many other tricks. Deleting spaces and punctuation defeats pattern analysis but not frequency count.

If we could change the substitution method very frequently, delete spaces and punctuation, then perhaps we could devise a secure method of encryption.

During the years preceding World War II, this question was addressed . The value of a commodity changes when one goes into wartime. For example, in peacetime, solving puzzles is a pastime, yet in wartime it becomes an industry. Several ideas were developed because of the war, for example, radar and weather forecasting.

A Brief History of the Enigma Machine

We have been studying the simple substitution ciphers. As we have seen, moderately long messages can be deciphered with ease by using the characteristics of the language that is being used. In the early 1900's, individuals began to develop other mechanical means of encrypting messages, many of which used rotors. As we will see, the idea of mechanical rotors was in the air.

The inventor of the first machine to embody the rotor principle was Edward Hugh Hebern (born on April 23, 1869 in Streator, Illinois). In 1917, Hebern reduced his ideas to the first drawings made of a rotor system. In 1931, the U. S. Navy purchased 31 of his machines and they were issued by the more important flag officers and were the top cryptographic system in the Navy. Two of these machines were captured by the Japanese Navy during World War II.

Alexander Koch filed a patent on Tuesday, October 7, 1919 in the Netherlands for a rotor machine. In 1927, he assigned the patent rights to the German inventor of a rotor device Arthur Scherbius. On Friday, of the same week that Koch filed his patent, Arvid Gerhard Damm filed his patent in Stockholm.

We will now show a diagram and pictures of the Enigma machine.

Bring up the software version of the Enigma machine and show a page of the code book. Go through a couple of examples of encoding. We will then explain that the existence of a reflecting rotor gives the additional property that encoding is the same as decoding.

During the 1930's, the Polish got access to an Enigma machine. They figured out how it worked, built special purpose computing machines to break this code. They turned over their information to the British. The British recruited individuals from the university system and set them up at Bletchley Park. The government also provided material support and the Code breakers at Bletchley Park were able to construct special purpose computers, called BOMBES, to break the code. First it was required to determine the wiring of the rotors that were being used. It was also necessary to determine the settings for the day. Finding codebooks, seeding information, keeping track of the coding behaviors of the cipher clerks were all techniques that were used.