If you have a message encrypted with a substitution cipher that you want to crack, you can use frequency analysis. In other words, if the sender has tried to disguise a letter by replacing it with a different letter, you can still recognise the original letter, because the frequency characteristics of the original letter are passed on to the new letter.

To apply frequency analysis, you need to know the frequency of every letter in the English alphabet, or the frequency of letters in whichever language the sender is using.

Below is a list of average frequencies for letters in the English language. So, for example, the letter E accounts for 12.7% of all letters in English, whereas Z accounts for 0.1%. All the frequencies are tabulated and plotted below.
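To see why the frequency profile survives encryption, here is a minimal Python sketch. The key, the sample message and the helper names (`letter_counts`, `make_key`, `encrypt`) are all hypothetical, chosen only for illustration. It shows that a substitution cipher relabels the letters but leaves their counts untouched.

```python
from collections import Counter
import random
import string

ALPHABET = string.ascii_uppercase

def letter_counts(text):
    """Count the letters A-Z in text, ignoring case, spaces and punctuation."""
    return Counter(ch for ch in text.upper() if ch in ALPHABET)

def make_key(seed=0):
    """Hypothetical key: each plaintext letter maps to one letter of a shuffled alphabet."""
    shuffled = list(ALPHABET)
    random.Random(seed).shuffle(shuffled)
    return dict(zip(ALPHABET, shuffled))

def encrypt(text, key):
    """Substitute every letter using key; leave other characters unchanged."""
    return "".join(key.get(ch, ch) for ch in text.upper())

plaintext = "MEET ME AT THE SECRET PLACE AT SEVEN"
key = make_key()
ciphertext = encrypt(plaintext, key)

plain_counts = letter_counts(plaintext)
cipher_counts = letter_counts(ciphertext)

# E is the most common plaintext letter; whatever letter replaces it
# becomes the most common ciphertext letter, with the same count.
print(plain_counts.most_common(3))
print(cipher_counts.most_common(3))
```

A codebreaker works in the opposite direction: the most common ciphertext letter is a good first guess for E, although a message this short is far too small a sample for the guess to be reliable.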

Please note that these frequencies are averages: E will not always constitute 12.7% of the letters in a text, and may not even be the most common letter. The longer the message, the more likely it is to obey the average distribution shown above. However, there are exceptions to this rule. In 1969, the French author Georges Perec managed to write a 200-page book called 'La Disparition' without using any words containing the letter E. Amazingly, the book was later translated into English by Gilbert Adair, again avoiding the use of the letter E.

You can check the frequency distribution of a piece of text by entering it in the box below. Just copy and paste a piece of text, perhaps an essay or some text from the web, then click on the Count Letter Frequency button.

If you analyse just one sentence, its letter frequency distribution probably will not match the bar chart above. Likewise, if you select text that is not in English, the match will not be good. But pick a longish English text, and whether it's Shakespeare or The Sun, the match should be surprisingly good.
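The same check can be sketched offline in Python. This is not the code behind the page's button, only an illustration. The reference table below holds commonly cited approximate English frequencies rounded to one decimal place (an assumption; only the values for E and Z appear in the text above), and the names `letter_frequencies`, `mismatch` and `print_chart` are hypothetical.

```python
from collections import Counter
import string

# Assumed reference values: commonly cited approximate English letter
# frequencies in percent, rounded to one decimal place.
ENGLISH_PERCENT = {
    "A": 8.2, "B": 1.5, "C": 2.8, "D": 4.3, "E": 12.7, "F": 2.2, "G": 2.0,
    "H": 6.1, "I": 7.0, "J": 0.2, "K": 0.8, "L": 4.0, "M": 2.4, "N": 6.7,
    "O": 7.5, "P": 1.9, "Q": 0.1, "R": 6.0, "S": 6.3, "T": 9.1, "U": 2.8,
    "V": 1.0, "W": 2.4, "X": 0.2, "Y": 2.0, "Z": 0.1,
}

def letter_frequencies(text):
    """Return the percentage of each letter A-Z among the letters in text."""
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    total = len(letters)
    counts = Counter(letters)
    if total == 0:
        return {letter: 0.0 for letter in string.ascii_uppercase}
    return {letter: 100.0 * counts[letter] / total
            for letter in string.ascii_uppercase}

def mismatch(observed, expected=ENGLISH_PERCENT):
    """Sum of absolute differences in percentage points; smaller means a closer match."""
    return sum(abs(observed[letter] - expected[letter]) for letter in expected)

def print_chart(freqs):
    """Print a rough text bar chart, one '#' per percentage point."""
    for letter, pct in freqs.items():
        print(f"{letter} {pct:5.1f}% {'#' * round(pct)}")

sample = "Hello, World!"
freqs = letter_frequencies(sample)
print_chart(freqs)
print("mismatch against English:", round(mismatch(freqs), 1))
```

Running `mismatch` on a single sentence and then on a long English text should show the score falling as the sample grows, while text in another language should keep a high score.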