Letter frequency analysis for the English language

written on Saturday, January 5, 2013

When I was a kid I loved working through books full of cipher puzzles. I'm not entirely sure who used to publish them. Perhaps Usborne?

The first (and usually only) attack required was frequency analysis, which is a pretty simple matter of counting the number of times each symbol appears and comparing that with the standard distribution I had learnt somewhere.
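For the curious, here is roughly what that counting looks like in Python. It is only a minimal sketch: the function names `symbol_frequencies` and `guess_mapping` are made up for illustration, it assumes the ciphertext is written in letters (everything else is ignored), and the reference ordering is whatever you pass in rather than any particular published table.

```python
from collections import Counter


def symbol_frequencies(ciphertext):
    """Return (symbol, share) pairs, most common first.

    Assumption: only alphabetic characters count as cipher symbols;
    spaces, digits and punctuation are skipped.
    """
    symbols = [ch for ch in ciphertext.upper() if ch.isalpha()]
    total = len(symbols)
    counts = Counter(symbols)
    return [(sym, n / total) for sym, n in counts.most_common()]


def guess_mapping(ciphertext, reference_order):
    """Pair the most common cipher symbols with the most common plaintext letters.

    reference_order is a string of plaintext letters, most frequent first.
    This is only a first guess; real puzzles need some trial and error after it.
    """
    ranked = [sym for sym, _ in symbol_frequencies(ciphertext)]
    return dict(zip(ranked, reference_order))


if __name__ == "__main__":
    # "ETAOINSHRDLU" is the commonly quoted approximate ordering for English,
    # used here purely as an example input.
    print(guess_mapping("XXXX YYY ZZ W", "ETAOINSHRDLU"))
```

On a short message the ranking is noisy, so the mapping is best treated as a starting point to refine by hand, which is pretty much how the puzzle books worked.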

Apparently that standard distribution was worked out by a guy called Mark Mayzner in the 60s, in an extremely manual fashion.

Peter Norvig (Director of Research at Google) has just updated that analysis, at Mark's request, using a text that contained 3.5 trillion characters to produce a new English letter frequency table.

This entry was tagged codes and learning