Monday, January 11, 2010

Don't know much 'bout cryptography

The show: Medium
The episode: "The Medium is the Message," first aired on CBS on October 16, 2009, and reran last week in a different time slot (there was a new episode on Friday).
What happened: Allison starts seeing strange symbols everywhere. Her visions and dreams lead her to the infamous Libra Slayer, a serial killer who long ago murdered young women and left coded messages disclosing his next victim, and who has just now resurfaced. The strange symbols are so pervasive for Allison that she even writes a note for one of her daughter's teachers using the symbols instead of the English alphabet. Davalos is predictably skeptical when Allison tells him the Libra has resurfaced, but must take her seriously when a new victim shows up with a coded message.
Scanlon in this episode is less skeptical of Allison, and when he finds a bookshelf filled with books on cryptography in the house of Neal (Fisher Stevens), the man Allison says is the Libra, he sounds convinced that they're on the right track. However, the clock is ticking as it takes the police hours to decode the Libra's messages and they know the Libra will strike again soon because they have a new message that needs decoding. Just in the nick of time, Allison digs up the weird note she wrote for her daughter and creates the key with which to decode the message and figure out the Libra's next victim.
What has me wondering: Both times I've watched this episode, the scene of Scanlon and the bookshelf bothered me, and I wasn't sure why. Now that I've thought about it more, it was the scene of Allison creating the key to decode the message that should've bothered me. At first I was like, well, I have a bookshelf of Bibles and books about the Bible, but it doesn't make me a theologian. A bookshelf of books about cryptography doesn't make Neal a cryptologist. However, for his purposes as a serial killer, Neal doesn't want to create unbreakable codes. He wants to create codes that can be solved but only after he has committed the crime. I don't know much about cryptography, but I do know that there are encryption schemes that would be suitable for Neal's purpose. And I also know that a simple substitution cypher is not one of them!
See, in a substitution cypher, all you're doing is replacing every instance of a given symbol with the same substitute symbol. Take for example the message "I will kill Bill." Replace each letter with the next letter in the alphabet, make them all uppercase, remove spaces and punctuation, and you get "JXJMMLJMMCJMM." Notice that the letter M occurs a lot in the coded message, because L occurred a lot in the plaintext. A short message like that might not be as susceptible to frequency analysis, especially if the author contrives it to have a higher than normal proportion of certain letters, but with a shift like this one there are only 25 possible shifts, so you can try every one until you hit the one that makes sense. That might take ten minutes by hand. A longer message in English will have a greater frequency of E than of L. (Try it with my previous blog post.)
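Here's a rough sketch of that example in Python, in case you want to play with it. The function names (shift_encode, all_shifts) are just ones I made up for illustration, and it only handles the plain A to Z alphabet.

```python
import string

def shift_encode(plaintext, shift=1):
    """Keep only the letters, uppercase them, and shift each one along the alphabet."""
    letters = [c for c in plaintext.upper() if c in string.ascii_uppercase]
    return "".join(
        chr((ord(c) - ord("A") + shift) % 26 + ord("A")) for c in letters
    )

def all_shifts(ciphertext):
    """Return every candidate plaintext, keyed by the shift that was undone (1 to 25)."""
    return {s: shift_encode(ciphertext, -s) for s in range(1, 26)}

coded = shift_encode("I will kill Bill.")
print(coded)  # JXJMMLJMMCJMM

# Print all 25 candidates; a human (or a word list) picks the one that reads as English.
for shift, guess in all_shifts(coded).items():
    print(shift, guess)
```

The candidate for shift 1 comes out as IWILLKILLBILL, which is the only one of the 25 that looks like English.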
That means that longer messages, like the ones the Libra Slayer writes, should also be susceptible to frequency analysis if they are coded with simple substitution cyphers. The fact that Allison was able to create a key for the message shows that the Libra's message, at least his latest one, was in fact a substitution cypher. It's true that he used strange symbols instead of English letters. That can slow down computer entry. But here's what I would do: I would just sequentially assign letters to the symbols as I go over the message. Let's say the first symbol of the message looks like a half moon with a horizontal slash. All the half moons with slashes in the message are now As. Say the second symbol is a square divided into nine smaller squares. Guess what? All the squares with nine squares inside them just became Bs. And so on and so forth. That might take ten minutes, doing it slowly so as to be sure to do it correctly.
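The relabeling step could look something like this. It's only a sketch: I'm assuming someone has already typed in the message as a list of symbol descriptions (the names below are invented), and that there are no more than 26 distinct symbols.

```python
import string

def relabel(symbols):
    """Assign A, B, C, ... to symbols in order of first appearance."""
    labels = {}
    out = []
    for sym in symbols:
        if sym not in labels:
            if len(labels) >= 26:
                raise ValueError("more than 26 distinct symbols")
            labels[sym] = string.ascii_uppercase[len(labels)]
        out.append(labels[sym])
    return "".join(out), labels

# Hypothetical symbol names standing in for the Libra's drawings.
message = ["half-moon-slash", "nine-square", "half-moon-slash", "triangle", "nine-square"]
text, table = relabel(message)
print(text)   # ABACB
print(table)
```

The letters that come out are placeholders, not the real plaintext letters. The point is only to get the message into a form a computer can count.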
With that done, I can transcribe the message into English letters. It still looks like gibberish, but now I can feed that gibberish into a computer and have it do a frequency analysis. If that fails (if the author was clever enough to avoid the letter E in the plaintext), trying all 25 shifts like in my example won't help, because made-up symbols have no alphabetical order to shift. But I can still have the computer match the patterns of repeated letters against a word list and try the keys that fit until one of them makes sense. Why does that take the police 10 hours to do? No wonder the police have a reputation for incompetence.
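For what it's worth, the frequency count itself is only a few lines. This is a minimal sketch using Python's standard library; letter_frequencies is my own made-up name, and I'm running it on the short coded message from earlier just to show the idea.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, share of all letters) pairs, most common first."""
    counts = Counter(c for c in text.upper() if c.isalpha())
    total = sum(counts.values())
    return [(letter, n / total) for letter, n in counts.most_common()]

for letter, share in letter_frequencies("JXJMMLJMMCJMM"):
    print(letter, round(share, 2))
```

M comes out on top with 6 of the 13 letters. In a longer message the most common symbol would be a good first guess for E.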
