In this homework, we cover shift, substitution, and Vigenere ciphers. All problems are taken from the textbook (Stinson). Many thanks to Jim Wei and Eric Chung for sharing their solution files with us, which we have modified to form this Web page.
Problem 1.1. Below are given four examples of ciphertext, obtained from Substitution, Vigenere, Affine, and unspecified ciphers. Provide the plaintext and explain how you obtained the solution.
1.1 a) Substitution Cipher. The technique here is to compute the sorted histogram of both ciphertext and a similar plaintext corpus. You have the advantage in the latter case of Table 1.1 on page 26 of Stinson. By matching the first two quintiles of characters (to preserve a high signal-to-noise ratio), you can obtain some guesses about letters. Here is the ciphertext and plaintext juxtaposed, followed by the method Jim used to solve the problem:
Ptxt: imaynotbeabletogrowflowersbutmygardenproduces
Ctxt: EMGLOSUDCGDNCUSWYSFHNSFCYKDPUMLWGYICOXYSIPJCK

Ptxt: justasmanydeadleavesoldovershoespiecesofropea
Ctxt: QPKUGKMGOLICGINCGACKSNISACYKZSCKXECJCKSHYSXCG

Ptxt: ndbushelsofdeadgrassasanybodysandtodayibought
Ctxt: OIDPKZCNKSHICGIWYGKKGKGOLDSILKGOIUSIGLEDSPWZU

Ptxt: awheelbarrowtohelpinclearingitupihavealwayslo
Ctxt: GFZCCNDGYYSFUSZCNXEOJNCGYEOWEUPXEZGACGNFGLKNS

Ptxt: vedandrespectedthewheelbarrowitistheonewheele
Ctxt: ACIGOIYCKXCJUCIUZCFZCCNDGYYSFEUEKUZCSOCFZCCNC

Ptxt: dvehicleofwhichiamperfectmaster
Ctxt: IACZEJNCSHFZEJZEGMXCYHCJUMGKUCY
Deciphered Plaintext:
Ctxt:  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Freq:  5  - 37  8 12  9 24  5 15  7 18  7  5 13 10  6  1  - 20  - 14  -  5  7 15 13
Rank: 21  -  1 13 10 12  2 19  6 14  4 15 20  9 11 17 22  -  3  -  7  - 18 16  5  8
Ptxt:  v  -  e  b  i  w  a  f  d  c  s  y  m  l  n  u  j  -  o  -  t  -  g  p  r  h

(A dash marks a letter that does not occur in the ciphertext.)

Analysis:

C -> e: because C is most frequent.
Q -> j: because both only occur once.
Z -> h: There are 7 ZC's, but only 1 CZ, and HE is the 2nd most frequent digram. Also there are 4 ZCN's, and HER is the 4th most popular trigram.
N -> l: A guess that worked.
U -> t: There are 2 UZC's, and THE is the most frequent trigram. Also 1 CU and 2 UC's (with TE and ET corresponding) are on the digram list, but their frequencies are low.
S -> o: As in l-ved, o and i both fit; o was tried first and it worked.
O -> n: GO occurred 5 times, and AN is a frequent English digram.
K -> s: K is the 4th most popular letter and cannot be a vowel, otherwise we would frequently have three consecutive vowels (e.g., CKS).
I -> d: ICGI, which decrypts to -ea-, becomes "dead". Similarly, lea-e- is probably "leaves".
A -> v: From the I -> d substitution, NCG-C is lea-e => "leave".
W -> g: WYGKK => -rass, which is "grass".
L -> y: An easy guess: alwa-s => "always".
X -> p: Since res-e-ted should be "respected".
J -> c: As in the preceding substitution, respe-ted should be "respected".
E -> i: An easy guess: veh-cle is "vehicle".
P -> u: Another easy one: prod-ces is "produces".
D -> b: "-ought" => "bought" and "wheel-arrow" => "wheelbarrow".
M -> m: "i-aynot" => "I may not".

The remainder of the substitutions were guesses worked out as before.
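The histogram and the final key can be checked mechanically. The following is a minimal Python sketch, not the program Jim used; the names CTXT, KEY, histogram, and decrypt are illustrative assumptions, and KEY is simply the ciphertext-to-plaintext table transcribed from the analysis above.

```python
from collections import Counter

# Ciphertext of Problem 1.1 a), copied from the listing above.
CTXT = (
    "EMGLOSUDCGDNCUSWYSFHNSFCYKDPUMLWGYICOXYSIPJCK"
    "QPKUGKMGOLICGINCGACKSNISACYKZSCKXECJCKSHYSXCG"
    "OIDPKZCNKSHICGIWYGKKGKGOLDSILKGOIUSIGLEDSPWZU"
    "GFZCCNDGYYSFUSZCNXEOJNCGYEOWEUPXEZGACGNFGLKNS"
    "ACIGOIYCKXCJUCIUZCFZCCNDGYYSFEUEKUZCSOCFZCCNC"
    "IACZEJNCSHFZEJZEGMXCYHCJUMGKUCY"
)

# Ciphertext letter -> plaintext letter, as recovered in the analysis.
KEY = dict(zip("ACDEFGHIJKLMNOPQSUWXYZ", "vebiwafdcsymlnujotgprh"))


def histogram(text):
    """Return (letter, count) pairs sorted from most to least frequent."""
    return Counter(text).most_common()


def decrypt(text, key):
    """Apply the substitution key; unknown letters are shown as '?'."""
    return "".join(key.get(ch, "?") for ch in text)


if __name__ == "__main__":
    for letter, count in histogram(CTXT):
        print(letter, count)
    print(decrypt(CTXT, KEY))
```

Keeping the key as a partial dictionary makes it easy to rerun the decryption after each new guess and look for recognizable word fragments.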
1.1b) Vigenere Cipher
Ciphertext:
KCCPKBGUFDPHQTYAVINRRTMVGRKDNBVFDETDGILTXRGUD
DKOTFMBPVGEGLTGCKQRACQCWDNAWCRXIZAKFTLEWRPTYC
QKYVXCHKFTPONCQQRHJVAJUWETMCMSPKQDYHJVDAHCTRL
SVSKCGCZQQDZXGSFRLSWCWSJTBHAFSIASPRJAHKJRJUMV
GKMITZHFPDISPZLVLGWTFPLKKEBDPGCEBSHCTJRWXBAFS
PEZQNRWXCVYCGAONWDDKACKAWBBIKFTIOVKCGGHJVLNHI
FFSQESVYCLACNVRWBBIREPBBVFEXOSCDYGZWPFDTKFQIY
CWHJVLNHIQIBTKHJVNPIST

Method:
Step 2. Assuming the longest key, we compute the Index of Coincidence (Ic):

Ic              Keyword Length
Column    2      3      4      5      6      7      8
  1     0.044  0.064  0.049  0.057  0.079  0.050  0.056
  2     0.0524 0.056  0.054  0.057  0.097  0.062  0.062
  3       -    0.057  0.049  0.048  0.066  0.063  0.057
  4       -      -    0.060  0.049  0.082  0.061  0.063
  5       -      -      -    0.057  0.060  0.064  0.062
  6       -      -      -      -    0.090  0.064  0.068
  7       -      -      -      -      -    0.061  0.063
  8       -      -      -      -      -      -    0.077

This also provided strong evidence that the keyword length is 6, since the higher values occurred in the column for keyword length 6.
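A table like this one can be produced with a few lines of code. The sketch below is an illustration in Python, assuming the usual definition Ic = sum of f(f-1) over all letters, divided by n(n-1); the function names are hypothetical and this is not the program used for the solution.

```python
from collections import Counter


def index_of_coincidence(text):
    """Probability that two letters drawn without replacement are equal."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


def columns(text, m):
    """Split text into m columns: column i holds letters i, i+m, i+2m, ..."""
    return [text[i::m] for i in range(m)]


def column_ics(text, m):
    """Ic of each column under the assumption that the keyword length is m."""
    return [index_of_coincidence(col) for col in columns(text, m)]


if __name__ == "__main__":
    sample = "KCCPKBGUFDPHQTYAVINRRTMVGRKDNBVFDETDGILTXRGUD"
    for m in range(2, 9):
        print(m, [round(v, 3) for v in column_ics(sample, m)])
```

For English text the Ic is close to 0.065, while for uniformly random letters it is close to 0.038, which is why the correct keyword length tends to stand out.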
Step 3. We next compute the 390 (15 x 26) Mutual Index of Coincidence (MIc) values, with the resulting relative shifts listed as:

K1 - K2 = 11   K1 - K3 = 4    K1 - K4 = 13   K1 - K5 = 9    K1 - K6 = 14
K2 - K3 = 19   K2 - K4 = 2    K2 - K5 = 24   K2 - K6 = 3
K3 - K4 = 9    K3 - K5 = 5    K3 - K6 = 10
K4 - K5 = 22   K4 - K6 = 1
K5 - K6 = 5

All 15 relative shifts agree, so the keyword is the result of some shift applied to APWNRM.
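The relative shifts come from maximizing the mutual index of coincidence over the 26 possible shifts for each pair of columns. A minimal Python sketch of that computation follows; the function names are assumptions for illustration, and the convention is chosen so that the best g for columns i and j estimates Ki - Kj.

```python
from collections import Counter


def mutual_ic(x, y, g):
    """MIc of x and y after shifting the letters of y forward by g."""
    fx, fy = Counter(x), Counter(y)
    total = 0
    for i in range(26):
        a = chr(ord("A") + i)
        b = chr(ord("A") + (i - g) % 26)  # letter of y that lands on a
        total += fx[a] * fy[b]
    return total / (len(x) * len(y))


def best_relative_shift(x, y):
    """Shift g in 0..25 with the largest MIc; an estimate of Kx - Ky."""
    return max(range(26), key=lambda g: mutual_ic(x, y, g))


if __name__ == "__main__":
    # Hypothetical usage: x and y would be two columns of the ciphertext.
    x, y = "KGFQVRVNGGDGU", "CUDTIGRBDIRD"
    print(best_relative_shift(x, y))
```

With 6 columns there are 15 pairs and 26 shifts each, which gives the 390 values mentioned in Step 3.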
Step 4. Of the 26 possible shifts of APWNRM, only one result made sense, namely CRYPTO (or A |-> C), whose inverse produced the following plaintext:
ilearnedhowtocalculatetheamountofpaperneededf
oraroomwheniwasatschoolyoumultiplythesquarefo
otageofthewallsbythecubiccontentsoftheflooran
dceilingcombinedanddoubleityouthenallowhalfth
etotalforopeningssuchaswindowsanddoorsthenyou
allowtheotherhalfformatchingthepatternthenyou
doublethewholethingagaintogiveamarginoferrora
ndthenyouorderthepaper
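Steps 3 and 4 can be reproduced with a short script. This is a sketch in Python under the convention used above (the listed value Ki - Kj is taken mod 26 and K1 is provisionally set to A); the helper names are illustrative and do not come from the original solution files.

```python
A = ord("A")


def keyword_from_shifts(shifts_from_k1):
    """Given K1 - Ki for i = 2..m, return the keyword with K1 fixed to 'A'."""
    return "A" + "".join(chr(A + (-s) % 26) for s in shifts_from_k1)


def shift_word(word, k):
    """Add k (mod 26) to every letter of an uppercase word."""
    return "".join(chr(A + (ord(c) - A + k) % 26) for c in word)


def vigenere_decrypt(ctxt, key):
    """Subtract the repeating uppercase key; plaintext is lowercase."""
    return "".join(
        chr(ord("a") + (ord(c) - ord(key[i % len(key)])) % 26)
        for i, c in enumerate(ctxt)
    )


if __name__ == "__main__":
    base = keyword_from_shifts([11, 4, 13, 9, 14])
    for k in range(26):
        candidate = shift_word(base, k)
        print(candidate, vigenere_decrypt("KCCPKBGUFDPHQTY", candidate))
```

Scanning the 26 printed lines for readable text is usually enough to pick out the right keyword.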
1.1c) Affine Cipher

The given ciphertext and plaintext are:

Ctxt: KQEREJEBCPPCJCRKIEACUZBKRVPKRBCIBQCARBJCVFCUP
Ptxt: ocanadaterredenosaieuxtonfrontestceintdefleur

Ctxt: KRIOFKPACUZQEPBKRXPEIIEABDKPBCPFCDCCAFIEABDKP
Ptxt: onsglorieuxcartonbrassaitporterlepeeilsaitpor

Ctxt: BCPFEQPKAZBKRHAIBKAPCCIBURCCDKDCCJCIDFUIXPAFF
Ptxt: terlacroixtonhistoireestuneepopeedesplusbrill

Ctxt: ERBICZDFKABICBBENEFCUPJCVKABPCYDCCDPKBCOCPERK
Ptxt: antsexploitsettavaleurdefoitrempeeprotegerano

Ctxt: IVKSCPICBRKIJPKABI
Ptxt: sfoyersetnosdroits

This is the Canadian national anthem in French, as might be sung from time to time in Quebec.
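Ordr:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Ctxt:  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Freq: 13 21 32  9 13 10  -  1 16  6 20  -  -  1  2 20  4 12  1  -  6  4  -  2  1  4
Rank:  6  2  1 10  7  9  -  -  5 11  3  -  -  -  -  4  -  8  -  - 12  -  -  -  -  -
Ptxt:  i  t  e  p  a  l  w  h  s  d  o  z  k  v  g  r  c  n  y  j  u  f  q  b  m  x

(A dash marks a blank entry: letters absent from the ciphertext have no frequency, and only the twelve most frequent letters are ranked.)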
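C --> e: highest count.
B --> t: worked.

Step 3. Solution:

 4a + b = 2 (mod 26)
19a + b = 1 (mod 26)

Subtracting the first congruence from the second gives 15a = 25 (mod 26), so a = 19 and b = 4.

Step 4. Expression:

Encryption: ek(x) = 19x + 4 (mod 26)
Decryption: dk(y) = 11(y - 4) = 11y - 44 = 11y + 8 (mod 26)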
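The two guesses (e encrypts to C, t encrypts to B) determine the key, which can be confirmed by a small exhaustive search. The following Python sketch is illustrative only; the function names are assumptions, and the three-argument pow with exponent -1 requires Python 3.8 or later.

```python
from math import gcd


def solve_affine(pairs, n=26):
    """All keys (a, b) with gcd(a, n) = 1 and a*x + b = y (mod n) for each (x, y)."""
    return [
        (a, b)
        for a in range(n)
        if gcd(a, n) == 1
        for b in range(n)
        if all((a * x + b) % n == y for x, y in pairs)
    ]


def affine_decrypt(ctxt, a, b, n=26):
    """Apply d(y) = a^-1 * (y - b) mod n to uppercase ciphertext."""
    a_inv = pow(a, -1, n)  # modular inverse of a
    return "".join(
        chr(ord("a") + a_inv * (ord(c) - ord("A") - b) % n) for c in ctxt
    )


if __name__ == "__main__":
    # e (4) -> C (2) and t (19) -> B (1), using the 0..25 ordering of the table.
    keys = solve_affine([(4, 2), (19, 1)])
    print(keys)
    for a, b in keys:
        print(affine_decrypt("KQEREJEBCPPCJCRKIEACUZBKRVPKRBCIBQCARBJCVFCUP", a, b))
```

Because there are only 12 x 26 = 312 affine keys, trying all of them is also a practical fallback when the frequency guesses are wrong.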
1.1d) Unspecified Cipher
Ciphertext:
BNVSNSIHQCEELSSKKYERIFJKXUMBGYKAMQLJTYAVFBKVT
DVBPVVRJYYLAOKYMPQSCGDLFSRLLPROYGESEBUUALRWXM
MASAZLGLEDFJBZAVVPXWICGJXASCBYEHOSNMULKCEAHTQ
OKMFLEBKFXLRRFDTZXCIWBJSICBGAWDVYDHAVFJXZIBKC
GJIWEAHTTOEWTUHKRQVVRGZBXYIREMMASCSPBNLHJMBLR
FFJELHWEYLWISTFVVYFJCMHYUYRUFSFMGESIGRLWALSWM
NUHSIMYYITCCQPZSICEHBCCMZFEGVJYOCDEMMPGHVAAUM
ELCMOEHVLTIPSUYILVGFLMVWDVYDBTHFRAYISYSGKVSUU
HYHGGCKTMBLRX

Method:
Step 2. Assuming a keyword length of 6, we produce the following 15 relative shifts:

K1 - K2 = 12   K1 - K3 = 15   K1 - K4 = 5    K1 - K5 = 2    K1 - K6 = 21
K2 - K3 = 3    K2 - K4 = 19   K2 - K5 = 16   K2 - K6 = 9
K3 - K4 = 16   K3 - K5 = 13   K3 - K6 = 6
K4 - K5 = 23   K4 - K6 = 16
K5 - K6 = 19

All 15 shifts agree, and we find that the keyword is some shifted version of AOLVYF.
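Step 3. A trial keyword that yields a sensible decryption is THEORY, which produces the following trial plaintext:

igrewupamongslowtalkersmeninparticularwhodrop
pedwordsafewatatimelikebeansinahillandwhenigo
ttominneapoliswherepeopletookalakewobegoncomm
atomeantheendofastoryicouldntspeakawholesente
nceincompanyandwasconsiderednottoobrightsoien
rolledinaspeechcoursetaughtbyorvillesandthefo
underofreflexiverelaxologyaselfhypnotictechni
quethatenabledapersontospeakuptothreehundredw
ordsperminute

that, when formatted, appear as:

I grew up among slow talkers, men in particular, who dropped words a few at a time like beans in a hill, and when I got to Minneapolis, where people took a Lake Wobegon comma to mean the end of a story, I couldn't speak a whole sentence in company and was considered not too bright, so I enrolled in a speech course taught by Orville Sand, the founder of reflexive relaxology, a self-hypnotic technique that enabled a person to speak up to three hundred words per minute.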
Problem 1.2
b) Let p be prime. Show that the number of 2 x 2 matrices that are invertible over Zp is given by N = (p^2 - 1)(p^2 - p).
c) For p prime and m > 3 an integer, find a formula for the number of m x m matrices that are invertible over Zp.
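For small p and m the count in parts b) and c) can be checked by brute force. The Python sketch below compares an exhaustive count against the standard product (p^m - 1)(p^m - p)...(p^m - p^(m-1)), which is stated here as a well-known result rather than taken from this page; for m = 2 it reduces to the expression in part b). The function names are assumptions for illustration.

```python
from itertools import product


def det_mod(matrix, p):
    """Determinant by cofactor expansion, reduced mod p (fine for tiny m)."""
    if len(matrix) == 1:
        return matrix[0][0] % p
    total = 0
    for j, entry in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det_mod(minor, p)
    return total % p


def count_invertible(m, p):
    """Count m x m matrices over Zp (p prime) with nonzero determinant."""
    count = 0
    for entries in product(range(p), repeat=m * m):
        matrix = [list(entries[r * m:(r + 1) * m]) for r in range(m)]
        if det_mod(matrix, p) != 0:
            count += 1
    return count


def formula(m, p):
    """Product of (p^m - p^i) for i = 0 .. m-1."""
    result = 1
    for i in range(m):
        result *= p**m - p**i
    return result


if __name__ == "__main__":
    for m, p in [(2, 2), (2, 3), (2, 5), (3, 2)]:
        print(m, p, count_invertible(m, p), formula(m, p))
```

The idea behind the product is that row i of an invertible matrix may be any vector outside the span of the previous i - 1 rows, and that span contains p^(i-1) vectors.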
Problem 1.4. Suppose we are told that the plaintext
conversation

yields the ciphertext

HIARRTNUYTUS

where the Hill Cipher is used but the keysize m is not specified. Determine the encryption matrix.
Ptxt Index: 02 14 13 21 04 17 18 00 19 08 14 13
Ctxt Index: 07 08 00 17 17 19 13 20 24 19 20 18

Let gcd(det A, 26) = 1, where A is the m x m encryption matrix. The following three cases suffice:
Case 1: Let m = 2 :
Case 2: Let m = 3 :
Case 3: Let m = 4 :
Problem 1.7. We describe a special case of a Permutation Cipher. Let m and n be positive integers. Write out the plaintext, by rows, in m x n rectangles. Then form the ciphertext by taking the columns of these rectangles. For example, if m = 4 and n = 3, then we would encrypt the plaintext "cryptography" by forming the following rectangle:

c r y p
t o g r
a p h y

The ciphertext would be CTAROPYGHPRY.

a) Describe how Bob would decrypt a ciphertext, given values for m and n.
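Bob splits the ciphertext into blocks of mn letters and reads the letters of each block in the following order of (1-based) positions, which amounts to writing the block into the rectangle by columns and reading it off by rows:

1 + 0*n, 1 + 1*n, 1 + 2*n, ......, 1 + (m-1)*n,
2 + 0*n, 2 + 1*n, ..............., 2 + (m-1)*n,
. . .
n + 0*n, n + 1*n, ..............., n + (m-1)*n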
b) Decrypt the following ciphertext, which was obtained using the preceding method of encryption:
Ctxt: MYAMRARUYIQTENCTORAHROYWDSOYEOUARRGDERNOGW
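Each block of six ciphertext letters is split into three columns of two letters (m = 3, n = 2) and read by rows:

MY AM RA   maryma
RU YI QT   ryquit
EN CT OR   econtr
AH RO YW   aryhow
DS OY EO   doesyo
UA RR GD   urgard
ER NO GW   engrow

The formatted plaintext follows: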
Mary, Mary, quite contrary, how does your garden grow?
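The decryption rule for this rectangle cipher is short enough to write out. The following Python sketch assumes, as in the "cryptography" example, that each rectangle has n rows and m columns and that the ciphertext length is a multiple of mn; the function name is hypothetical.

```python
def decrypt_rect(ctxt, m, n):
    """Undo the column read-out for rectangles of n rows and m columns."""
    size = m * n
    out = []
    for start in range(0, len(ctxt), size):
        block = ctxt[start:start + size]
        for i in range(n):        # row of the rectangle
            for j in range(m):    # column of the rectangle
                # Column j occupies ciphertext positions j*n .. j*n + n - 1.
                out.append(block[j * n + i])
    return "".join(out).lower()


if __name__ == "__main__":
    print(decrypt_rect("CTAROPYGHPRY", 4, 3))
    ctxt = "MYAMRARUYIQTENCTORAHROYWDSOYEOUARRGDERNOGW"
    # When m and n are unknown, try every pair whose product divides the length.
    for m in range(1, 8):
        for n in range(1, 8):
            if len(ctxt) % (m * n) == 0:
                print(m, n, decrypt_rect(ctxt, m, n))
```

Since the ciphertext has 42 letters, only a handful of (m, n) pairs are possible, and the readable one is easy to spot.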
Problem 1.11. We describe a stream cipher that is a modification of the Vigenere cipher... Each time we use the keyword we replace each letter by its successor modulo 26. For example, we use SUMMER to encrypt the first six letters, then TVNNFS to encrypt the second six letters, and so forth. Describe how you can use the concept of index of coincidence to first determine the length of the keyword, then actually find the keyword. Test your method by cryptanalyzing the following ciphertext:
IYMYSILONRFNCQXQJEDSHBUIBCJUZBOLFQYSCHATPEQGQ JEJNGNXZWHHGWFSUKULJQACZKKJOAAHGKEMTAFGMKVRDO PXNEHEKZNKFSKIFRQVHHOVXINPHMRTJPYWQGJWPUUVKFP OAWPMRKKQZWLQDYAZDRMLPBJKJOBWIWPSEPVVQMBCRYVC RUZAAOUMBCHDAGDIEMSZFZHALIGKEMJJFPCIWKRMLMPIN AYOFIREAOLDTHITDVRMSE
TRIGRAM: {position -> shift}+

ONR: 8 -> 0, 76 -> 19, 151 -> 12, 155 -> 24
JED: 17 -> 0, 154 -> 8, 219 -> 8
UZB: 28 -> 0, 114 -> 14, 118 -> 8
KJO: 71 -> 0, 106 -> 7, 160 -> 26
KEM: 78 -> 0, 201 -> 21, 208 -> 0
LQD: 147 -> 0, 161 -> 24, 224 -> 23

Some of the intervals appear to be too small for the indicated shift, so we discard them. This leaves ten useful intervals, as follows:

Trigram   Interval   Shift
ONR          68        19
             75        19
JED         137         8
             65         0
UZB          86        14
KJO          35         5
             54        19
KEM         123        21
LQD          14        24
             63        25

Unfortunately, this result is not terribly informative, except to indicate that the keyword length may be 3, 5, or 7.
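Because the key advances by one after every use, a repeated plaintext trigram shows up in the ciphertext as the same trigram with a constant added to all three letters (for example, ONR shifted by 19 is HGK). A search for such shifted repeats can be sketched in Python as follows; the function name and output layout are assumptions, and positions are 0-based, whereas the list above counts from 1.

```python
from collections import defaultdict


def shifted_repeats(ctxt, size=3):
    """Group n-grams that are equal up to a uniform shift of their letters.

    Returns {first n-gram: [(position, shift relative to first), ...]}.
    """
    groups = defaultdict(list)
    for pos in range(len(ctxt) - size + 1):
        gram = ctxt[pos:pos + size]
        base = ord(gram[0])
        # Differences from the first letter are unchanged by a uniform shift.
        pattern = tuple((ord(c) - base) % 26 for c in gram)
        groups[pattern].append((pos, gram))
    result = {}
    for hits in groups.values():
        if len(hits) > 1:
            first = hits[0][1]
            result[first] = [
                (pos, (ord(g[0]) - ord(first[0])) % 26) for pos, g in hits
            ]
    return result


if __name__ == "__main__":
    for gram, hits in shifted_repeats("IYMYSILONRFNCQXQJEDSHBUIBCJUZB").items():
        print(gram, hits)
```

Many of the matches found this way are accidental, which is consistent with the inconclusive result reported above.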
Step 2. We write a program shift.c to reverse the effect of shifting the key, in effect making the ciphertext the same as that of a normal Vigenere cipher, assuming a known keyword length. Running the program for different keyword lengths, we calculate the Index of Coincidence (Ic) for each column of the modified ciphertext. For example, we first run shift.c on the original ciphertext for keyword length 5, then calculate the Ic for each of the 5 columns using the result, as follows:
Ic                      Keyword Length
Column    2      3      4      5      6      7      8      9
  1     0.048  0.048  0.054  0.090  0.071  0.074  0.068  0.066
  2     0.043  0.055  0.056  0.093  0.077  0.060  0.074  0.061
  3       -    0.059  0.054  0.095  0.067  0.069  0.053  0.058
  4       -      -    0.051  0.115  0.072  0.060  0.066  0.064
  5       -      -      -    0.100  0.065  0.063  0.055  0.086
  6       -      -      -      -    0.072  0.073  0.066  0.097
  7       -      -      -      -      -    0.061  0.060  0.059
  8       -      -      -      -      -      -    0.082  0.070
  9       -      -      -      -      -      -      -    0.075

This seems to indicate a keyword length of 5, since the highest values are located in the column for keyword length 5. (The Ic tends to increase with keyword length, because each column then holds fewer letters, but at keyword length 5 there is a significant maximum.)
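The key-unshifting step can be sketched as follows. This is a Python illustration of the idea described for shift.c, not that program itself: for an assumed keyword length m, the t-th block of m letters (t = 0, 1, 2, ...) was encrypted with the keyword advanced by t, so subtracting t from each letter of that block leaves an ordinary Vigenere ciphertext. The function names are assumptions.

```python
A = ord("A")


def unshift(ctxt, m):
    """Subtract t (mod 26) from every letter of the t-th block of m letters."""
    return "".join(
        chr(A + (ord(c) - A - i // m) % 26) for i, c in enumerate(ctxt)
    )


def vigenere_decrypt(ctxt, key):
    """Subtract the repeating uppercase key; plaintext is lowercase."""
    return "".join(
        chr(ord("a") + (ord(c) - ord(key[i % len(key)])) % 26)
        for i, c in enumerate(ctxt)
    )


if __name__ == "__main__":
    ctxt = "IYMYSILONRFNCQXQJEDSHBUIBCJUZBOLFQYSCHATPEQGQ"
    normal = unshift(ctxt, 5)
    print(normal)
    # PRIME is the keyword found at the end of this solution.
    print(vigenere_decrypt(normal, "PRIME"))
```

The output of unshift for each candidate m can be fed to the same column Ic computation used for the ordinary Vigenere cipher.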
Step 3. We compute the Mutual Index of Coincidence (MIc) for keyword length 5, and the result is less than clear:
K1 - K2 = 24   MIc = 0.079686
K1 - K3 = 18   MIc = 0.078870
K1 - K4 = 17   MIc = 0.087644
K1 - K5 = 22   MIc = 0.079890
K2 - K3 = 9    MIc = 0.077051
K2 - K4 = 5    MIc = 0.088713
K2 - K5 = 13   MIc = 0.083715
K3 - K4 = 3    MIc = 0.082466
K3 - K5 = 3    MIc = 0.083090
K4 - K5 = 23   MIc = 0.087880

It is obvious that the relative shifts don't agree with each other. Either the keyword length is not 5, or the plaintext is too small or is highly variant spatially. Trying other keyword lengths (e.g., 3, 6, or 7) yields poor results, so we use m = 5 and try combinations of relative shifts to construct keywords.
The first trial is derived from the first 4 relative shifts that have K1, namely, ACIJE, which does not work.
Step 4. We try the relative shift K2 - K3 = 9. But which of the first two relative shifts should we discard? Getting rid of K1 - K2 = 24 yields ARIJE, which didn't work either. However, keeping K1 - K2 = 24 and discarding K1 - K3 = 18 yields ACTJE.
One of the 26 decryptions started with the following characters:
theaz stfox ousqc yptcw ogige inhwd tormz wesvt faapl esgeo whoeh edwot habeo whoeh esotd anreo ......

It looked like the first three letters were right, since some blocks began with: the, who, who, in, an.
Step 5. Since the relative shifts K1 - K2 and K1 - K3 and K2 - K3 are assumed to be correct, the other relative shifts had something to do with the first three letters. First K2 - K4 = 5 yields ACTXE, which didn't work. Then, we tried K2 - K5 = 13, which yielded ACTJP and gave the trial plaintext:
theaostfomousqryptclogigtinhws

The word fomous really looked like famous, and shifting K4 by 14 matched the relative shift K2 - K4 = 5. The new keyword is ACTXP, which has as one of its shifted versions PRIME.
This produced the trial plaintext:
themostfamouscryptologistinhistoryoweshisfamelesstowhat
hedidthantowhathesaidandtothesensationalwayinwhichhesaidit
andthiswasmostperfectlyincharacterforherbertosborneyardley
wasperhapsthemostengagingarticulateandtechnicolored
personalityinthebusiness

which, when formatted, becomes:
The most famous cryptologist in history owes his fame less to what he did than to what he said and to the sensational way in which he said it, and this was most perfectly in character, for Herbert Osborne Yardley was perhaps the most engaging articulate and technicolored personality in the business.
This concludes the solution for Homework #1, Fall 1996. If you have a solution that you'd like us to review (and possibly post on this Web page), please feel free to submit an ASCII or HTML file via E-mail to Dr. Schmalz.