# Report:
# describe the components involved

## Purpose / Goals:
Implement a Shift Cipher Decrypter using Python. The user can decrypt a sufficiently long message (more than 200 words) that was originally shift cipher encrypted. For demonstration purposes, the other half, encryption, was also implemented (it is the inverse of decryption).

## Assumption / Requirement:
Assuming a use case in which the message to be encrypted:
- contains only upper case letters, space characters and punctuation marks
- keeps its space characters unchanged during the encryption
- keeps its punctuation marks unchanged during the encryption
- is longer than 200 words, so that its letter distribution follows the general pattern

## Procedure / Terminology:
Given that the encrypted message is long enough, a shift cipher decrypter can guess the message without knowing k. The most frequent letter in the original message is expected to be 'E', and because every letter of the message is shifted by the same amount, this is reflected in the encrypted message as well.

## data types / variable declaration and initialization (data type used)
### INTEGER
i.e.
```python
ORD_a = ord('a') # 97
```
`ORD_a` is a variable storing `97` in integer format

### STRING AND ARRAYS
```python
...
with open(file_path,'r') as fi:
    # beginning of the process
    # read all lines of the file and join them
    lines = fi.readlines()
    e_temp = ''.join(lines)
...
```
Here:
- `lines` is an array (a Python list) of strings, one per line of the file.
- `e_temp` is a single string holding the whole text, line breaks included.

## data collection, input and validation/ data processing
### User input: console
- press `1` to start an encryption (encrypt file)
  - select a key you want (in numeric format)
- press `2` to start a decryption (decrypt file)
- press `q` to quit

## modularity/ reusability/ portability
(TODO: need to capture from text book)

### Terminology
- decryption algorithm (k guessing)

### Background:
The letter `e` is counted as the most frequently occurring letter in everyday English. Shift cipher encryption works by shifting every letter by k positions, where k is an integer unknown to the decrypter.
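
The shifting described above can be sketched as follows. This is only a minimal illustration: the function name `shift_cipher_encrypt` and the sample key `8` are assumptions made for this example and are not taken from the report's own encryption routine.

```python
def shift_cipher_encrypt(plaintext, key):
    # Hypothetical helper for illustration: shift every ASCII letter
    # forward by `key` positions, wrapping around after z / Z.
    result = []
    for char in plaintext:
        if char.isascii() and char.isalpha():
            base = ord('a') if char.islower() else ord('A')
            result.append(chr((ord(char) - base + key) % 26 + base))
        else:
            # space characters and punctuation marks remain unchanged
            result.append(char)
    return ''.join(result)


print(shift_cipher_encrypt('MEET ME HERE, PLEASE.', 8))
# prints: UMMB UM PMZM, XTMIAM.
```

With this sample key, every `E` becomes `M`, so the most frequent letter of the plaintext stays the most frequent letter of the ciphertext under a different name.
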
Assuming the same k is used for the whole message (or that k varies in a regular, already known pattern), the letter frequencies of the original message are carried over to the encrypted message (e.g. with k = 8, `e` is shifted to `m`). That is why k can be guessed by finding the most frequent letter in the encrypted message and assuming that it stands for the letter `e` in the original message.

### Implementation
```python
# TODO: need to replace this !!!
# TODO: need to review comments !!!

ORD_a = ord('a')  # 97, as declared in the data types section

def find_max_occurrence(char_occurrences):
    # Guess k as the distance from the letter e (case insensitive) to the
    # most frequent letter of the encrypted text, e.g. m.
    # find max occurrence and its index
    max_idx = char_occurrences.index(max(char_occurrences))
    # subtract the index of e (4); the modulo keeps the guess within 0..25
    return (max_idx - 4) % 26

def count_letter_occurrence(txt_in):
    # By statistics, e is the most frequent letter of the original message.
    # As 'Shift Cipher' encrypts by letter shifting, the letter that e is
    # shifted to has a good chance of being the most frequent one in the
    # encrypted text.
    output = [0] * 26  # bucket for 26 letters
    for char in txt_in:
        # count ASCII letters only; any other alphabetic character would
        # fall outside the 26 buckets
        if char.isascii() and char.isalpha():
            output[ord(char.lower()) - ORD_a] += 1
    # output contains the letter-by-letter statistics of the text
    return output
...
```

```python
# TODO: need to replace this !!!
# TODO: need to review comments !!!

def decrypt_file(file_path):
    # open an encrypted file and decrypt it with a guessed key
    with open(file_path,'r') as fi:
        # beginning of the process
        # read all lines of the file and join them
        lines = fi.readlines()
        e_temp = ''.join(lines)

    characters_distribution = count_letter_occurrence(e_temp)
    print('')
    print('distribution of letters in encrypted text (case insensitive, from a to z)')
    print(characters_distribution)
    print('')

    guess_k = find_max_occurrence(characters_distribution)
    print(f'guessed k: {guess_k}')
    print('')

    print('decrypted text:')
    decrypted_text = shift_cipher_decrypt(e_temp, guess_k)
    print(decrypted_text)
```

```python
# TODO: need to replace this !!!
# TODO: need to review comments !!!

ORD_a = ord('a')  # 97
ORD_A = ord('A')  # 65, uppercase counterpart of ORD_a

def shift_cipher_decrypt(ciphertext, key):
    plaintext = ""
    for char in ciphertext:
        # only ASCII letters are shifted
        if char.isascii() and char.isalpha():
            # Determine ASCII offset based on lowercase or uppercase letter
            ascii_offset = ORD_a if char.islower() else ORD_A
            # Calculate the distance of the target character from a or A
            distance = ord(char) - ascii_offset
            # Reverse the shift by subtracting the key and taking modulo 26 to wrap around
            shifted_distance = (distance - key) % 26
            # Convert back to ASCII by adding the offset and get the corresponding character
            decrypted_char = chr(shifted_distance + ascii_offset)
            plaintext += decrypted_char
        else:
            # If it is not a letter (space, punctuation mark), retain as is.
            plaintext += char
    return plaintext
```

## progress
### section 1
- week 1
- week 2

### section 2
- week 3
- week 4
- week 5

## References
- [dwyl/english-words](https://github.com/dwyl/english-words)