This commit is contained in:
louiscklaw
2025-01-31 19:28:21 +08:00
parent ce9a4aa9b3
commit 72bacdd6b5
168 changed files with 939668 additions and 0 deletions


@@ -0,0 +1,216 @@
# Report:
# describe the components involved
## Purpose / Goals:
Implement a shift cipher decrypter in Python.
The user can decrypt a sufficiently long message (more than 200 words) that was originally encrypted with a shift cipher.
For demonstration purposes, the other half, encryption (the inverse of decryption), is also implemented.
## Assumption / Requirement:
Assume a use case in which the message to be encrypted:
- contains only upper-case letters, space characters, and punctuation marks
- keeps its space characters unchanged during encryption
- keeps its punctuation marks unchanged during encryption
- is longer than 200 words, so that its letter distribution follows the general pattern of English text
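
The assumptions above can be illustrated with a minimal encryption sketch. The function name `shift_cipher_encrypt` and the sample message are assumptions made for illustration and are not taken from the project source; the sketch only shows that upper-case letters are shifted while spaces and punctuation marks pass through unchanged.

```python
ORD_A = ord('A')  # 65


def shift_cipher_encrypt(plaintext, key):
    """Hypothetical helper: shift upper-case letters forward by `key`."""
    ciphertext = ""
    for char in plaintext:
        if 'A' <= char <= 'Z':
            # Distance from 'A', shifted forward and wrapped around the alphabet
            shifted = (ord(char) - ORD_A + key) % 26
            ciphertext += chr(shifted + ORD_A)
        else:
            # Spaces and punctuation marks are kept as they are
            ciphertext += char
    return ciphertext


sample = shift_cipher_encrypt("HELLO, WORLD!", 3)
print(sample)  # KHOOR, ZRUOG!
```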
## Procedure / Terminology:
Assume that the encrypted message is long enough.
A shift cipher decrypter can then guess the message without knowing the key k.
In typical English text the most frequent letter is E. Because every letter of the message is shifted by the same amount, the most frequent letter in the encrypted message is most likely the shifted E.
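
Written as formulas, the procedure could look like the following. The numbering A = 0, ..., Z = 25 and the symbols $E_k$, $D_k$, $\hat{k}$ and $y_{\max}$ are notation assumed here for illustration; $y_{\max}$ is the index of the most frequent letter in the encrypted message and 4 is the index of E.

$$
E_k(x) = (x + k) \bmod 26, \qquad D_k(y) = (y - k) \bmod 26
$$

$$
\hat{k} = (y_{\max} - 4) \bmod 26
$$

The guess $\hat{k}$ equals the real key only when the most frequent plaintext letter really is E, which is why a long message is required.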
## data types / variable declaration and initialization (data type used)
### INTEGER
For example:
```python
ORD_a = ord('a')  # 97
ORD_A = ord('A')  # 65
```
`ORD_a` is a variable that stores the integer `97`; `ORD_A` stores `65` and is the offset used for upper-case letters.
### STRING AND ARRAYS
```python
...
with open(file_path,'r') as fi:
    # beginning of the process
    # read the file and join all lines into one string
    lines = fi.readlines()
    e_temp = ''.join(lines)
...
```
Here:
- `lines` is a list of strings, one per line of the file.
- `e_temp` is a single string holding the whole file content.
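
The difference between the two variables can be seen in a small, self-contained sketch. `io.StringIO` stands in for an opened text file here, and the sample content is an assumption made for illustration.

```python
import io

# io.StringIO behaves like an opened text file (assumption for this demo)
fi = io.StringIO("KHOOR,\nZRUOG!\n")

lines = fi.readlines()    # list of strings, one per line, newlines kept
e_temp = ''.join(lines)   # one string holding the whole content

print(lines)         # ['KHOOR,\n', 'ZRUOG!\n']
print(repr(e_temp))  # 'KHOOR,\nZRUOG!\n'
```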
## data collection, input and validation/ data processing
### User input:
console
<place a menu screen capture here>
press `1` to start an encryption (encrypt file)
- enter the key you want (as a number)
<place a menu screen capture here>
press `2` to start a decryption (decrypt file)
<place a menu screen capture here>
press `q` to quit
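
A minimal sketch of how the menu input and the key could be validated is given here. The helper names `handle_choice` and `parse_key` are hypothetical and are not taken from the project source; reducing the key modulo 26 is also an assumption of this sketch.

```python
def handle_choice(choice):
    """Hypothetical dispatcher: map a menu key to an action name."""
    choice = choice.strip().lower()
    if choice == '1':
        return 'encrypt'
    if choice == '2':
        return 'decrypt'
    if choice == 'q':
        return 'quit'
    return 'invalid'


def parse_key(text):
    """Hypothetical validator: accept only an integer key, reduced modulo 26."""
    try:
        return int(text) % 26
    except ValueError:
        # Anything that is not an integer is rejected
        return None


print(handle_choice('1'), parse_key('29'))  # encrypt 3
```

Returning `'invalid'` or `None` instead of raising lets the menu loop ask the user again.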
## modularity/ reusability/ portability
(TODO: need to capture from text book)
### Terminology
- decryption algorithm (k guessing)
### Background:
The letter `e` is the most frequent letter in everyday English. Shift cipher encryption shifts every letter by k positions, where k is an unknown integer. As long as the same k is used for the whole message (or k follows a regular pattern that is already known), the letter frequencies carry over to the encrypted message (for example, with k = 8, `e` becomes `m`). The key k can therefore be guessed by finding the most frequent letter in the encrypted message and assuming that it stands for `e` in the original message.
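
The `e` to `m` case can be worked through with a short sketch. The ciphertext is an assumed sample ("MEET ME HERE" shifted by 8), far shorter than the 200 words a real message would need, and `collections.Counter` is used here only for brevity.

```python
from collections import Counter

# Assumed sample: "MEET ME HERE" shifted by 8
ciphertext = "UMMB UM PMZM"

# Count letters only, ignoring case, spaces and punctuation
letters = [c.lower() for c in ciphertext if c.isalpha()]
most_common_letter, count = Counter(letters).most_common(1)[0]

# Distance from e to the most frequent letter, wrapped into 0..25
guess_k = (ord(most_common_letter) - ord('e')) % 26

print(most_common_letter, count, guess_k)  # m 5 8
```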
### Implementation
```python
# TODO: need to replace this !!!
# TODO: need to review comments !!!
def find_max_occurrence(char_occurrences):
    # Guess k as the distance from the letter e (case-insensitive) to the
    # most frequent letter, e.g. m.
    # find the index of the highest count
    max_idx = char_occurrences.index(max(char_occurrences))
    # subtract the index of e (4); modulo 26 keeps k within 0..25
    return (max_idx - 4) % 26


def count_letter_occurrence(txt_in):
    # By statistics, e is the most frequent letter in English text.
    # A shift cipher moves every letter by the same amount, so the shifted e
    # is likely to be the most frequent letter in the encrypted text too.
    output = [0] * 26  # bucket for 26 letters
    for char in txt_in:
        # only ASCII letters fit into the 26 buckets
        if char.isascii() and char.isalpha():
            output[ord(char.lower()) - ORD_a] += 1
    # output holds the count of each letter, from a to z
    return output
...
```
<diagram showing for loop >
```python
# TODO: need to replace this !!!
# TODO: need to review comments !!!
def decrypt_file(file_path):
    # open an encrypted file and decrypt it with a guessed key
    with open(file_path,'r') as fi:
        # read the file and join all lines into one string
        lines = fi.readlines()
        e_temp = ''.join(lines)

    characters_distribution = count_letter_occurrence(e_temp)
    print('')
    print('distribution of letters in encrypted text (case insensitive, from a to z)')
    print(characters_distribution)
    print('')
    guess_k = find_max_occurrence(characters_distribution)
    print(f'guessed k: {guess_k}')
    print('')
    print('decrypted text:')
    decrypted_text = shift_cipher_decrypt(e_temp, guess_k)
    print(decrypted_text)
```
<diagram showing for loop >
```python
# TODO: need to replace this !!!
# TODO: need to review comments !!!
def shift_cipher_decrypt(ciphertext, key):
    plaintext = ""
    for char in ciphertext:
        # only ASCII letters are shifted
        if char.isascii() and char.isalpha():
            # Determine the ASCII offset for a lower-case or upper-case letter
            ascii_offset = ORD_a if char.islower() else ORD_A
            # Calculate the distance of the target character from a or A
            distance = ord(char) - ascii_offset
            # Reverse the shift by subtracting the key and taking modulo 26 to wrap around
            shifted_distance = (distance - key) % 26
            # Convert back to ASCII by adding the offset and get the corresponding character
            decrypted_char = chr(shifted_distance + ascii_offset)
            plaintext += decrypted_char
        else:
            # If it is not an alphabetic character, retain as is.
            plaintext += char
    return plaintext
```
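
The wrap-around step in `shift_cipher_decrypt` relies on Python's `%` operator returning a non-negative result for a positive modulus. A small worked example, with the letter `B` and the key `3` chosen arbitrarily for illustration:

```python
ORD_A = ord('A')  # 65

distance = ord('B') - ORD_A   # 1
key = 3
raw = distance - key          # -2, outside the 0..25 range
wrapped = raw % 26            # 24, because Python's % is non-negative here
decrypted_char = chr(wrapped + ORD_A)

print(raw, wrapped, decrypted_char)  # -2 24 Y
```

In a language where the remainder keeps the sign of the dividend, the same expression would need an extra `+ 26` before taking the modulo.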
<diagram showing for loop >
## progress
### section 1
- week 1
- week 2
### section 2
- week 3
- week 4
- week 5
## References
- [dwyl/english-words](https://github.com/dwyl/english-words)