update,

2025-01-31 19:28:21 +08:00
parent ce9a4aa9b3
commit 72bacdd6b5
168 changed files with 939668 additions and 0 deletions
--- a/banson_hker/phase1-fix/doc/report.md
+++ b/banson_hker/phase1-fix/doc/report.md
@@ -0,0 +1,216 @@
+# Report:
+
+# describe the components involved
+
+
+## Purpose / Goals:
+
+Implement a shift cipher decrypter in Python.
+The user can decrypt a sufficiently long message (more than 200 words) that was originally `shift cipher encrypted`.
+
+For demonstration purposes, the other half, "encryption", is also implemented. It is the inverse of decryption.
+
+
+## Assumption / Requirement:
+
+Assume a use case in which the message to be encrypted:
+
+    - contains only upper case letters, space characters and punctuation marks
+    - keeps its space characters unchanged during encryption
+    - keeps its punctuation marks unchanged during encryption
+    - is longer than 200 words, with a letter distribution that follows the general pattern of English text
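
The assumptions above can be sketched as a small encryption routine. This is an illustration only: the name `shift_cipher_encrypt` and its signature are assumptions chosen to mirror the `shift_cipher_decrypt` function documented in this report, not the project's actual code.

```python
ORD_A = ord('A')  # 65
ORD_a = ord('a')  # 97


def shift_cipher_encrypt(plaintext, key):
    """Shift every ASCII letter forward by `key`; leave everything else unchanged.

    Hypothetical helper written for this example.
    """
    ciphertext = []
    for char in plaintext:
        if char.isascii() and char.isalpha():
            offset = ORD_a if char.islower() else ORD_A
            # distance from 'a'/'A', shifted by key, wrapped around the alphabet
            ciphertext.append(chr((ord(char) - offset + key) % 26 + offset))
        else:
            # space characters and punctuation marks stay as they are
            ciphertext.append(char)
    return ''.join(ciphertext)


print(shift_cipher_encrypt('HELLO, WORLD!', 8))  # PMTTW, EWZTL!
```

The comma, the space and the exclamation mark pass through untouched, which is what lets word boundaries survive encryption.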
+
+
+## Procedure / Terminology:
+
+Given that the `encrypted` message is long enough,
+a shift cipher decrypter can guess the message without knowing the key k.
+In English text the most frequent letter is usually ‘E’. Because every letter of the message is shifted by the same k, the most frequent letter of the `encrypted` message is expected to be ‘E’ shifted by k.
+
+
+## data types / variable declaration and initialization (data type used)
+
+### INTEGER
+
+e.g.
+
+```python
+ORD_a = ord('a') # 97
+```
+
+`ORD_a` is a variable storing the integer `97`, the code point of the letter `a`.
+
+### STRING AND ARRAYS
+
+```python
+...    
+    with open(file_path,'r') as fi:
+        # beginning of the process
+        # read file and join the lines all
+        lines = fi.readlines()
+        e_temp = ''.join(lines)
+...
+```
+
+Here:
+- `lines` is a list of strings, one per line of the file.
+- `e_temp` is a single string holding the whole file content (line breaks included).
+
+
+## data collection, input and validation/ data processing
+
+### User input:
+
+The program is operated from a console menu.
+
+<place a menu screen capture here>
+
+<place a menu screen capture here>
+
+press `1` to start an encryption (encrypt file)
+    - enter the key you want (as an integer)
+
+<place a menu screen capture here>
+
+press `2` to start a decryption (decrypt file)
+
+<place a menu screen capture here>
+
+press `q` to quit
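
One way to validate the menu input described above is sketched here. The helper names `menu_action`, `parse_key` and `main`, the prompt texts, and the choice to reduce the key modulo 26 are all assumptions made for this example; the report does not show the real menu code.

```python
def menu_action(choice):
    """Map a menu choice to an action name; None means the choice is invalid."""
    actions = {'1': 'encrypt', '2': 'decrypt', 'q': 'quit'}
    return actions.get(choice.strip().lower())


def parse_key(raw):
    """Return the key as an int in 0..25, or None if `raw` is not a whole number."""
    try:
        return int(raw.strip()) % 26
    except ValueError:
        return None


def main():
    """Hypothetical console loop; the real actions are replaced by print calls."""
    while True:
        action = menu_action(input('1) encrypt file  2) decrypt file  q) quit: '))
        if action == 'quit':
            break
        if action is None:
            print('Unknown choice, please enter 1, 2 or q.')
        elif action == 'encrypt':
            key = parse_key(input('key (integer): '))
            print('Invalid key.' if key is None else f'encrypting with k = {key}')
        else:
            print('decrypting with a guessed key')
```

Calling `main()` starts the loop. Keeping the validation in small pure functions makes it possible to check them without typing into the console.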
+
+
+## modularity/ reusability/ portability
+
+(TODO: need to capture from text book)
+
+### Terminology
+
+- decryption algorithm (k guessing)
+
+### Background:
+
+The letter `e` has the highest occurrence in everyday English. `Shift cipher encryption` shifts each letter by k positions, where k is an unknown integer. Assuming the same k is used for the whole message (or k follows a regular pattern that is already known), the letter frequencies carry over to the `encrypted` message (for example, with k = 8, `e` becomes `m`). That is why k can be guessed by finding the most frequent letter in the `encrypted` message and assuming that it stands for the letter `e` in the original message.
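
As a worked illustration of this reasoning, the sketch below guesses k for a tiny sample. It uses `collections.Counter` from the standard library; `guess_key` is a hypothetical helper, and the sample text was constructed for this example so that `e` dominates (a real message would need to be much longer).

```python
from collections import Counter


def guess_key(ciphertext):
    """Guess k by assuming the most frequent letter stands for 'e'."""
    letters = [c.lower() for c in ciphertext if c.isascii() and c.isalpha()]
    if not letters:
        raise ValueError('ciphertext contains no letters')
    most_common_letter, _ = Counter(letters).most_common(1)[0]
    # distance from 'e' to the most frequent letter, wrapped into 0..25
    return (ord(most_common_letter) - ord('e')) % 26


# 'MEET ME HERE' with every letter shifted by k = 8
sample = 'UMMB UM PMZM'
print(guess_key(sample))  # 8
```

Here `m` occurs five times, and the distance from `e` (index 4) to `m` (index 12) is 8.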
+
+### Implementation
+
+```python
+# TODO: replace this listing and review the comments
+
+ORD_a = ord('a')  # 97
+
+def find_max_occurrence(char_occurrences):
+    # guess k as the distance from 'e' to the most frequent letter
+    # (case-insensitive), e.g. if 'm' is the most frequent, k = 12 - 4 = 8
+
+    # index (0 = a ... 25 = z) of the most frequent letter
+    max_idx = char_occurrences.index(max(char_occurrences))
+
+    # subtract the index of 'e' (4); the result can be negative,
+    # which the modulo 26 in shift_cipher_decrypt wraps around
+    return max_idx - 4
+
+def count_letter_occurrence(txt_in):
+    # 'e' is statistically the most frequent letter in English text.
+    # A shift cipher moves every letter by the same amount, so the letter
+    # that replaces 'e' is likely to be the most frequent one in the
+    # encrypted text.
+    output = [0] * 26    # one bucket per letter, a to z
+
+    for char in txt_in:
+        if char.isalpha():
+            output[ord(char.lower()) - ORD_a] += 1
+
+    # output holds the number of occurrences of each letter
+    return output
+
+...
+
+```
+
+<diagram showing for loop >
+
+
+```python
+# TODO: replace this listing and review the comments
+
+def decrypt_file(file_path):
+    # will open an encrypted file and decrypt it by a guessed key
+    
+    with open(file_path,'r') as fi:
+        # beginning of the process
+        # read file and join the lines all
+        lines = fi.readlines()
+        e_temp = ''.join(lines)
+
+        characters_distribution = count_letter_occurrence(e_temp)
+
+        print('')
+        print('distribution of letters in encrypted text (case insensitive, from a to z)')
+        print(characters_distribution)
+
+        print('')
+        guess_k = find_max_occurrence(characters_distribution)
+        print(f'guessed k: {guess_k}')
+
+        print('')
+        print('decrypted text:')
+        decrypted_text = shift_cipher_decrypt(e_temp, guess_k)
+        print(decrypted_text)
+```
+
+<diagram showing for loop >
+
+```python
+# TODO: replace this listing and review the comments
+
+
+def shift_cipher_decrypt(ciphertext, key):
+    plaintext = ""
+    
+    for char in ciphertext:
+        if char.isalpha():
+            ascii_offset = ORD_a if char.islower() else ORD_A  # Determine ASCII offset based on lowercase or uppercase letter
+            
+            # Calculate the distance of the target character from a or A
+            distance = ord(char) - ascii_offset
+
+            # Reverse the shift by subtracting the key and taking modulo 26 to wrap around 
+            shifted_distance = (distance - key) % 26
+
+            # Convert back to ASCII by adding the offset and get the corresponding character
+            decrypted_char = chr(shifted_distance + ascii_offset)
+
+            plaintext += decrypted_char
+        else:
+            # If it is not an alphabetic character, retain as is.
+            plaintext += char
+    
+    return plaintext
+```
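
One detail of the listing above is worth checking: `find_max_occurrence` can return a negative key (for example -4 when `a` is the most frequent letter), and the decryption still works because Python's `%` returns a non-negative result for a positive modulus. The helper `shift_letter_back` below is a hypothetical single-letter version of the same arithmetic, written only to demonstrate this.

```python
def shift_letter_back(char, key):
    """Undo a shift of `key` on one ASCII letter (hypothetical helper)."""
    offset = ord('a') if char.islower() else ord('A')
    return chr((ord(char) - offset - key) % 26 + offset)


print(shift_letter_back('m', 8))    # e
print(shift_letter_back('A', 3))    # X, wraps around the start of the alphabet
print(shift_letter_back('a', -4))   # e, a negative key behaves like key 22
print(shift_letter_back('a', 22))   # e
```

So a guessed key of -4 and a key of 22 decrypt to the same text, and no extra normalisation of the guess is needed.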
+
+<diagram showing for loop >
+
+
+## progress
+
+### section 1
+    - week 1 
+    - week 2 
+    
+### section 2
+    - week 3 
+    - week 4 
+    - week 5 
+
+
+## References
+    - [dwyl/english-words](https://github.com/dwyl/english-words)
+### section 1
+    - [blablabal](http://www.google.com)
+    - [blablabal](http://www.google.com)
+
+### section 2
+    - [blablabal](http://www.google.com)
+    - [blablabal](http://www.google.com)
+    - [blablabal](http://www.google.com)