update,

2025-02-01 02:02:14 +08:00
parent a767348238
commit c403fa8e72
48 changed files with 5987 additions and 0 deletions
--- a/2nd_copy/NOTES.md
+++ b/2nd_copy/NOTES.md
@@ -0,0 +1,112 @@
+### objective
+
+env: Windows
+deadline 23/12
+
+### CAUTION
+
+Do not include any code not written by you in your
+project. You are NOT allowed to import any Python libraries in your solution except the
+modules namely os (https://docs.python.org/3/library/os.html), sys
+(https://docs.python.org/3/library/sys.html) and csv (https://docs.python.org/3/library/csv.html). If
+cheating is found or the import requirement is violated, you will receive a zero mark.
+
+### Deliverable
+
+You have to include your student name and ID in your source code and name your project solution as
+“XXXXXXXX_project.py” (where XXXXXXXX is your 8-digit student ID). Please remember to
+upload your source code solution to Moodle by the submission deadline.
+
+### drill down
+
+Functions
+Given the file of stock prices, you are asked to develop a Python program to process the data by
+designing appropriate functions. At minimum you need to implement and call the following three
+functions:
+• get_data_list(csv_file_name)
+This function has one parameter, namely csv_file_name. When the function is called, you
+need to pass along a CSV file name which is used inside the function to open and read the CSV
+file. After reading each row, it will be split into a list. The list will then be appended into a main
+list (a list of lists), namely data_list. The data_list will be returned at the end of the
+function.
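A minimal sketch of the reading step described above, using only the allowed `csv` module. Whether the header row should be kept in `data_list` is not stated in the brief; this sketch keeps it and leaves skipping to the caller, which is an assumption.

```python
import csv


def get_data_list(csv_file_name):
    """Read the CSV file and return its rows as a list of lists."""
    data_list = []
    with open(csv_file_name, newline='') as csv_file:
        for row in csv.reader(csv_file):
            # csv.reader already splits each row into a list of strings
            data_list.append(row)
    return data_list
```

Every value comes back as a string, so prices and volumes still need a `float()` conversion before any arithmetic.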
+• get_monthly_averages(data_list)
+This function has one parameter, namely data_list. You need to pass the data_list
+generated by the get_data_list() function as the argument to this function and then
+calculate the monthly average prices of the stock. The average monthly prices are calculated in
+the following way. Suppose the volume and adjusted closing price of a trading day are V1 and C1,
+respectively. The total sale of that day equals V1 x C1. Now, suppose the volume and adjusted
+closing price of another trading day are V2 and C2, respectively. The average of these two trading
+days is the sum of the total sales divided by the total volume:
+Average price = (V1 x C1 + V2 x C2) / (V1 + V2)
+To average a whole month, you need to add up the total sales (V1 x C1 + V2 x C2 + ... +
+Vn x Cn) for each day and divide it by the sum of all volumes (V1 + V2 + ... + Vn) where n
+is the number of trading days in the month.
+A tuple with 2 items, including the date (year and month only) and the average for that month,
+will be generated for each month. The tuple for each month will be appended to a main list,
+namely monthly_averages_list. The monthly_averages_list will be returned at
+the end of the function.
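A rough sketch of the volume-weighted monthly average, returning the list of 2-item tuples the description asks for. It assumes the header row has already been removed, that dates look like `YYYY-MM-DD`, and that the adjusted close and volume sit in columns 5 and 6 as in the sample CSV header; none of these are guaranteed by the brief.

```python
def get_monthly_averages(data_list):
    """Return [(YYYY-MM, volume-weighted average price), ...] sorted by month."""
    ADJ_CLOSE, VOLUME = 5, 6  # column positions assumed from the CSV header
    totals = {}  # month -> [sum of V x C, sum of V]
    for row in data_list:
        month = row[0][:7]  # 'YYYY-MM-DD' -> 'YYYY-MM'
        volume = float(row[VOLUME])
        adj_close = float(row[ADJ_CLOSE])
        if month not in totals:
            totals[month] = [0.0, 0.0]
        totals[month][0] += volume * adj_close
        totals[month][1] += volume

    monthly_averages_list = []
    for month in sorted(totals):
        total_sale, total_volume = totals[month]
        monthly_averages_list.append((month, total_sale / total_volume))
    return monthly_averages_list
```

For two trading days with V1 = 10, C1 = 2 and V2 = 30, C2 = 4, the average is (20 + 120) / 40 = 3.5. A month whose total volume is zero would raise `ZeroDivisionError` here; how to treat that case is left open.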
+• get_moving_averages(monthly_averages_list)
+This function has one parameter, namely monthly_averages_list. You need to pass the
+monthly_averages_list generated by get_monthly_averages() as the argument
+to this function and then calculate the 5-month exponential moving average (EMA) stock prices.
+In general, the EMA for a particular month can be calculated by the following formula:
+EMA = (Monthly average price – previous month’s EMA) x smoothing constant + previous month’s EMA
+  where
+  smoothing constant = 2 / (number of time periods in months + 1)
+  3
+  For example, the following table shows the stock prices between Oct 2020 and Apr 2021:
  | Month    | Monthly Average Price |
  | -------- | --------------------- |
  | Oct 2020 | 14                    |
  | Nov 2020 | 13                    |
  | Dec 2020 | 14                    |
  | Jan 2021 | 12                    |
  | Feb 2021 | 13                    |
  | Mar 2021 | 12                    |
  | Apr 2021 | 11                    |
+  The initial 5-month EMA for Feb 2021 can be calculated by the simple average formula, as
+  shown below:
+  5-month EMA for Feb 2021 = (14 + 13 + 14 + 12 + 13) / 5 = 13.2
+  The 5-month EMA for Mar 2021 can be calculated by the EMA formula, as shown below:
+  5-month EMA for Mar 2021 = (Monthly average price – previous month’s EMA) x
+  smoothing constant + previous month’s EMA
+  = (12 – 13.2) x (2 / 6) + 13.2
+  = 12.8
+  The 5-month EMA for Apr 2021 can be calculated by the EMA formula, as shown below:
+  5-month EMA for Apr 2021 = (Monthly average price – previous month’s EMA) x
+  smoothing constant + previous month’s EMA
+  = (11 – 12.8) x (2 / 6) + 12.8
+  = 12.2
+  The resulting 5-month EMA stock prices are shown below:
  | Month    | Average Price | 5-month EMA Price |
  | -------- | ------------- | ----------------- |
  | Oct 2020 | 14            | -                 |
  | Nov 2020 | 13            | -                 |
  | Dec 2020 | 14            | -                 |
  | Jan 2021 | 12            | -                 |
  | Feb 2021 | 13            | 13.2              |
  | Mar 2021 | 12            | 12.8              |
  | Apr 2021 | 11            | 12.2              |
+  4
+  A tuple with 2 items, including the date (year and month only) and the 5-month EMA price for
+  that month, will be generated for each month except the first 4 months. Each tuple will be
+  appended to a main list, namely moving_averages_list. The
+  moving_averages_list will be returned at the end of the function.
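The worked example above can be sketched as follows. The `periods` argument is a hypothetical addition for illustration (the brief specifies a single parameter; the default of 5 keeps the one-argument call working), and returning an empty list when fewer than 5 months are available is also an assumption.

```python
def get_moving_averages(monthly_averages_list, periods=5):
    """Return [(month, EMA), ...], starting at the `periods`-th month."""
    moving_averages_list = []
    if len(monthly_averages_list) < periods:
        return moving_averages_list  # not enough months to seed the EMA

    smoothing_constant = 2 / (periods + 1)

    # Seed: simple average of the first `periods` monthly prices.
    seed_prices = [price for _, price in monthly_averages_list[:periods]]
    ema = sum(seed_prices) / periods
    moving_averages_list.append((monthly_averages_list[periods - 1][0], ema))

    # Every later month applies the EMA formula to the previous month's EMA.
    for month, price in monthly_averages_list[periods:]:
        ema = (price - ema) * smoothing_constant + ema
        moving_averages_list.append((month, ema))
    return moving_averages_list
```

Fed the seven monthly prices from the table (14, 13, 14, 12, 13, 12, 11), this yields three tuples, for Feb, Mar and Apr 2021, with EMA values of approximately 13.2, 12.8 and 12.2.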
+
+Program Input and Output
+At the outset, your program needs to ask the user for a CSV file name:
+Based on the entered CSV file name, a corresponding output text file (e.g. “Google_output.txt”
+for this case) will be generated. In the output file, you are eventually required to print the best month
+(with the highest EMA price) and the worst month (with the lowest EMA price) for the stock. You
+need to first print a header line for the stock, and then print a date (MM-YYYY), a comma followed by
+a moving average price (in 2 decimal places) on another line. You must follow the output format as
+shown below (please note the values are not true, which are for reference only)
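The sample output itself did not survive in these notes, so the following is only a sketch of the formatting rules stated in the paragraph (MM-YYYY date, a comma, price with 2 decimal places). The header wording is borrowed from `main.py` in this commit and is an assumption, as are the helper names `format_month_line` and `build_report`.

```python
def format_month_line(month, ema):
    """Turn ('YYYY-MM', ema) into 'MM-YYYY, ema' with 2 decimal places."""
    year, month_number = month.split('-')
    return '{}-{}, {:.2f}'.format(month_number, year, ema)


def build_report(stock_name, moving_averages_list):
    """Build the best/worst month report as one string."""
    best = max(moving_averages_list, key=lambda item: item[1])
    worst = min(moving_averages_list, key=lambda item: item[1])
    lines = [
        'The best month for {}:'.format(stock_name),  # header wording assumed
        format_month_line(*best),
        'The worst month for {}:'.format(stock_name),  # header wording assumed
        format_month_line(*worst),
    ]
    return '\n'.join(lines)
```

Using `'{:.2f}'` rather than `round()` keeps trailing zeros, so 13.2 is written as `13.20`.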
+
+IV. Evaluation Criteria (40% of Overall Course Assessment)
+The project will be graded using the following criteria:
+• 15% - Correctness of program execution and output data
+• 10% - Modularization (e.g. dividing the program functionality into different functions)
+• 5% - Error handling
+• 5% - Consistent style (e.g., capitalization, indenting, etc.)
+• 5% - Appropriate comments
--- a/(2022-23)_c370323f83a52a514752fe75fceffa43.pdf
+++ b/(2022-23)_c370323f83a52a514752fe75fceffa43.pdf
--- a/2nd_copy/_ref/google.csv
+++ b/2nd_copy/_ref/google.csv
--- a/2nd_copy/_ref/steps.ods
+++ b/2nd_copy/_ref/steps.ods
--- a/2nd_copy/_ref/test.csv
+++ b/2nd_copy/_ref/test.csv
@@ -0,0 +1,15 @@
+Date,Open,High,Low,Close,Adj Close,Volume
+2008-07-04,460,463.24,449.4,450.26,450.26,4848500
+2008-07-03,468.73,474.29,459.58,464.41,464.41,4314600
+2008-06-02,476.77,482.18,461.42,465.25,465.25,6111500
+2008-06-29,469.75,471.01,462.33,463.29,463.29,3848200
+2008-05-08,452.02,452.94,417.55,419.95,419.95,9017900
+2008-05-05,445.49,452.46,440.08,444.25,444.25,4534300
+2008-04-04,460,463.24,449.4,450.26,450.26,4848500
+2008-04-03,468.73,474.29,459.58,464.41,464.41,4314600
+2008-03-02,476.77,482.18,461.42,465.25,465.25,6111500
+2008-03-29,469.75,471.01,462.33,463.29,463.29,3848200
+2008-02-28,472.49,476.45,470.33,473.78,473.78,3029700
+2008-02-27,473.73,474.83,464.84,468.58,468.58,4387100
+2008-01-28,472.49,476.45,470.33,473.78,473.78,3029700
+2008-01-27,473.73,474.83,464.84,468.58,468.58,4387100
--- a/2nd_copy/build.sh
+++ b/2nd_copy/build.sh
@@ -0,0 +1,16 @@
+#!/usr/bin/env bash
+
+rm -rf _temp/*
+rm -rf delivery2.zip
+
+mkdir -p _temp
+
+set -ex
+
+cp src/main.py _temp/XXXXXXXX_project.py
+
+pushd _temp
+  7za a -tzip ../delivery2.zip *
+popd
+
+rm -rf _temp
--- a/2nd_copy/src/Pipfile
+++ b/2nd_copy/src/Pipfile
@@ -0,0 +1,11 @@
+[[source]]
+url = "https://pypi.org/simple"
+verify_ssl = true
+name = "pypi"
+
+[packages]
+
+[dev-packages]
+
+[requires]
+python_version = "3.11"
--- a/2nd_copy/src/main.py
+++ b/2nd_copy/src/main.py
@@ -0,0 +1,262 @@
+# Objective:
+# This script aims to analyze the historical prices of a stock
+import os
+import sys
+import csv
+
+# define error constant
+CSV_FILE_NOT_FOUND='csv_file_not_found'
+
+# column assignment by CSV definition
+[ C_DATE, 
+  C_OPEN, 
+  C_HIGH, 
+  C_LOW, 
+  C_CLOSE, 
+  C_ADJ_CLOSE, 
+  C_VOLUME, 
+  C_MONTH_AVG_PRICE,
+  C_EMA
+  ] = list(range(0,8+1))
+
+# NOTE: get_data_list(csv_file_name)
+# NOTE: This function has one parameter, namely csv_file_name. 
+# NOTE: When the function is called, you need to pass along a CSV file name which is used inside the function to open and read the CSV
+# NOTE: file. 
+# NOTE: After reading each row, it will be split into a list. The list will then be appended into a main
+# NOTE: list (a list of lists), namely data_list. The data_list will be returned at the end of the
+# NOTE: function.
+
+# NOTE: a missing file is handled by the outer try/except structure
+def clean_data(data_list):
+  """sort rows by date and convert the numeric columns to float"""
+
+  out_list = []
+  for data in sorted(data_list):
+    out_list.append([
+      data[C_DATE],
+      float(data[C_OPEN]),
+      float(data[C_HIGH]),
+      float(data[C_LOW]),
+      float(data[C_CLOSE]),
+      float(data[C_ADJ_CLOSE]),
+      float(data[C_VOLUME]),
+    ])
+  return out_list
+
+def get_data_list(csv_file_name):
+  '''parse the csv file and expand it into a list of rows'''
+
+  data_list = []
+  with open(csv_file_name, newline='') as f_csv:
+    data_list = list(csv.reader(f_csv, delimiter=',', quotechar='"'))
+  
+  # NOTE: skip the very first row as it holds the column names
+  # NOTE: convert the columns accordingly
+  return clean_data(data_list[1:])
+
+# NOTE: get_monthly_averages(data_list)
+# NOTE: This function has one parameter, namely data_list. You need to pass the data_list
+# NOTE: generated by the get_data_list() function as the argument to this function and then
+# NOTE: calculate the monthly average prices of the stock. The average monthly prices are calculated in
+# NOTE: the following way. 
+# NOTE: 
+# NOTE: 1. Suppose the volume and adjusted closing price of a trading day are V1 and C1, respectively. 
+# NOTE: 2. The total sale of that day equals V1 x C1. 
+# NOTE: 3. Now, suppose the volume and adjusted closing price of another trading day are V2 and C2, respectively. 
+# NOTE: 4. The average of these two trading days is the sum of the total sales divided by the total volume:
+# NOTE: 
+# NOTE:                        Average price = (V1 x C1 + V2 x C2) / (V1 + V2)
+# NOTE: 
+# NOTE: To average a whole month, you need to 
+# NOTE:   - add up the total sales (V1 x C1 + V2 x C2 + ... + Vn x Cn) for each day and 
+# NOTE:   - divide it by the sum of all volumes (V1 + V2 + ... + Vn) where n is the number of trading days in the month.
+# NOTE: A tuple with 2 items, including the date (year and month only) and the average for that month,
+# NOTE: will be generated for each month. The tuple for each month will be appended to a main list,
+# NOTE: namely monthly_averages_list. The monthly_averages_list will be returned at the end of the function.
+
+def get_available_month(data_list):
+  '''get the unique month from the list
+  input: 
+    data_list
+  '''
+  return sorted(set([data[0][0:7] for data in data_list]))
+
+def get_monthly_averages(data_list):
+  '''get the average price by month
+  input: 
+    data_list
+  '''
+  month_in_list = get_available_month(data_list)
+  month_average_price = {}
+  monthly_averages_list = data_list
+
+  # get total volume by month
+  for month in month_in_list:
+    filtered_month_transaction = list(filter(lambda row: row[C_DATE][0:7] == month, monthly_averages_list))
+
+    # NOTE: (V1 x C1 + V2 x C2 ...)
+    sum_total_sale_by_month = sum(map(lambda row: row[C_VOLUME] * row[C_ADJ_CLOSE], filtered_month_transaction))
+
+    # NOTE: (V1 + V2 ...)
+    sum_volume_by_month = sum(map(lambda t: t[C_VOLUME], filtered_month_transaction))
+
+    # NOTE: Average price = (V1 x C1 + V2 x C2 ...) / (V1 + V2 ... )
+    month_average_price[month] = sum_total_sale_by_month/sum_volume_by_month
+
+  # NOTE: append to main list -> C_MONTH_AVG_PRICE
+  for data in monthly_averages_list:
+    data.append(month_average_price[data[C_DATE][0:7]])
+
+  return monthly_averages_list
+
+# NOTE: get_moving_averages(monthly_averages_list)
+# NOTE: This function has one parameter, namely monthly_averages_list. You need to pass the
+# NOTE: monthly_averages_list generated by get_monthly_averages() as the argument
+# NOTE: to this function and then calculate the 5-month exponential moving average (EMA) stock prices.
+# NOTE: In general, the EMA for a particular month can be calculated by the following formula:
+# NOTE: 
+# NOTE:     EMA = (Monthly average price – previous month’s EMA) x smoothing constant + previous month’s EMA
+# NOTE: 
+# NOTE: where
+# NOTE: 
+# NOTE:     smoothing constant = 2 / (number of time periods in months + 1)
+# NOTE: 
+# NOTE: Initial SMA = 20-period sum / 20
+# NOTE: Multiplier = (2 / (Time periods + 1) ) = (2 / (20 + 1) ) = 0.0952(9.52%)
+# NOTE: EMA = {Close – EMA(previous day)} x multiplier + EMA(previous day).
+def get_monthly_average(data_list, month_wanted):
+  '''
+  get monthly average from the list
+  input: 
+    data_list: data_list
+    month_wanted: YYYY-MM
+  '''
+  return list(filter(lambda d: d[C_DATE][0:7] == month_wanted, data_list) )[0][C_MONTH_AVG_PRICE]
+
+
+def get_SMA(data_list, month_to_get_SMA):
+  '''calculate SMA from the beginning(oldest) of the list
+  input:
+    data_list: data_list
+    month_to_get_SMA : list of months (YYYY-MM) used to initialize the SMA (i.e. the first 5)
+  '''
+  sum_of_months = 0
+  
+  for month in month_to_get_SMA:
+    sum_of_months = sum_of_months + get_monthly_average(data_list, month)
+
+  return sum_of_months / len(month_to_get_SMA)
+
+def get_extreme_EMA(ema_list, max_min= 'min', skip_month=0):
+  '''get max/min EMA from the list
+  input:
+    ema_list: month list with ema
+    max_min: max / min selector (default: min)
+    skip_month: number of leading months to skip (they only hold the SMA seed)
+  '''
+  if (max_min == 'max'):
+    return max(map(lambda r: r[2], ema_list[skip_month:]))
+
+  return min(map(lambda r: r[2], ema_list[skip_month:]))
+
+def get_month_by_EMA(ema_list, ema_value):
+  '''get the month(s) whose EMA equals the wanted value
+  input:
+    ema_list: month list with ema
+    ema_value: ema value to select the month (i.e. max EMA)
+  '''
+  return list(map(lambda r: r[0], filter(lambda x: x[2] == ema_value, ema_list)))
+
+def get_output_content(max_ema, min_ema, max_ema_months, min_ema_months, report_name=""):
+  '''get the output content, return with a formatted string
+  input:
+    max_ema: max ema to report
+    min_ema: min ema to report
+    max_ema_months: month(s) to report with max ema
+    min_ema_months: month(s) to report with min ema
+    report_name: stock name shown in the header lines
+  '''
+  # reformat to MM-YYYY before writing to file
+  reformat_max_ema_months = list(map(lambda m: m.split('-')[1]+'-'+m.split('-')[0] , max_ema_months))
+  reformat_min_ema_months = list(map(lambda m: m.split('-')[1]+'-'+m.split('-')[0] , min_ema_months))
+
+  return '''
+# The best month for {report_name}:
+# {best_ema_months}, {best_EMA}
+
+# The worst month for {report_name}:
+# {worst_ema_months}, {worst_EMA}
+  '''.format(
+    best_ema_months=','.join(reformat_max_ema_months), 
+    best_EMA='{:.2f}'.format(max_ema),
+    worst_ema_months=','.join(reformat_min_ema_months),
+    worst_EMA='{:.2f}'.format(min_ema),
+    report_name=report_name).strip()
+  
+
+def get_moving_averages(monthly_averages_list):
+  '''get moving averages
+  input:
+    monthly_averages_list
+  '''
+  month_available = get_available_month(monthly_averages_list)
+  # NOTE: months 0 to 4 are seeded with the SMA of the first 5 monthly averages
+  monthly_averages_list_w_EMA = [[c, get_monthly_average(monthly_averages_list, c)] for c in month_available]
+  initial_SMA = sum(map(lambda x: x[1], monthly_averages_list_w_EMA[0:5]))/5
+  
+  smoothing_constant = 2 / (5 + 1)
+
+  for i in range(0,len(monthly_averages_list_w_EMA)):
+    if (i < 5):
+      # the first 5 months are seeded with the SMA
+      monthly_averages_list_w_EMA[i].append( initial_SMA)
+
+    else:
+      month_average_this_month = monthly_averages_list_w_EMA[i][1]
+      EMA_last_month = monthly_averages_list_w_EMA[i-1][2]
+      EMA_this_month = (month_average_this_month - EMA_last_month) * smoothing_constant + EMA_last_month
+
+      monthly_averages_list_w_EMA[i].append( EMA_this_month  )
+
+  return monthly_averages_list_w_EMA
+
+# get input from user
+csv_filepath = input("Please input a csv filename: ")
+
+try:
+  # NOTE: get csv file from user
+  csv_filename = csv_filepath
+  txt_filename = csv_filename.split('.csv')[0]+'_output.txt'
+  report_name = os.path.basename(csv_filename).replace('.csv','')
+  
+  # NOTE: process file
+  data_list = get_data_list(csv_filename)
+  monthly_average_list = get_monthly_averages(data_list)
+  ema_list = get_moving_averages(monthly_average_list)
+
+  # NOTE: output txt file
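+  # NOTE: the 5th month (index 4) holds the first valid EMA, so only the first 4 months are skipped
+  max_ema = get_extreme_EMA(ema_list, 'max', 4)
+  min_ema = get_extreme_EMA(ema_list, 'min', 4)
+  best_ema_months = get_month_by_EMA(ema_list[4:], max_ema)
+  worst_ema_months = get_month_by_EMA(ema_list[4:], min_ema)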
+
+  output_string = get_output_content(max_ema, min_ema, best_ema_months, worst_ema_months, report_name)
+
+  with open(txt_filename, 'w') as f_output:
+    f_output.write(output_string)
+
+  print('output written to '+txt_filename)
+  print('done !')
+
+except IsADirectoryError as e:
+  # NOTE: if input is a directory, drop here
+  print('sorry the path is a directory')
+
+except FileNotFoundError as e:
+  # NOTE: if csv file not found, drop here
+  print('sorry cannot find the file wanted')
+
+except Exception:
+  # NOTE: re-raise any exception not handled above
+  raise
--- a/2nd_copy/src/test.sh
+++ b/2nd_copy/src/test.sh
@@ -0,0 +1,6 @@
+#!/usr/bin/env bash
+
+set -ex
+
+clear
+python3 ./main.py