update,

2025-02-01 02:02:14 +08:00
parent a767348238
commit c403fa8e72
48 changed files with 5987 additions and 0 deletions
--- a/1st_copy/NOTES.md
+++ b/1st_copy/NOTES.md
@@ -0,0 +1,114 @@
+### objective
+
+env: Windows 
+deadline 23/12
+
+
+### CAUTION
+Do not include any code not written by you in your
+project. You are NOT allowed to import any Python libraries in your solution except the
+modules namely os (https://docs.python.org/3/library/os.html), sys
+(https://docs.python.org/3/library/sys.html) and csv (https://docs.python.org/3/library/csv.html). If
+cheating is found or the import requirement is violated, you will receive a zero mark.
+
+
+### Deliverable
+You have to include your student name and ID in your source code and name your project solution as
+“XXXXXXXX_project.py” (where XXXXXXXX is your 8-digit student ID). Please remember to
+upload your source code solution to Moodle by the submission deadline.
+
+
+### drill down
+
+Functions
+Given the file of stock prices, you are asked to develop a Python program to process the data by
+designing appropriate functions. At minimum you need to implement and call the following three
+functions:
+• get_data_list(csv_file_name)
+This function has one parameter, namely csv_file_name. When the function is called, you
+need to pass along a CSV file name which is used inside the function to open and read the CSV
+file. After reading each row, it will be split into a list. The list will then be appended into a main
+list (a list of lists), namely data_list. The data_list will be returned at the end of the
+function.
+2
+• get_monthly_averages(data_list)
+This function has one parameter, namely data_list. You need to pass the data_list
+generated by the get_data_list() function as the argument to this function and then
+calculate the monthly average prices of the stock. The average monthly prices are calculated in
+the following way. Suppose the volume and adjusted closing price of a trading day are V1 and C1,
+respectively. The total sale of that day equals V1 x C1. Now, suppose the volume and adjusted
+closing price of another trading day are V2 and C2, respectively. The average of these two trading
+days is the sum of the total sales divided by the total volume:
+Average price = (V1 x C1 + V2 x C2) / (V1 + V2)
+To average a whole month, you need to add up the total sales (V1 x C1 + V2 x C2 + ... +
+Vn x Cn) for each day and divide it by the sum of all volumes (V1 + V2 + ... + Vn) where n
+is the number of trading days in the month.
+A tuple with 2 items, including the date (year and month only) and the average for that month,
+will be generated for each month. The tuple for each month will be appended to a main list,
+namely monthly_averages_list. The monthly_averages_list will be returned at
+the end of the function.
+• get_moving_averages(monthly_averages_list)
+This function has one parameter, namely monthly_averages_list. You need to pass the
+monthly_averages_list generated by get_monthly_averages() as the argument
+to this function and then calculate the 5-month exponential moving average (EMA) stock prices.
+In general, the EMA for a particular month can be calculated by the following formula:
+EMA = (Monthly average price – previous month’s EMA) x smoothing constant
+ previous month’s EMA
+where
+smoothing constant = 2 / (number of time periods in months + 1)
+3
+For example, the following table shows the stock prices between Oct 2020 and Apr 2021:
+Month Monthly Average Price
+Oct 2020 14
+Nov 2020 13
+Dec 2020 14
+Jan 2021 12
+Feb 2021 13
+Mar 2021 12
+Apr 2021 11
+The initial 5-month EMA for Feb 2021 can be calculated by the simple average formula, as
+shown below:
+5-month EMA for Feb 2021 = (14 + 13 + 14 + 12 + 13) / 5 = 13.2
+The 5-month EMA for Mar 2021 can be calculated by the EMA formula, as shown below:
+5-month EMA for Mar 2021 = (Monthly average price – previous month’s EMA) x
+smoothing constant + previous month’s EMA
+= (12 – 13.2) x (2 / 6) + 13.2
+= 12.8
+The 5-month EMA for Apr 2021 can be calculated by the EMA formula, as shown below:
+5-month EMA for Apr 2021 = (Monthly average price – previous month’s EMA) x
+smoothing constant + previous month’s EMA
+= (11 – 12.8) x (2 / 6) + 12.8
+= 12.2
+The resulting 5-month EMA stock prices are shown below:
+Month Average Price 5-month EMA Price
+Oct 2020 14 -
+Nov 2020 13 -
+Dec 2020 14 -
+Jan 2021 12 -
+Feb 2021 13 13.2
+Mar 2021 12 12.8
+Apr 2021 11 12.2
+4
+A tuple with 2 items, including the date (year and month only) and the 5-month EMA price for
+that month, will be generated for each month except the first 4 months. Each tuple will be
+appended to a main list, namely moving_averages_list. The
+moving_averages_list will be returned at the end of the function.
+
+
+Program Input and Output
+At the outset, your program needs to ask the user for a CSV file name:
+Based on the entered CSV file name, a corresponding output text file (e.g. “Google_output.txt”
+for this case) will be generated. In the output file, you are eventually required to print the best month
+(with the highest EMA price) and the worst month (with the lowest EMA price) for the stock. You
+need to first print a header line for the stock, and then print a date (MM-YYYY), a comma followed by
+a moving average price (in 2 decimal places) on another line. You must follow the output format as
+shown below (please note the values are not true, which are for reference only)
+
+
+IV. Evaluation Criteria (40% of Overall Course Assessment)
+The project will be graded using the following criteria:
+• 15% - Correctness of program execution and output data
+• 10% - Modularization (e.g. dividing the program functionality into different functions)
+• 5% - Error handling
+• 5% - Consistent style (e.g., capitalization, indenting, etc.)
+• 5% - Appropriate comments
--- a/(2022-23)_c370323f83a52a514752fe75fceffa43.pdf
+++ b/(2022-23)_c370323f83a52a514752fe75fceffa43.pdf
--- a/1st_copy/_ref/google.csv
+++ b/1st_copy/_ref/google.csv
--- a/1st_copy/_ref/steps.ods
+++ b/1st_copy/_ref/steps.ods
--- a/1st_copy/_ref/test.csv
+++ b/1st_copy/_ref/test.csv
@@ -0,0 +1,15 @@
+Date,Open,High,Low,Close,Adj Close,Volume
+2008-07-04,460,463.24,449.4,450.26,450.26,4848500
+2008-07-03,468.73,474.29,459.58,464.41,464.41,4314600
+2008-06-02,476.77,482.18,461.42,465.25,465.25,6111500
+2008-06-29,469.75,471.01,462.33,463.29,463.29,3848200
+2008-05-08,452.02,452.94,417.55,419.95,419.95,9017900
+2008-05-05,445.49,452.46,440.08,444.25,444.25,4534300
+2008-04-04,460,463.24,449.4,450.26,450.26,4848500
+2008-04-03,468.73,474.29,459.58,464.41,464.41,4314600
+2008-03-02,476.77,482.18,461.42,465.25,465.25,6111500
+2008-03-29,469.75,471.01,462.33,463.29,463.29,3848200
+2008-02-28,472.49,476.45,470.33,473.78,473.78,3029700
+2008-02-27,473.73,474.83,464.84,468.58,468.58,4387100
+2008-01-28,472.49,476.45,470.33,473.78,473.78,3029700
+2008-01-27,473.73,474.83,464.84,468.58,468.58,4387100
--- a/1st_copy/build.sh
+++ b/1st_copy/build.sh
@@ -0,0 +1,16 @@
+#!/usr/bin/env bash
+
+rm -rf _temp/*
+rm -rf delivery.zip
+
+mkdir -p _temp
+
+set -ex
+
+cp src/main.py _temp/XXXXXXXX_project.py
+
+pushd _temp
+  7za a -tzip ../delivery1.zip *
+popd
+
+rm -rf _temp
--- a/1st_copy/src/Pipfile
+++ b/1st_copy/src/Pipfile
@@ -0,0 +1,11 @@
+[[source]]
+url = "https://pypi.org/simple"
+verify_ssl = true
+name = "pypi"
+
+[packages]
+
+[dev-packages]
+
+[requires]
+python_version = "3.11"
--- a/1st_copy/src/main.py
+++ b/1st_copy/src/main.py
@@ -0,0 +1,243 @@
+#!/usr/bin/env python3
+
+# Do not include any code not written by you in your project. 
+# You are NOT allowed to import any Python libraries in your solution except the modules namely 
+  # - os (https://docs.python.org/3/library/os.html), 
+  # - sys (https://docs.python.org/3/library/sys.html) and 
+  # - csv (https://docs.python.org/3/library/csv.html). 
+# 
+# If cheating is found or the import requirement is violated, you will receive a zero mark.
+
+import os,sys, csv
+
+# column from csv file
+# COL_DATE: the day of trading
+# COL_OPEN: the stock price at the beginning of the trading day
+# COL_HIGH: the highest price the stock achieved on the trading day
+# COL_LOW: the lowest price the stock achieved on the trading day
+# COL_CLOSE: the stock price at the end of the trading day
+# COL_ADJ_Close: the adjusted closing price of the trading day (reflecting the stock’s value after accounting for any corporate actions like dividends, stock splits and new stock offerings)
+# COL_VOLUME: the total number of shares were traded on the trading day
+COL_DATE=0
+COL_OPEN=1
+COL_HIGH=2
+COL_LOW=3
+COL_CLOSE=4
+COL_ADJ_CLOSE=5
+COL_VOLUME=6
+
+# append at middle stage
+COL_TOTAL_SALE_OF_DAY=7
+COL_MONTH_ONLY=8
+COL_EMA=9
+
+# monthly_averages_list
+COL_MONTHLY_AVERAGE_PRICE=1
+COL_EMA=2
+
+
+# get_data_list(csv_file_name)
+# This function has one parameter, namely csv_file_name. 
+# When the function is called, you need to pass along a CSV file name which is used inside the function to open and read the CSV
+# file. 
+# After reading each row, it will be split into a list. The list will then be appended into a main
+# list (a list of lists), namely data_list. The data_list will be returned at the end of the
+# function.
+def get_data_list(csv_file_name):
+  '''read data list from csv file'''
+  data_list = []
+  try:
+    with open(csv_file_name, newline='') as csvfile:
+      temp = []
+      temp = csv.reader(csvfile, delimiter=',', quotechar='"')
+      data_list = list(temp)
+    
+    return data_list
+  except Exception as e:
+    print('error during reading csv file ')
+    print('exitting...')
+    sys.exit()
+
+
+# get_monthly_averages(data_list)
+# This function has one parameter, namely data_list. You need to pass the data_list
+# generated by the get_data_list() function as the argument to this function and then
+# calculate the monthly average prices of the stock. The average monthly prices are calculated in
+# the following way. 
+# 
+# 1. Suppose the volume and adjusted closing price of a trading day are V1 and C1, respectively. 
+# 2. The total sale of that day equals V1 x C1. 
+# 3. Now, suppose the volume and adjusted closing price of another trading day are V2 and C2, respectively. 
+# 4. The average of these two trading days is the sum of the total sales divided by the total volume:
+# 
+#                        Average price = (V1 x C1 + V2 x C2) / (V1 + V2)
+# 
+# To average a whole month, you need to 
+#   - add up the total sales (V1 x C1 + V2 x C2 + ... + Vn x Cn) for each day and 
+#   - divide it by the sum of all volumes (V1 + V2 + ... + Vn) where n is the number of trading days in the month.
+# A tuple with 2 items, including the date (year and month only) and the average for that month,
+# will be generated for each month. The tuple for each month will be appended to a main list,
+# namely monthly_averages_list. The monthly_averages_list will be returned at the end of the function.
+
+def get_monthly_averages(data_list):
+  '''calculate the monthly average prices of the stock'''
+
+  monthly_averages_list=[]
+  data_list_data_only = data_list[1:]
+  month_available = []
+  
+  # data cleaning
+  for i in range(len(data_list_data_only)):
+    # V1 x C1, calculate the total sale, append into column
+    data_list_data_only[i].append(float(data_list_data_only[i][COL_VOLUME]) * float(data_list_data_only[i][COL_ADJ_CLOSE]))
+
+    # mark the row by YYYY-MM for easy monthly sum calculation, COL_MONTH_ONLY
+    data_list_data_only[i].append(data_list_data_only[i][COL_DATE][0:7])
+
+  # get the month in the list YYYY-MM
+  month_available = set(list(map(lambda x: x[COL_MONTH_ONLY], data_list_data_only)))
+
+  # literate the whole list, calculate the total_sale and total volume
+  # get the average sale by total_sale / total_volume
+  for month in sorted(month_available):
+    filtered_month = list(filter(lambda x: x[COL_MONTH_ONLY] == month, data_list_data_only))
+    total_sale = sum(list( map(lambda x: x[COL_TOTAL_SALE_OF_DAY], filtered_month)))
+    total_volume = sum(list( map(lambda x: float(x[COL_VOLUME]), filtered_month)))
+    monthly_averages_list.append([month, total_sale/total_volume])
+
+  return list(monthly_averages_list)
+
+# get_moving_averages(monthly_averages_list)
+# This function has one parameter, namely monthly_averages_list. You need to pass the
+# monthly_averages_list generated by get_monthly_averages() as the argument
+# to this function and then calculate the 5-month exponential moving average (EMA) stock prices.
+# In general, the EMA for a particular month can be calculated by the following formula:
+# 
+#     EMA = (Monthly average price – previous month’s EMA) x smoothing constant + previous month’s EMA
+# 
+# where
+# 
+#     smoothing constant = 2 / (number of time periods in months + 1)
+# 
+# Initial SMA = 20-period sum / 20
+# Multiplier = (2 / (Time periods + 1) ) = (2 / (20 + 1) ) = 0.0952(9.52%)
+# EMA = {Close – EMA(previous day)} x multiplier + EMA(previous day).
+def get_moving_averages(monthly_averages_list):
+  '''
+    get moving averages from montyly_average_list
+    input:
+    [ [YYYY-MM, monthly average price],
+      [YYYY-MM, monthly average price],
+      ...]
+
+    output: 
+    [ [YYYY-MM, monthly average price, EMA],
+      [YYYY-MM, monthly average price, EMA],
+      ...]
+  '''
+
+  # by ref, the first 5 month EMA were given by SMA
+  monthly_averages_list[0].append(sum(map(lambda x: x[1], monthly_averages_list[0:5]))/5)
+  monthly_averages_list[1].append(sum(map(lambda x: x[1], monthly_averages_list[0:5]))/5)
+  monthly_averages_list[2].append(sum(map(lambda x: x[1], monthly_averages_list[0:5]))/5)
+  monthly_averages_list[3].append(sum(map(lambda x: x[1], monthly_averages_list[0:5]))/5)
+  monthly_averages_list[4].append(sum(map(lambda x: x[1], monthly_averages_list[0:5]))/5)
+
+  # smoothing constant = 2 / (number of time periods in months + 1)
+  smoothing_constant = 2 / (5 + 1)
+
+  # main loop to calculate EMA, start from the 6th month available till the end of the list
+  for i in range(5, len(monthly_averages_list)):
+    previous_month_EMA = monthly_averages_list[i-1][2]
+    Monthly_average_price = monthly_averages_list[i][1]
+
+    EMA = (Monthly_average_price - previous_month_EMA) * smoothing_constant + previous_month_EMA
+    monthly_averages_list[i].append(EMA)
+
+  return monthly_averages_list
+
+# Based on the entered CSV file name, a corresponding output text file (e.g. “Google_output.txt” for this case)  will be generated. 
+# In the output file, you are eventually required to print:
+#   - the best month (with the highest EMA price) and 
+#   - the worst month (with the lowest EMA price) for the stock. 
+# You need to first print a header line for the stock, and then print a date (MM-YYYY),
+# a comma followed by a moving average price (in 2 decimal places) on another line. 
+def format_date_string(yyyy_mm):
+  '''rearrange date string from csv file YYYY-MM => MM-YYYY'''
+  [yyyy, mm] = yyyy_mm.split('-')
+  return '-'.join([mm, yyyy])
+
+def write_output_file(filename_to_write, monthly_averages_list_w_ema, report_name):
+  '''get output string from template and write to output file
+  input:
+    filename_to_write: txt file name with path to be written to
+    monthly_averages_list_w_ema: list provided with EMA
+    report_name: report name to be written to report
+  '''
+
+  RESULT_TEMPLATE='''
+# The best month for ^report_name^:
+# ^best_month^, ^best_EMA^
+
+# The worst month for ^report_name^:
+# ^worst_month^, ^worst_EMA^
+  '''.strip()
+
+  # get the max EMA of the list
+  best_EMA = max(map(lambda x: x[2], monthly_averages_list_w_ema[5:]))
+  # get the month(s) by the EMA wanted
+  best_months = list(map(lambda x: format_date_string(x[0]), filter(lambda x: x[2] == best_EMA, monthly_averages_list_w_ema[5:])))
+
+  # get the min(worst) EMA of the list
+  worst_EMA = min(map(lambda x: x[2], monthly_averages_list_w_ema[5:]))
+  # get the month(s) by the EMA wanted
+  worst_months = list(map(lambda x: format_date_string(x[0]), filter(lambda x: x[2] == worst_EMA, monthly_averages_list_w_ema[5:])))
+
+  # assemble the output string
+  result_string = RESULT_TEMPLATE
+  result_string = result_string\
+    .replace('^best_month^', ','.join(best_months))\
+    .replace('^best_EMA^', str('%.2f' % best_EMA))\
+    .replace('^worst_month^', ','.join(worst_months))\
+    .replace('^worst_EMA^', str('%.2f' % worst_EMA)) \
+    .replace('^report_name^', report_name) 
+
+  # write output file
+  with open(filename_to_write, 'w+') as file_write:
+    file_write.truncate(0)
+    file_write.writelines(result_string)
+
+def main():
+    # Main function starts here
+
+    print('start')
+
+    # gather csv file with path from user
+    input_filename = input("Please input a csv filename: ")
+    
+    csv_filename = os.path.basename(input_filename)
+    csv_path = os.path.dirname(input_filename)
+
+    # transform to the output file path by csv file name got
+    txt_filename = csv_filename.replace('.csv','_output.txt')
+    if (csv_path != ''):
+      txt_filename = '/'.join([csv_path, txt_filename])
+    else:
+      # by default keep into current directory
+      txt_filename = '/'.join(['.', txt_filename])
+    
+    # grep the corp_name from the filename google.csv => google
+    corp_name = os.path.basename(input_filename).split('.')[0]
+
+    # process the data_list by csv file as stateed in assignment
+    print(f'processing {csv_filename}')
+    csv_list=get_data_list(input_filename)
+    monthly_averages_list = get_monthly_averages(csv_list)
+    monthly_averages_list_w_EMA = get_moving_averages(monthly_averages_list)
+
+    # write output file
+    write_output_file(txt_filename, monthly_averages_list_w_EMA, corp_name)
+    print('wrote to {file} done'.format(file = txt_filename))
+
+if __name__ == "__main__":
+    main()
--- a/1st_copy/src/test.sh
+++ b/1st_copy/src/test.sh
@@ -0,0 +1,6 @@
+#!/usr/bin/env bash
+
+set -ex
+
+
+python3 ./main.py