update,
112
2nd_copy/NOTES.md
Normal file
@@ -0,0 +1,112 @@
### objective

env: Windows
deadline 23/12

### CAUTION

Do not include any code not written by you in your
project. You are NOT allowed to import any Python libraries in your solution except the
following modules: os (https://docs.python.org/3/library/os.html), sys
(https://docs.python.org/3/library/sys.html) and csv (https://docs.python.org/3/library/csv.html). If
cheating is found or the import requirement is violated, you will receive a zero mark.

### Deliverable

You have to include your student name and ID in your source code and name your project solution as
“XXXXXXXX_project.py” (where XXXXXXXX is your 8-digit student ID). Please remember to
upload your source code solution to Moodle by the submission deadline.

### drill down

Functions
Given the file of stock prices, you are asked to develop a Python program to process the data by
designing appropriate functions. At minimum you need to implement and call the following three
functions:
• get_data_list(csv_file_name)
This function has one parameter, namely csv_file_name. When the function is called, you
need to pass along a CSV file name which is used inside the function to open and read the CSV
file. After reading each row, it will be split into a list. The list will then be appended into a main
list (a list of lists), namely data_list. The data_list will be returned at the end of the
function.
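
A minimal sketch of the reading step described above, using only the permitted `csv` module. Whether the header row should be kept or dropped is not stated in the description, so keeping it here is an assumption.

```python
import csv


def get_data_list(csv_file_name):
    """Read a CSV file and return its rows as a list of lists.

    Assumption: the header row is returned as the first row;
    the caller decides whether to skip it.
    """
    data_list = []
    # newline='' lets the csv module handle line endings itself
    with open(csv_file_name, newline='') as csv_file:
        for row in csv.reader(csv_file):
            # each row is already split into a list of strings
            data_list.append(row)
    return data_list
```

All values come back as strings, so the price and volume columns still need converting to numbers before any arithmetic.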
• get_monthly_averages(data_list)
This function has one parameter, namely data_list. You need to pass the data_list
generated by the get_data_list() function as the argument to this function and then
calculate the monthly average prices of the stock. The average monthly prices are calculated in
the following way. Suppose the volume and adjusted closing price of a trading day are V1 and C1,
respectively. The total sale of that day equals V1 x C1. Now, suppose the volume and adjusted
closing price of another trading day are V2 and C2, respectively. The average of these two trading
days is the sum of the total sales divided by the total volume:
Average price = (V1 x C1 + V2 x C2) / (V1 + V2)
To average a whole month, you need to add up the total sales (V1 x C1 + V2 x C2 + ... +
Vn x Cn) for each day and divide it by the sum of all volumes (V1 + V2 + ... + Vn) where n
is the number of trading days in the month.
A tuple with 2 items, including the date (year and month only) and the average for that month,
will be generated for each month. The tuple for each month will be appended to a main list,
namely monthly_averages_list. The monthly_averages_list will be returned at
the end of the function.
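
The volume-weighted average above can be sketched as follows. This is one possible shape, not the required one: it assumes the header row has already been removed, that dates are ISO strings such as `2008-07-04`, and that `Adj Close` and `Volume` sit at column indexes 5 and 6 as in the reference CSV.

```python
def get_monthly_averages(data_list):
    """Return [(YYYY-MM, volume-weighted average price), ...].

    Months appear in the order they are first seen in data_list.
    """
    totals = {}  # month -> [sum of V x C, sum of V]
    for row in data_list:
        month = row[0][:7]         # assumption: ISO date, keep 'YYYY-MM'
        adj_close = float(row[5])  # assumption: 'Adj Close' is column 5
        volume = float(row[6])     # assumption: 'Volume' is column 6
        if month not in totals:
            totals[month] = [0.0, 0.0]
        totals[month][0] += volume * adj_close
        totals[month][1] += volume
    # total sales divided by total volume, one tuple per month
    return [(month, sales / volume)
            for month, (sales, volume) in totals.items()]
```

Accumulating both sums in a single pass avoids filtering the whole list once per month.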
• get_moving_averages(monthly_averages_list)
This function has one parameter, namely monthly_averages_list. You need to pass the
monthly_averages_list generated by get_monthly_averages() as the argument
to this function and then calculate the 5-month exponential moving average (EMA) stock prices.
In general, the EMA for a particular month can be calculated by the following formula:
EMA = (Monthly average price – previous month’s EMA) x smoothing constant
      + previous month’s EMA
where
smoothing constant = 2 / (number of time periods in months + 1)
For example, the following table shows the stock prices between Oct 2020 and Apr 2021:

| Month    | Monthly Average Price |
|----------|-----------------------|
| Oct 2020 | 14                    |
| Nov 2020 | 13                    |
| Dec 2020 | 14                    |
| Jan 2021 | 12                    |
| Feb 2021 | 13                    |
| Mar 2021 | 12                    |
| Apr 2021 | 11                    |

The initial 5-month EMA for Feb 2021 can be calculated by the simple average formula, as
shown below:
5-month EMA for Feb 2021 = (14 + 13 + 14 + 12 + 13) / 5 = 13.2
The 5-month EMA for Mar 2021 can be calculated by the EMA formula, as shown below:
5-month EMA for Mar 2021 = (Monthly average price – previous month’s EMA) x
smoothing constant + previous month’s EMA
= (12 – 13.2) x (2 / 6) + 13.2
= 12.8
The 5-month EMA for Apr 2021 can be calculated by the EMA formula, as shown below:
5-month EMA for Apr 2021 = (Monthly average price – previous month’s EMA) x
smoothing constant + previous month’s EMA
= (11 – 12.8) x (2 / 6) + 12.8
= 12.2
The resulting 5-month EMA stock prices are shown below:

| Month    | Average Price | 5-month EMA Price |
|----------|---------------|-------------------|
| Oct 2020 | 14            | -                 |
| Nov 2020 | 13            | -                 |
| Dec 2020 | 14            | -                 |
| Jan 2021 | 12            | -                 |
| Feb 2021 | 13            | 13.2              |
| Mar 2021 | 12            | 12.8              |
| Apr 2021 | 11            | 12.2              |

A tuple with 2 items, including the date (year and month only) and the 5-month EMA price for
that month, will be generated for each month except the first 4 months. Each tuple will be
appended to a main list, namely moving_averages_list. The
moving_averages_list will be returned at the end of the function.
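
The worked example above can be reproduced with a short sketch. The `periods` parameter and the `YYYY-MM` month strings are assumptions made for illustration; the description only fixes the 5-month window and the formula.

```python
def get_moving_averages(monthly_averages_list, periods=5):
    """Return [(month, EMA), ...], skipping the first periods - 1 months."""
    if len(monthly_averages_list) < periods:
        return []  # not enough months to seed the EMA

    smoothing_constant = 2 / (periods + 1)

    # the first EMA is the simple average of the first `periods` months
    seed_prices = [price for _, price in monthly_averages_list[:periods]]
    ema = sum(seed_prices) / periods
    moving_averages_list = [(monthly_averages_list[periods - 1][0], ema)]

    # every later month applies the EMA formula to the previous EMA
    for month, price in monthly_averages_list[periods:]:
        ema = (price - ema) * smoothing_constant + ema
        moving_averages_list.append((month, ema))
    return moving_averages_list


example = [('2020-10', 14), ('2020-11', 13), ('2020-12', 14), ('2021-01', 12),
           ('2021-02', 13), ('2021-03', 12), ('2021-04', 11)]
for month, ema in get_moving_averages(example):
    print(month, round(ema, 2))
```

With the table values this prints 13.2 for `2021-02`, 12.8 for `2021-03` and 12.2 for `2021-04`, matching the hand calculation.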

Program Input and Output
At the outset, your program needs to ask the user for a CSV file name:
Based on the entered CSV file name, a corresponding output text file (e.g. “Google_output.txt”
for this case) will be generated. In the output file, you are eventually required to print the best month
(with the highest EMA price) and the worst month (with the lowest EMA price) for the stock. You
need to first print a header line for the stock, and then print a date (MM-YYYY), a comma followed by
a moving average price (to 2 decimal places) on another line. You must follow the output format as
shown below (please note that the values are not real and are for reference only)
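
The formatting rules in this paragraph (MM-YYYY date, a comma, a price with 2 decimal places) could be sketched as follows. This is not the reference output: the helper names `format_month_line` and `pick_best_and_worst` are hypothetical, the `YYYY-MM` input format and the space after the comma are assumptions, and the wording of the header line is left out because it is not given here.

```python
def format_month_line(year_month, ema_price):
    """Format ('2021-02', 13.2) as '02-2021, 13.20'."""
    year, month = year_month.split('-')  # assumption: input is 'YYYY-MM'
    # {:.2f} always prints exactly 2 decimal places
    return '{}-{}, {:.2f}'.format(month, year, ema_price)


def pick_best_and_worst(moving_averages_list):
    """Return the (month, EMA) tuples with the highest and lowest EMA."""
    best = max(moving_averages_list, key=lambda item: item[1])
    worst = min(moving_averages_list, key=lambda item: item[1])
    return best, worst


best, worst = pick_best_and_worst(
    [('2021-02', 13.2), ('2021-03', 12.8), ('2021-04', 12.2)])
print(format_month_line(*best))
print(format_month_line(*worst))
```

Using a format specification rather than `round()` keeps trailing zeros, so 13.2 is written as 13.20.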

IV. Evaluation Criteria (40% of Overall Course Assessment)
The project will be graded using the following criteria:
• 15% - Correctness of program execution and output data
• 10% - Modularization (e.g. dividing the program functionality into different functions)
• 5% - Error handling
• 5% - Consistent style (e.g., capitalization, indenting, etc.)
• 5% - Appropriate comments
Binary file not shown.
1031
2nd_copy/_ref/google.csv
Normal file
File diff suppressed because it is too large
BIN
2nd_copy/_ref/steps.ods
Normal file
Binary file not shown.
15
2nd_copy/_ref/test.csv
Normal file
@@ -0,0 +1,15 @@
Date,Open,High,Low,Close,Adj Close,Volume
2008-07-04,460,463.24,449.4,450.26,450.26,4848500
2008-07-03,468.73,474.29,459.58,464.41,464.41,4314600
2008-06-02,476.77,482.18,461.42,465.25,465.25,6111500
2008-06-29,469.75,471.01,462.33,463.29,463.29,3848200
2008-05-08,452.02,452.94,417.55,419.95,419.95,9017900
2008-05-05,445.49,452.46,440.08,444.25,444.25,4534300
2008-04-04,460,463.24,449.4,450.26,450.26,4848500
2008-04-03,468.73,474.29,459.58,464.41,464.41,4314600
2008-03-02,476.77,482.18,461.42,465.25,465.25,6111500
2008-03-29,469.75,471.01,462.33,463.29,463.29,3848200
2008-02-28,472.49,476.45,470.33,473.78,473.78,3029700
2008-02-27,473.73,474.83,464.84,468.58,468.58,4387100
2008-01-28,472.49,476.45,470.33,473.78,473.78,3029700
2008-01-27,473.73,474.83,464.84,468.58,468.58,4387100
16
2nd_copy/build.sh
Normal file
@@ -0,0 +1,16 @@
#!/usr/bin/env bash

rm -rf _temp/*
rm -rf delivery2.zip

mkdir -p _temp

set -ex

cp src/main.py _temp/XXXXXXXX_project.py

pushd _temp
7za a -tzip ../delivery2.zip *
popd

rm -rf _temp
11
2nd_copy/src/Pipfile
Normal file
@@ -0,0 +1,11 @@
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]

[dev-packages]

[requires]
python_version = "3.11"
262
2nd_copy/src/main.py
Normal file
@@ -0,0 +1,262 @@
# Objective:
# This script aims to analyze the historical prices of a stock
import os
import sys
import csv

# define error constant
CSV_FILE_NOT_FOUND='csv_file_not_found'

# column assignment by CSV definition
[ C_DATE,
  C_OPEN,
  C_HIGH,
  C_LOW,
  C_CLOSE,
  C_ADJ_CLOSE,
  C_VOLUME,
  C_MONTH_AVG_PRICE,
  C_EMA
] = list(range(0,8+1))

# NOTE: get_data_list(csv_file_name)
# NOTE: This function has one parameter, namely csv_file_name.
# NOTE: When the function is called, you need to pass along a CSV file name which is used inside the function to open and read the CSV
# NOTE: file.
# NOTE: After reading each row, it will be split into a list. The list will then be appended into a main
# NOTE: list (a list of lists), namely data_list. The data_list will be returned at the end of the
# NOTE: function.

# NOTE: a missing file is handled by the outer try/except structure
def clean_data(data_list):
    """sort the rows (date column first) and convert the numeric columns to float"""

    out_list = []
    for data in sorted(data_list):
        out_list.append([
            data[C_DATE],
            float(data[C_OPEN]),
            float(data[C_HIGH]),
            float(data[C_LOW]),
            float(data[C_CLOSE]),
            float(data[C_ADJ_CLOSE]),
            float(data[C_VOLUME]),
        ])
    return out_list

def get_data_list(csv_file_name):
    '''parse the csv file into a list of rows (list of lists)'''

    data_list = []
    with open(csv_file_name, newline='') as f_csv:
        data_list = list(csv.reader(f_csv, delimiter=',', quotechar='"'))

    # NOTE: skip the very first row as it holds the column names
    # NOTE: convert the columns accordingly
    return clean_data(data_list[1:])

# NOTE: get_monthly_averages(data_list)
# NOTE: This function has one parameter, namely data_list. You need to pass the data_list
# NOTE: generated by the get_data_list() function as the argument to this function and then
# NOTE: calculate the monthly average prices of the stock. The average monthly prices are calculated in
# NOTE: the following way.
# NOTE:
# NOTE: 1. Suppose the volume and adjusted closing price of a trading day are V1 and C1, respectively.
# NOTE: 2. The total sale of that day equals V1 x C1.
# NOTE: 3. Now, suppose the volume and adjusted closing price of another trading day are V2 and C2, respectively.
# NOTE: 4. The average of these two trading days is the sum of the total sales divided by the total volume:
# NOTE:
# NOTE: Average price = (V1 x C1 + V2 x C2) / (V1 + V2)
# NOTE:
# NOTE: To average a whole month, you need to
# NOTE: - add up the total sales (V1 x C1 + V2 x C2 + ... + Vn x Cn) for each day and
# NOTE: - divide it by the sum of all volumes (V1 + V2 + ... + Vn) where n is the number of trading days in the month.
# NOTE: A tuple with 2 items, including the date (year and month only) and the average for that month,
# NOTE: will be generated for each month. The tuple for each month will be appended to a main list,
# NOTE: namely monthly_averages_list. The monthly_averages_list will be returned at the end of the function.

def get_available_month(data_list):
    '''get the unique months (YYYY-MM) from the list, sorted ascending
    input:
    data_list
    '''
    return sorted(set([data[C_DATE][0:7] for data in data_list]))

def get_monthly_averages(data_list):
    '''get the average price by month
    input:
    data_list
    '''
    month_in_list = get_available_month(data_list)
    month_average_price = {}
    # NOTE: this is an alias, not a copy: the rows of data_list are extended in place below
    monthly_averages_list = data_list

    # get the volume-weighted average price by month
    for month in month_in_list:
        filtered_month_transaction = list(filter(lambda row: row[C_DATE][0:7] == month, monthly_averages_list))

        # NOTE: (V1 x C1 + V2 x C2 ...)
        sum_total_sale_by_month = sum(map(lambda row: row[C_VOLUME] * row[C_ADJ_CLOSE], filtered_month_transaction))

        # NOTE: (V1 + V2 ...)
        sum_volume_by_month = sum(map(lambda t: t[C_VOLUME], filtered_month_transaction))

        # NOTE: Average price = (V1 x C1 + V2 x C2 ...) / (V1 + V2 ... )
        month_average_price[month] = sum_total_sale_by_month/sum_volume_by_month

    # NOTE: append to main list -> C_MONTH_AVG_PRICE
    for data in monthly_averages_list:
        data.append(month_average_price[data[C_DATE][0:7]])

    return monthly_averages_list

# NOTE: get_moving_averages(monthly_averages_list)
# NOTE: This function has one parameter, namely monthly_averages_list. You need to pass the
# NOTE: monthly_averages_list generated by get_monthly_averages() as the argument
# NOTE: to this function and then calculate the 5-month exponential moving average (EMA) stock prices.
# NOTE: In general, the EMA for a particular month can be calculated by the following formula:
# NOTE:
# NOTE: EMA = (Monthly average price – previous month’s EMA) x smoothing constant + previous month’s EMA
# NOTE:
# NOTE: where
# NOTE:
# NOTE: smoothing constant = 2 / (number of time periods in months + 1)
# NOTE:
# NOTE: general reference (20-period example, not the 5-month window used here):
# NOTE: Initial SMA = 20-period sum / 20
# NOTE: Multiplier = (2 / (Time periods + 1) ) = (2 / (20 + 1) ) = 0.0952(9.52%)
# NOTE: EMA = {Close – EMA(previous day)} x multiplier + EMA(previous day).
def get_monthly_average(data_list, month_wanted):
    '''
    get monthly average from the list
    input:
    data_list: data_list
    month_wanted: YYYY-MM
    '''
    return list(filter(lambda d: d[C_DATE][0:7] == month_wanted, data_list) )[0][C_MONTH_AVG_PRICE]


def get_SMA(data_list, month_to_get_SMA):
    '''calculate SMA from the beginning(oldest) of the list
    input:
    data_list: data_list
    month_to_get_SMA : list of months (YYYY-MM) used to initialize the SMA (i.e. the first 5)
    '''
    sum_of_months = 0

    for month in month_to_get_SMA:
        sum_of_months = sum_of_months + get_monthly_average(data_list, month)

    return sum_of_months / len(month_to_get_SMA)

def get_extreme_EMA(ema_list, max_min= 'min', skip_month=0):
    '''get max/min EMA from the list
    input:
    ema_list: month list with ema, rows of [month, monthly average, ema]
    max_min: max / min selector (default: min)
    skip_month: number of leading months to skip (rows that only hold the initial SMA as a placeholder)
    '''
    if (max_min == 'max'):
        return max(map(lambda r: r[2], ema_list[skip_month:]))

    return min(map(lambda r: r[2], ema_list[skip_month:]))

def get_month_by_EMA(ema_list, ema_value):
    '''get the month(s) whose EMA equals the wanted value
    input:
    ema_list: month list with ema
    ema_value: ema value to select the month (i.e. max EMA)
    '''
    return list(map(lambda r: r[0], filter(lambda x: x[2] == ema_value, ema_list)))

def get_output_content(max_ema, min_ema, max_ema_months, min_ema_months, report_name=""):
    '''get the output content, return with a formatted string
    input:
    max_ema: max ema to report
    min_ema: min ema to report
    max_ema_months: month(s) to report with max ema
    min_ema_months: month(s) to report with min ema
    report_name: stock name shown in the header lines
    '''
    # reformat YYYY-MM to MM-YYYY before writing to file
    reformat_max_ema_months = list(map(lambda m: m.split('-')[1]+'-'+m.split('-')[0] , max_ema_months))
    reformat_min_ema_months = list(map(lambda m: m.split('-')[1]+'-'+m.split('-')[0] , min_ema_months))

    # NOTE: {:.2f} always prints 2 decimal places (round() would drop trailing zeros)
    return '''
# The best month for {report_name}:
# {best_ema_months}, {best_EMA:.2f}

# The worst month for {report_name}:
# {worst_ema_months}, {worst_EMA:.2f}
'''.format(
        best_ema_months=','.join(reformat_max_ema_months),
        best_EMA=max_ema,
        worst_ema_months=','.join(reformat_min_ema_months),
        worst_EMA=min_ema,
        report_name=report_name).strip()


def get_moving_averages(monthly_averages_list):
    '''get moving averages
    input:
    monthly_averages_list
    '''
    month_available = get_available_month(monthly_averages_list)
    # NOTE: build [month, monthly average] rows, then seed the EMA with the SMA of the first 5 months
    monthly_averages_list_w_EMA = [[c, get_monthly_average(monthly_averages_list, c)] for c in month_available]
    initial_SMA = sum(map(lambda x: x[1], monthly_averages_list_w_EMA[0:5]))/5

    smoothing_constant = 2 / (5 + 1)

    for i in range(0,len(monthly_averages_list_w_EMA)):
        if (i < 5):
            # months 0-3 only carry the SMA as a placeholder; for month 4 it is the initial EMA
            monthly_averages_list_w_EMA[i].append( initial_SMA)

        else:
            month_average_this_month = monthly_averages_list_w_EMA[i][1]
            EMA_last_month = monthly_averages_list_w_EMA[i-1][2]
            EMA_this_month = (month_average_this_month - EMA_last_month) * smoothing_constant + EMA_last_month

            monthly_averages_list_w_EMA[i].append( EMA_this_month )

    return monthly_averages_list_w_EMA

# get input from user
csv_filepath = input("Please input a csv filename: ")

try:
    # NOTE: get csv file from user
    csv_filename = csv_filepath
    txt_filename = csv_filename.split('.csv')[0]+'_output.txt'
    report_name = os.path.basename(csv_filename).replace('.csv','')

    # NOTE: process file
    data_list = get_data_list(csv_filename)
    monthly_average_list = get_monthly_averages(data_list)
    ema_list = get_moving_averages(monthly_average_list)

    # NOTE: output txt file
    # NOTE: only the first 4 months have no EMA (the 5th month's EMA is the initial SMA), so skip 4
    max_ema = get_extreme_EMA(ema_list, 'max', 4)
    min_ema = get_extreme_EMA(ema_list, 'min', 4)
    best_ema_months = get_month_by_EMA(ema_list[4:], max_ema)
    worst_ema_months = get_month_by_EMA(ema_list[4:], min_ema)

    output_string = get_output_content(max_ema, min_ema, best_ema_months, worst_ema_months, report_name)

    # NOTE: mode 'w' already truncates an existing file
    with open(txt_filename, 'w') as f_output:
        f_output.write(output_string)

    print('output written to '+txt_filename)
    print('done !')

except IsADirectoryError as e:
    # NOTE: if input is a directory, drop here
    print('sorry the path is a directory')

except FileNotFoundError as e:
    # NOTE: if csv file not found, drop here
    print('sorry cannot find the file wanted')

except Exception as e:
    # NOTE: re-raise any exception not handled above
    raise e
6
2nd_copy/src/test.sh
Normal file
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

set -ex

clear
python3 ./main.py