update,
114
1st_copy/NOTES.md
Normal file
@@ -0,0 +1,114 @@
### objective

env: Windows
deadline 23/12

### CAUTION
Do not include any code not written by you in your project. You are NOT allowed to import any Python libraries in your solution except the modules os (https://docs.python.org/3/library/os.html), sys (https://docs.python.org/3/library/sys.html) and csv (https://docs.python.org/3/library/csv.html). If cheating is found or the import requirement is violated, you will receive a zero mark.

### Deliverable
You have to include your student name and ID in your source code and name your project solution as “XXXXXXXX_project.py” (where XXXXXXXX is your 8-digit student ID). Please remember to upload your source code solution to Moodle by the submission deadline.

### drill down

Functions

Given the file of stock prices, you are asked to develop a Python program to process the data by designing appropriate functions. At minimum you need to implement and call the following three functions:

• get_data_list(csv_file_name)

This function has one parameter, namely csv_file_name. When the function is called, you need to pass along a CSV file name which is used inside the function to open and read the CSV file. After reading each row, it will be split into a list. The list will then be appended into a main list (a list of lists), namely data_list. The data_list will be returned at the end of the function.
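
A minimal sketch of the reading step described above, using only the allowed csv module. It assumes the file is comma-separated and readable with the default encoding; it is an illustration, not the required solution, and it leaves error handling out.

```python
import csv


def get_data_list(csv_file_name):
    """Read a CSV file and return its rows as a list of lists of strings."""
    data_list = []
    # newline='' is what the csv module documentation recommends for file objects
    with open(csv_file_name, newline='') as csv_file:
        for row in csv.reader(csv_file):
            # csv.reader has already split the row into a list of fields
            data_list.append(row)
    return data_list
```

The header row, if the file has one, ends up as `data_list[0]`, and every field is still a string, so later steps have to skip the header and convert prices and volumes with `float()`.
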

• get_monthly_averages(data_list)

This function has one parameter, namely data_list. You need to pass the data_list generated by the get_data_list() function as the argument to this function and then calculate the monthly average prices of the stock. The average monthly prices are calculated in the following way. Suppose the volume and adjusted closing price of a trading day are V1 and C1, respectively. The total sale of that day equals V1 x C1. Now, suppose the volume and adjusted closing price of another trading day are V2 and C2, respectively. The average of these two trading days is the sum of the total sales divided by the total volume:

Average price = (V1 x C1 + V2 x C2) / (V1 + V2)

To average a whole month, you need to add up the total sales (V1 x C1 + V2 x C2 + ... + Vn x Cn) for each day and divide it by the sum of all volumes (V1 + V2 + ... + Vn) where n is the number of trading days in the month.

A tuple with 2 items, including the date (year and month only) and the average for that month, will be generated for each month. The tuple for each month will be appended to a main list, namely monthly_averages_list. The monthly_averages_list will be returned at the end of the function.
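
The volume-weighted average above can be sketched as follows. The sketch assumes the column layout of the CSV files in this commit (Date, Open, High, Low, Close, Adj Close, Volume), dates written as YYYY-MM-DD, and a header in the first row; the `totals` dictionary is only one possible way to group the days by month.

```python
def get_monthly_averages(data_list):
    """Return [(YYYY-MM, volume-weighted average price), ...] in date order."""
    totals = {}  # YYYY-MM -> [total sales, total volume]
    for row in data_list[1:]:          # skip the header row
        month = row[0][:7]             # '2020-10-01' -> '2020-10'
        adj_close = float(row[5])      # Adj Close column (assumed position)
        volume = float(row[6])         # Volume column (assumed position)
        month_totals = totals.setdefault(month, [0.0, 0.0])
        month_totals[0] += volume * adj_close   # V x C, total sale of the day
        month_totals[1] += volume

    monthly_averages_list = []
    for month in sorted(totals):       # YYYY-MM strings sort chronologically
        total_sales, total_volume = totals[month]
        monthly_averages_list.append((month, total_sales / total_volume))
    return monthly_averages_list
```

For two October days with (volume, adjusted close) of (100, 10) and (300, 20), the result is (100 x 10 + 300 x 20) / (100 + 300) = 17.5. A month whose total volume is zero would raise ZeroDivisionError in this sketch.
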

• get_moving_averages(monthly_averages_list)

This function has one parameter, namely monthly_averages_list. You need to pass the monthly_averages_list generated by get_monthly_averages() as the argument to this function and then calculate the 5-month exponential moving average (EMA) stock prices. In general, the EMA for a particular month can be calculated by the following formula:

EMA = (Monthly average price – previous month’s EMA) x smoothing constant + previous month’s EMA

where

smoothing constant = 2 / (number of time periods in months + 1)

For example, the following table shows the stock prices between Oct 2020 and Apr 2021:

| Month | Monthly Average Price |
| --- | --- |
| Oct 2020 | 14 |
| Nov 2020 | 13 |
| Dec 2020 | 14 |
| Jan 2021 | 12 |
| Feb 2021 | 13 |
| Mar 2021 | 12 |
| Apr 2021 | 11 |

The initial 5-month EMA for Feb 2021 can be calculated by the simple average formula, as shown below:

5-month EMA for Feb 2021 = (14 + 13 + 14 + 12 + 13) / 5 = 13.2

The 5-month EMA for Mar 2021 can be calculated by the EMA formula, as shown below:

5-month EMA for Mar 2021 = (Monthly average price – previous month’s EMA) x smoothing constant + previous month’s EMA
= (12 – 13.2) x (2 / 6) + 13.2
= 12.8

The 5-month EMA for Apr 2021 can be calculated by the EMA formula, as shown below:

5-month EMA for Apr 2021 = (Monthly average price – previous month’s EMA) x smoothing constant + previous month’s EMA
= (11 – 12.8) x (2 / 6) + 12.8
= 12.2

The resulting 5-month EMA stock prices are shown below:

| Month | Average Price | 5-month EMA Price |
| --- | --- | --- |
| Oct 2020 | 14 | - |
| Nov 2020 | 13 | - |
| Dec 2020 | 14 | - |
| Jan 2021 | 12 | - |
| Feb 2021 | 13 | 13.2 |
| Mar 2021 | 12 | 12.8 |
| Apr 2021 | 11 | 12.2 |

A tuple with 2 items, including the date (year and month only) and the 5-month EMA price for that month, will be generated for each month except the first 4 months. Each tuple will be appended to a main list, namely moving_averages_list. The moving_averages_list will be returned at the end of the function.
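
The worked example above can be reproduced with a short sketch. The `periods` parameter is an assumption added for illustration (the assignment asks for a single parameter, which the default value of 5 still allows), and returning an empty list when there are fewer than 5 months is likewise only one possible choice.

```python
def get_moving_averages(monthly_averages_list, periods=5):
    """Return [(month, EMA), ...], starting at the month numbered `periods`."""
    if len(monthly_averages_list) < periods:
        return []  # not enough months to seed the EMA (assumed behaviour)

    smoothing_constant = 2 / (periods + 1)
    prices = [price for _, price in monthly_averages_list]

    # seed: simple average of the first `periods` monthly prices
    ema = sum(prices[:periods]) / periods
    moving_averages_list = [(monthly_averages_list[periods - 1][0], ema)]

    # every later month applies the EMA formula to the previous month's EMA
    for month, price in monthly_averages_list[periods:]:
        ema = (price - ema) * smoothing_constant + ema
        moving_averages_list.append((month, ema))
    return moving_averages_list


example = [('2020-10', 14), ('2020-11', 13), ('2020-12', 14), ('2021-01', 12),
           ('2021-02', 13), ('2021-03', 12), ('2021-04', 11)]
for month, ema in get_moving_averages(example):
    print(month, round(ema, 2))
```

Rounded to two decimals, the three printed values match the table: 13.2 for 2021-02, 12.8 for 2021-03 and 12.2 for 2021-04.
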

Program Input and Output

At the outset, your program needs to ask the user for a CSV file name:

Based on the entered CSV file name, a corresponding output text file (e.g. “Google_output.txt” for this case) will be generated. In the output file, you are eventually required to print the best month (with the highest EMA price) and the worst month (with the lowest EMA price) for the stock. You need to first print a header line for the stock, and then print a date (MM-YYYY), a comma followed by a moving average price (in 2 decimal places) on another line. You must follow the output format as shown below (please note that the values are for reference only and are not real)
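
A small sketch of the two steps just described: picking the best and worst months and formatting one result line. The helper names `best_and_worst` and `format_result_line` are hypothetical, and the single space after the comma is an assumption, since the reference output is not reproduced in these notes.

```python
def best_and_worst(moving_averages_list):
    """Return the (month, EMA) items with the highest and the lowest EMA."""
    best = max(moving_averages_list, key=lambda item: item[1])
    worst = min(moving_averages_list, key=lambda item: item[1])
    return best, worst


def format_result_line(yyyy_mm, ema_price):
    """Turn ('2021-02', 13.2) into '02-2021, 13.20'."""
    year, month = yyyy_mm.split('-')
    # '{:.2f}' rounds the price to 2 decimal places
    return '{}-{}, {:.2f}'.format(month, year, ema_price)
```

With ties, `max()` and `min()` return only the first matching month, so a program that must report every tied month needs an extra filtering step.
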

IV. Evaluation Criteria (40% of Overall Course Assessment)

The project will be graded using the following criteria:

• 15% - Correctness of program execution and output data
• 10% - Modularization (e.g. dividing the program functionality into different functions)
• 5% - Error handling
• 5% - Consistent style (e.g., capitalization, indenting, etc.)
• 5% - Appropriate comments
Binary file not shown.
1031
1st_copy/_ref/google.csv
Normal file
File diff suppressed because it is too large
BIN
1st_copy/_ref/steps.ods
Normal file
Binary file not shown.
15
1st_copy/_ref/test.csv
Normal file
@@ -0,0 +1,15 @@
Date,Open,High,Low,Close,Adj Close,Volume
2008-07-04,460,463.24,449.4,450.26,450.26,4848500
2008-07-03,468.73,474.29,459.58,464.41,464.41,4314600
2008-06-02,476.77,482.18,461.42,465.25,465.25,6111500
2008-06-29,469.75,471.01,462.33,463.29,463.29,3848200
2008-05-08,452.02,452.94,417.55,419.95,419.95,9017900
2008-05-05,445.49,452.46,440.08,444.25,444.25,4534300
2008-04-04,460,463.24,449.4,450.26,450.26,4848500
2008-04-03,468.73,474.29,459.58,464.41,464.41,4314600
2008-03-02,476.77,482.18,461.42,465.25,465.25,6111500
2008-03-29,469.75,471.01,462.33,463.29,463.29,3848200
2008-02-28,472.49,476.45,470.33,473.78,473.78,3029700
2008-02-27,473.73,474.83,464.84,468.58,468.58,4387100
2008-01-28,472.49,476.45,470.33,473.78,473.78,3029700
2008-01-27,473.73,474.83,464.84,468.58,468.58,4387100
16
1st_copy/build.sh
Normal file
@@ -0,0 +1,16 @@
#!/usr/bin/env bash

# clean up the output of a previous build
rm -rf _temp/*
# remove the archive this script creates, so stale entries are not kept
rm -rf delivery1.zip

mkdir -p _temp

set -ex

cp src/main.py _temp/XXXXXXXX_project.py

pushd _temp
7za a -tzip ../delivery1.zip *
popd

rm -rf _temp
11
1st_copy/src/Pipfile
Normal file
@@ -0,0 +1,11 @@
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]

[dev-packages]

[requires]
python_version = "3.11"
243
1st_copy/src/main.py
Normal file
@@ -0,0 +1,243 @@
#!/usr/bin/env python3

# Do not include any code not written by you in your project.
# You are NOT allowed to import any Python libraries in your solution except the modules:
# - os (https://docs.python.org/3/library/os.html),
# - sys (https://docs.python.org/3/library/sys.html) and
# - csv (https://docs.python.org/3/library/csv.html).
#
# If cheating is found or the import requirement is violated, you will receive a zero mark.

import csv
import os
import sys

# columns of the csv file
# COL_DATE: the day of trading
# COL_OPEN: the stock price at the beginning of the trading day
# COL_HIGH: the highest price the stock achieved on the trading day
# COL_LOW: the lowest price the stock achieved on the trading day
# COL_CLOSE: the stock price at the end of the trading day
# COL_ADJ_CLOSE: the adjusted closing price of the trading day (reflecting the stock’s value
#                after accounting for any corporate actions like dividends, stock splits and
#                new stock offerings)
# COL_VOLUME: the total number of shares traded on the trading day
COL_DATE = 0
COL_OPEN = 1
COL_HIGH = 2
COL_LOW = 3
COL_CLOSE = 4
COL_ADJ_CLOSE = 5
COL_VOLUME = 6

# columns appended to each data row at an intermediate stage
COL_TOTAL_SALE_OF_DAY = 7
COL_MONTH_ONLY = 8

# columns of monthly_averages_list (column 0 is the month, YYYY-MM)
COL_MONTHLY_AVERAGE_PRICE = 1
COL_EMA = 2


# get_data_list(csv_file_name)
# This function has one parameter, namely csv_file_name.
# When the function is called, you need to pass along a CSV file name which is used inside
# the function to open and read the CSV file.
# After reading each row, it will be split into a list. The list will then be appended into a main
# list (a list of lists), namely data_list. The data_list will be returned at the end of the
# function.
def get_data_list(csv_file_name):
    '''read data list from csv file'''
    data_list = []
    try:
        with open(csv_file_name, newline='') as csvfile:
            # csv.reader splits every row into a list of strings
            reader = csv.reader(csvfile, delimiter=',', quotechar='"')
            data_list = list(reader)

        return data_list
    except Exception as e:
        # report the cause and stop with a non-zero exit status
        print('error while reading csv file: {}'.format(e))
        print('exiting...')
        sys.exit(1)


# get_monthly_averages(data_list)
# This function has one parameter, namely data_list. You need to pass the data_list
# generated by the get_data_list() function as the argument to this function and then
# calculate the monthly average prices of the stock. The average monthly prices are calculated in
# the following way.
#
# 1. Suppose the volume and adjusted closing price of a trading day are V1 and C1, respectively.
# 2. The total sale of that day equals V1 x C1.
# 3. Now, suppose the volume and adjusted closing price of another trading day are V2 and C2, respectively.
# 4. The average of these two trading days is the sum of the total sales divided by the total volume:
#
#    Average price = (V1 x C1 + V2 x C2) / (V1 + V2)
#
# To average a whole month, you need to
# - add up the total sales (V1 x C1 + V2 x C2 + ... + Vn x Cn) for each day and
# - divide it by the sum of all volumes (V1 + V2 + ... + Vn) where n is the number of trading days in the month.
# A tuple with 2 items, including the date (year and month only) and the average for that month,
# will be generated for each month. The tuple for each month will be appended to a main list,
# namely monthly_averages_list. The monthly_averages_list will be returned at the end of the function.

def get_monthly_averages(data_list):
    '''calculate the monthly average prices of the stock'''

    monthly_averages_list = []
    # skip the header row
    data_list_data_only = data_list[1:]

    # data preparation: append two derived columns to every row (modifies the rows in place)
    for row in data_list_data_only:
        # V x C, the total sale of the day, stored at COL_TOTAL_SALE_OF_DAY
        row.append(float(row[COL_VOLUME]) * float(row[COL_ADJ_CLOSE]))

        # tag the row with YYYY-MM for the monthly sums, stored at COL_MONTH_ONLY
        row.append(row[COL_DATE][0:7])

    # the distinct months (YYYY-MM) present in the data
    month_available = set(row[COL_MONTH_ONLY] for row in data_list_data_only)

    # iterate over the months in chronological order, sum the total sale and the total volume,
    # and get the monthly average as total_sale / total_volume
    for month in sorted(month_available):
        filtered_month = [row for row in data_list_data_only if row[COL_MONTH_ONLY] == month]
        total_sale = sum(row[COL_TOTAL_SALE_OF_DAY] for row in filtered_month)
        total_volume = sum(float(row[COL_VOLUME]) for row in filtered_month)
        # stored as a list, not a tuple: get_moving_averages() appends the EMA to each item
        monthly_averages_list.append([month, total_sale / total_volume])

    return monthly_averages_list


# get_moving_averages(monthly_averages_list)
# This function has one parameter, namely monthly_averages_list. You need to pass the
# monthly_averages_list generated by get_monthly_averages() as the argument
# to this function and then calculate the 5-month exponential moving average (EMA) stock prices.
# In general, the EMA for a particular month can be calculated by the following formula:
#
#   EMA = (Monthly average price – previous month’s EMA) x smoothing constant + previous month’s EMA
#
# where
#
#   smoothing constant = 2 / (number of time periods in months + 1)
#
# Reference note, written for a 20-period EMA:
#   Initial SMA = 20-period sum / 20
#   Multiplier = (2 / (Time periods + 1)) = (2 / (20 + 1)) = 0.0952 (9.52%)
#   EMA = {Close – EMA(previous day)} x multiplier + EMA(previous day).
def get_moving_averages(monthly_averages_list):
    '''
    get moving averages from monthly_averages_list
    input:
    [ [YYYY-MM, monthly average price],
      [YYYY-MM, monthly average price],
      ...]

    output:
    [ [YYYY-MM, monthly average price, EMA],
      [YYYY-MM, monthly average price, EMA],
      ...]
    '''

    # seed value: the simple average (SMA) of the first 5 monthly prices.
    # It is the EMA of the 5th month; months 1 to 4 have no EMA of their own and
    # receive the same value only so that every row has an EMA column.
    initial_sma = sum(row[COL_MONTHLY_AVERAGE_PRICE] for row in monthly_averages_list[0:5]) / 5
    for row in monthly_averages_list[0:5]:
        row.append(initial_sma)

    # smoothing constant = 2 / (number of time periods in months + 1)
    smoothing_constant = 2 / (5 + 1)

    # main loop to calculate EMA, from the 6th month available till the end of the list
    for i in range(5, len(monthly_averages_list)):
        previous_month_ema = monthly_averages_list[i - 1][COL_EMA]
        monthly_average_price = monthly_averages_list[i][COL_MONTHLY_AVERAGE_PRICE]

        ema = (monthly_average_price - previous_month_ema) * smoothing_constant + previous_month_ema
        monthly_averages_list[i].append(ema)

    return monthly_averages_list


# Based on the entered CSV file name, a corresponding output text file (e.g. “Google_output.txt” for this case) will be generated.
# In the output file, you are eventually required to print:
# - the best month (with the highest EMA price) and
# - the worst month (with the lowest EMA price) for the stock.
# You need to first print a header line for the stock, and then print a date (MM-YYYY),
# a comma followed by a moving average price (in 2 decimal places) on another line.
def format_date_string(yyyy_mm):
    '''rearrange date string from csv file YYYY-MM => MM-YYYY'''
    yyyy, mm = yyyy_mm.split('-')
    return '-'.join([mm, yyyy])


def write_output_file(filename_to_write, monthly_averages_list_w_ema, report_name):
    '''get output string from template and write to output file
    input:
    filename_to_write: txt file name with path to be written to
    monthly_averages_list_w_ema: list provided with EMA
    report_name: report name to be written to report
    '''

    RESULT_TEMPLATE = '''
# The best month for ^report_name^:
# ^best_month^, ^best_EMA^

# The worst month for ^report_name^:
# ^worst_month^, ^worst_EMA^
'''.strip()

    # the first valid EMA belongs to the 5th month (index 4); months 1 to 4 are skipped
    rows_with_ema = monthly_averages_list_w_ema[4:]

    # get the max (best) EMA of the list
    best_EMA = max(row[2] for row in rows_with_ema)
    # get the month(s) that have this EMA
    best_months = [format_date_string(row[0]) for row in rows_with_ema if row[2] == best_EMA]

    # get the min (worst) EMA of the list
    worst_EMA = min(row[2] for row in rows_with_ema)
    # get the month(s) that have this EMA
    worst_months = [format_date_string(row[0]) for row in rows_with_ema if row[2] == worst_EMA]

    # assemble the output string, prices in 2 decimal places
    result_string = (
        RESULT_TEMPLATE
        .replace('^best_month^', ','.join(best_months))
        .replace('^best_EMA^', '%.2f' % best_EMA)
        .replace('^worst_month^', ','.join(worst_months))
        .replace('^worst_EMA^', '%.2f' % worst_EMA)
        .replace('^report_name^', report_name)
    )

    # write output file; mode 'w' already truncates an existing file
    with open(filename_to_write, 'w') as file_write:
        file_write.write(result_string)


def main():
    # Main function starts here
    print('start')

    # gather csv file with path from user
    input_filename = input("Please input a csv filename: ")

    csv_filename = os.path.basename(input_filename)
    csv_path = os.path.dirname(input_filename)

    # derive the output file name from the csv file name: google.csv => google_output.txt
    # (splitext keeps the output name different from the input even without a .csv extension)
    txt_filename = os.path.splitext(csv_filename)[0] + '_output.txt'
    if csv_path != '':
        txt_filename = os.path.join(csv_path, txt_filename)
    else:
        # by default keep the output in the current directory
        txt_filename = os.path.join('.', txt_filename)

    # take the corp_name from the filename: google.csv => google
    corp_name = csv_filename.split('.')[0]

    # process the data_list from the csv file as stated in the assignment
    print(f'processing {csv_filename}')
    csv_list = get_data_list(input_filename)
    monthly_averages_list = get_monthly_averages(csv_list)
    monthly_averages_list_w_EMA = get_moving_averages(monthly_averages_list)

    # write output file
    write_output_file(txt_filename, monthly_averages_list_w_EMA, corp_name)
    print('wrote to {file} done'.format(file=txt_filename))


if __name__ == "__main__":
    main()
6
1st_copy/src/test.sh
Normal file
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

set -ex

python3 ./main.py