004_comission/max015/T10/T10/CDS1001T10.ipynb
louiscklaw acf9d862ff update,
2025-01-31 21:15:04 +08:00


{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# CDS1001 Tutorial 10 Report"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Input your name and student ID in the cell below (<font color='red'>if a cell is not in edit mode, double click it</font>):"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Your name: \n",
"\n",
"Your student ID:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Objectives:\n",
"- Understand the basic process of acquiring and exploring data \n",
"- Be able to understand and apply the use of ``csv`` package to process CSV files automatically\n",
"- Be able to understand and apply the use of ``statistics`` package to obtain summary statistics of data\n",
"- Be able to understand and apply the use of ``matlplotlib`` package to draw basic charts for data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### **Instructions for the report**:\n",
"* Follow Section 1 and Section 2 of the tutorial instruction to launch Python IDLE through Anaconda Navigation.\n",
"* Refer to Section 2.2 of the tutorial instruction to open tutorial 10 report\n",
"* Complete Parts 1-2 led by the lecturer\n",
"* Follow Section 3 of the tutorial instruction to save the report and zip the report folder. The zip file is named as CDS1001T10Report{your student_id}.zip (e.g., if student_id is 1234567, then the zip file's name is CDS1001T10Report1234567.zip). <font color='red'>The zip file needs to include the following files:\n",
" - an .ipynb file of this tutorial report \n",
" - image files of flowcharts or screenshots used in this tutorial report </font> \n",
"* Submit the zip file of the report folder to the Moodle. The submission due date is **<font color='red'>28 Nov 2023, 11:55PM</font>**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Part 1 Acquiring Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 1.1. Follow Slide 10 of Week 11 Lecture to create a csv file, which can contain any data but shall be different from the one in the slide. Copy the csv file in the cell below: (1 point)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<font color='red'>Edit this cell to copy the newly created csv file which shall be different from the one in Slide 10. </font>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 1.2. Write and execute codes for the following questions 1-7. (14 points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Question 1. How would you obtain a Reader object for a csv file example.csv and store it in a variable ``rd``? "
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"#Write your code here:\n",
"import csv\n",
"\n",
"exampleFile = open('example.csv')\n",
"rd = csv.reader(exampleFile)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Question 2. How would you retrieve the data in the csv file and store the data in a list variable ``data``?"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[['4/5/2014 13:34', 'Apples', '73', '4.5'], ['4/5/2014 3:41', 'Cherries', '85', '2.5'], ['4/6/2014 12:46', 'Pears', '14', '1.5'], ['4/8/2014 8:59', 'Oranges', '52', '5.5'], ['4/10/2014 2:07', 'Apples', '152', '3.5'], ['4/10/2014 18:10', 'Bananas', '23', '4.4'], ['4/10/2014 2:40', 'Strawberries', '98', '2.2']]\n"
]
}
],
"source": [
"#Write your code here:\n",
"data = list(rd)\n",
"print(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Question 3. How would you retrieve and print the data at a row 3 and column 4 of the csv file?"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1.5\n"
]
}
],
"source": [
"#Write your code here:\n",
"\n",
"print(data[2][3])"
]
},
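{
"cell_type": "markdown",
"metadata": {},
"source": [
"Question 3 counts rows and columns from 1, while Python lists count from 0, so row 3 and column 4 become indexes 2 and 3. The sketch below wraps that conversion in a hypothetical helper, `cell_at`, which is not part of the tutorial; the three rows are copied from the sample output above as stand-in data.\n",
"\n",
"```python\n",
"def cell_at(table, row_number, col_number):\n",
"    # Hypothetical helper: turn 1-based row/column numbers into 0-based indexes.\n",
"    return table[row_number - 1][col_number - 1]\n",
"\n",
"# Stand-in data: the first three rows of the example file.\n",
"data = [['4/5/2014 13:34', 'Apples', '73', '4.5'],\n",
"        ['4/5/2014 3:41', 'Cherries', '85', '2.5'],\n",
"        ['4/6/2014 12:46', 'Pears', '14', '1.5']]\n",
"\n",
"value = cell_at(data, 3, 4)\n",
"print(value, float(value) * 2)\n",
"```\n",
"\n",
"Values read by `csv.reader` are strings, so the sketch converts with `float()` before doing arithmetic."
]
},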
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Question 4. How would you traverse all the values of the csv file in a for loop?"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4/5/2014 13:34 Apples 73 4.5 \n",
"4/5/2014 3:41 Cherries 85 2.5 \n",
"4/6/2014 12:46 Pears 14 1.5 \n",
"4/8/2014 8:59 Oranges 52 5.5 \n",
"4/10/2014 2:07 Apples 152 3.5 \n",
"4/10/2014 18:10 Bananas 23 4.4 \n",
"4/10/2014 2:40 Strawberries 98 2.2 \n"
]
}
],
"source": [
"#Write your code here:\n",
"\n",
"for row in data:\n",
" for col in row:\n",
" print(str(col), end=' ')\n",
" print()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Question 5. How would you write a row of numbers, '4/11/2014 12:44', 'watermelon', 100, 7.8, to a new csv file named 'new-example.csv'?"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
"#Write your code here:\n",
"\n",
"outputFile = open('new-example.csv', 'w', newline='')\n",
"outputWriter = csv.writer(outputFile)\n",
"outputWriter.writerow(['4/11/2014 12:44', 'watermelon', 100, 7.8])\n",
"outputFile.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Question 6. How would you obtain a DictReader object for the csv file example.csv with header names equal to 'date', 'product', 'quantity', 'price', and store it in a variable ``rd``? "
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"#Write your code here:\n",
"\n",
"exampleFile = open('example.csv')\n",
"rd = csv.DictReader(exampleFile, ['date', 'product', 'quantity', 'price'])\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Question 7. How would you write the data to a new csv file named new-example2.csv with the header?"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"#Write your code here:\n",
"\n",
"outputFile = open('new-example2.csv', 'w', newline='')\n",
"outputDictWriter = csv.DictWriter(outputFile, ['date', 'product', 'quantity', 'price'])\n",
"outputDictWriter.writeheader()\n",
"outputDictWriter.writerow({'date': 'Alice', 'product': 'grape', 'quantity': 20, 'price': 99.9})\n",
"outputFile.close()\n"
]
},
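{
"cell_type": "markdown",
"metadata": {},
"source": [
"A small round trip can help check the answers to Questions 6 and 7. The sketch below is illustrative only: the file name `demo-roundtrip.csv` and the sample row are assumptions, not part of the tutorial data. It writes a header and one row with `csv.DictWriter`, then reads them back with `csv.DictReader`, which takes its field names from the first line of the file when none are passed in.\n",
"\n",
"```python\n",
"import csv\n",
"\n",
"fieldnames = ['date', 'product', 'quantity', 'price']\n",
"\n",
"# Hypothetical file name, used only for this demonstration.\n",
"with open('demo-roundtrip.csv', 'w', newline='') as out_file:\n",
"    writer = csv.DictWriter(out_file, fieldnames)\n",
"    writer.writeheader()\n",
"    writer.writerow({'date': '4/11/2014 12:44', 'product': 'watermelon',\n",
"                     'quantity': 100, 'price': 7.8})\n",
"\n",
"# No fieldnames argument here, so the header line supplies the keys.\n",
"with open('demo-roundtrip.csv', newline='') as in_file:\n",
"    rows = list(csv.DictReader(in_file))\n",
"\n",
"print(rows[0]['product'], rows[0]['quantity'])\n",
"```\n",
"\n",
"The `with` blocks close both files automatically, and the numbers come back as strings because CSV files store text only."
]
},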
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 1.4. Removing the Header from CSV Files"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Say you have the boring job of removing the first line from several hundred CSV files with filenames starting with 'NAICS'. Maybe youll be feeding them into an automated process that requires just the data and not the headers at the top of the columns. You could open each file in Excel, delete the first row, and resave the file—but that would take hours. Lets write a program to do it instead.\n",
"\n",
"The program will need to open every file with the .csv extension and with the filename starting with 'NAICS' in the current working directory, read in the contents of the file, and rewrite the contents without the first row to a file with 'headerRemoved' added to the file name. This will replace the old contents of the CSV file with the new, headless contents."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Task 1: What is the algorithm to automate this work? What are the data structures needed? How are you going to write the code? (5 points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<font color='red'>Edit this cell to answer the question above. </font>\n",
"\n",
"Algorithm:\n",
" - list all the CSV files in the current directory\n",
" - ignoring the first line, read in the content of the files.\n",
" - Write the contents, skipping the first line, to a new csv file\n",
"\n",
"Data Structures:\n",
" - Use list to store the values\n",
"\n",
"Code:\n",
" - loop over a list of fields from os.listdir(), ignore the file not ended with `csv`\n",
" - create a csv reader object and read the content of the files\n",
" - use a flag variable isFirstRow to indicate whether the line should skip\n",
" - create a csv writer object to write the read-in data to the new file."
]
},
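{
"cell_type": "markdown",
"metadata": {},
"source": [
"The algorithm above can be sketched for a single file as follows. This is one possible shape under stated assumptions, not the required answer: `remove_header` is a hypothetical helper, the demo file names are made up, and `next(reader, None)` is used to skip the first row instead of the isFirstRow flag.\n",
"\n",
"```python\n",
"import csv\n",
"\n",
"def remove_header(src_name, dst_name):\n",
"    # Hypothetical helper: copy src_name to dst_name without its first row.\n",
"    with open(src_name, newline='') as src:\n",
"        reader = csv.reader(src)\n",
"        next(reader, None)   # skip the header; None avoids an error on an empty file\n",
"        rows = list(reader)\n",
"    with open(dst_name, 'w', newline='') as dst:\n",
"        csv.writer(dst).writerows(rows)\n",
"    return len(rows)\n",
"\n",
"# Build a tiny sample file so the sketch runs on its own.\n",
"with open('demo-header.csv', 'w', newline='') as f:\n",
"    csv.writer(f).writerows([['code', 'title'],\n",
"                             ['11', 'Agriculture'],\n",
"                             ['21', 'Mining']])\n",
"\n",
"kept = remove_header('demo-header.csv', 'headerRemoved-demo-header.csv')\n",
"print(kept)\n",
"```\n",
"\n",
"Looping this helper over the file names that start with 'NAICS' and end with '.csv' would cover the whole folder."
]
},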
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Task 2: Follow the steps below to write the code:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Step 1: Obtain a list of names of files in the current working directory (2 points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first thing your program needs to do is to execute the code in the cell below, which uses listdir() method in os module to obtain a list of names of all the files in the current working directory, and stored the list in variable file_names."
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['output2.csv', 'NAICS_data_4610.csv', 'NAICS_data_7388.csv', 'NAICS_data_2092.csv', 'NAICS_data_7427.csv', 'NAICS_data_5341.csv', 'NAICS_data_5899.csv', 'NAICS_data_1952.csv', 'NAICS_data_4436.csv', 'task2.py', 'NAICS_data_1218.csv', 'NAICS_data_8015.csv', 'NAICS_data_8085.csv', 'NAICS_data_9066.csv', 'reset.sh', 'NAICS_data_3237.csv', 'NAICS_data_9250.csv', 'NAICS_data_1048.csv', 'NAICS_data_7102.csv', 'NAICS_data_8545.csv', 'NAICS_data_4125.csv', 'NAICS_data_1814.csv', 'NAICS_data_6842.csv', 'NAICS_data_9103.csv', 'NAICS_data_3144.csv', 'NAICS_data_5060.csv', 'NAICS_data_9825.csv', '.ipynb_checkpoints', 'NAICS_data_8397.csv', 'NAICS_data_2828.csv', 'NAICS_data_4699.csv', 'images', 'NAICS_data_5992.csv', '.DS_Store', 'NAICS_data_1889.csv', 'NAICS_data_2993.csv', 'NAICS_data_7830.csv', 'NAICS_data_7845.csv', 'NAICS_data_2346.csv', 'NAICS_data_3044.csv', 'NAICS_data_2066.csv', 'NAICS_data_6329.csv', 'NAICS_data_4213.csv', 'NAICS_data_7338.csv', 'task2_part2.py', 'NAICS_data_7226.csv', 'NAICS_data_2988.csv', 'CDS1001T10.ipynb', 'NAICS_data_9139.csv', 'NAICS_data_8196.csv', 'NAICS_data_5890.csv', 'NAICS_data_8760.csv', 'NAICS_data_3075.csv', 'rates1.csv', 'NAICS_data_3495.csv', 'rates2.csv', 'NAICS_data_1817.csv', 'task2_2_2_task3.py', 'NAICS_data_2799.csv', 'NAICS_data_9834.csv', 'NAICS_data_1657.csv', 'NAICS_data_2994.csv', 'NAICS_data_6493.csv', 'NAICS_data_3197.csv', 'NAICS_data_7677.csv', 'NAICS_data_8832.csv', 'NAICS_data_5631.csv', 'NAICS_data_7833.csv', 'NAICS_data_6397.csv', 'NAICS_data_8499.csv', 'NAICS_data_7535.csv', 'NAICS_data_4938.csv', 'NAICS_data_8700.csv', 'NAICS_data_1751.csv', 'NAICS_data_8131.csv', 'NAICS_data_7642.csv', 'NAICS_data_5364.csv', 'NAICS_data_4215.csv', 'example.csv', 'NAICS_data_2183.csv', 'NAICS_data_9986.csv', 'NAICS_data_6161.csv', 'NAICS_data_7765.csv', 'NAICS_data_9165.csv', 'NAICS_data_5092.csv', 'new-example2.csv', '.gitignore', 'rates3.csv', 'NAICS_data_8522.csv', 'NAICS_data_1973.csv', 'NAICS_data_4031.csv', 
'NAICS_data_4896.csv', 'NAICS_data_7028.csv', 'NAICS_data_6904.csv', 'NAICS_data_6700.csv', 'NAICS_data_7383.csv', 'NAICS_data_6637.csv', 'new-example.csv', 'NAICS_data_6335.csv', 'NAICS_data_3073.csv', 'NAICS_data_2427.csv', 'NAICS_data_6181.csv', 'NAICS_data_2648.csv', 'NAICS_data_3494.csv', 'NAICS_data_7138.csv', 'NAICS_data_5305.csv', 'NAICS_data_8749.csv', 'NAICS_data_3731.csv', 'NAICS_data_4329.csv', 'NAICS_data_2959.csv', 'NAICS_data_4618.csv', 'NAICS_data_9448.csv', 'NAICS_data_7913.csv', 'NAICS_data_8403.csv', 'task1.py', 'NAICS_data_4525.csv', 'NAICS_data_9012.csv']\n"
]
}
],
"source": [
"import os\n",
"\n",
"file_names = os.listdir()\n",
"print(file_names)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Step 2: Loop Through Each CSV File and Edit It (24 points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the code below, write a for loop through each file name in file_names, if the file name starts with 'NAICS' and ends with \".csv\", do the followings:\n",
"- (1) Create an empty list named csvRows to store rows beyond the first row of the CSV file\n",
"- (2) Create a CSV reader object \n",
"- (3) Write a for loop to read each row of the CSV file, and for each row, if it is not the first row, append it to the list csvRows\n",
"- (4) create a CSV writer object for a new CSV file with its file name equal to 'headerRemoved'+file_name\n",
"- (5) Write a for loop to write each row in csvRows to the new csv file\n",
"- (6) Close the new CSV file and the oririnal CSV file"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"NAICS_data_4610.csv\n",
"NAICS_data_7388.csv\n",
"NAICS_data_2092.csv\n",
"NAICS_data_7427.csv\n",
"NAICS_data_5341.csv\n",
"NAICS_data_5899.csv\n",
"NAICS_data_1952.csv\n",
"NAICS_data_4436.csv\n",
"NAICS_data_1218.csv\n",
"NAICS_data_8015.csv\n",
"NAICS_data_8085.csv\n",
"NAICS_data_9066.csv\n",
"NAICS_data_3237.csv\n",
"NAICS_data_9250.csv\n",
"NAICS_data_1048.csv\n",
"NAICS_data_7102.csv\n",
"NAICS_data_8545.csv\n",
"NAICS_data_4125.csv\n",
"NAICS_data_1814.csv\n",
"NAICS_data_6842.csv\n",
"NAICS_data_9103.csv\n",
"NAICS_data_3144.csv\n",
"NAICS_data_5060.csv\n",
"NAICS_data_9825.csv\n",
"NAICS_data_8397.csv\n",
"NAICS_data_2828.csv\n",
"NAICS_data_4699.csv\n",
"NAICS_data_5992.csv\n",
"NAICS_data_1889.csv\n",
"NAICS_data_2993.csv\n",
"NAICS_data_7830.csv\n",
"NAICS_data_7845.csv\n",
"NAICS_data_2346.csv\n",
"NAICS_data_3044.csv\n",
"NAICS_data_2066.csv\n",
"NAICS_data_6329.csv\n",
"NAICS_data_4213.csv\n",
"NAICS_data_7338.csv\n",
"NAICS_data_7226.csv\n",
"NAICS_data_2988.csv\n",
"NAICS_data_9139.csv\n",
"NAICS_data_8196.csv\n",
"NAICS_data_5890.csv\n",
"NAICS_data_8760.csv\n",
"NAICS_data_3075.csv\n",
"NAICS_data_3495.csv\n",
"NAICS_data_1817.csv\n",
"NAICS_data_2799.csv\n",
"NAICS_data_9834.csv\n",
"NAICS_data_1657.csv\n",
"NAICS_data_2994.csv\n",
"NAICS_data_6493.csv\n",
"NAICS_data_3197.csv\n",
"NAICS_data_7677.csv\n",
"NAICS_data_8832.csv\n",
"NAICS_data_5631.csv\n",
"NAICS_data_7833.csv\n",
"NAICS_data_6397.csv\n",
"NAICS_data_8499.csv\n",
"NAICS_data_7535.csv\n",
"NAICS_data_4938.csv\n",
"NAICS_data_8700.csv\n",
"NAICS_data_1751.csv\n",
"NAICS_data_8131.csv\n",
"NAICS_data_7642.csv\n",
"NAICS_data_5364.csv\n",
"NAICS_data_4215.csv\n",
"NAICS_data_2183.csv\n",
"NAICS_data_9986.csv\n",
"NAICS_data_6161.csv\n",
"NAICS_data_7765.csv\n",
"NAICS_data_9165.csv\n",
"NAICS_data_5092.csv\n",
"NAICS_data_8522.csv\n",
"NAICS_data_1973.csv\n",
"NAICS_data_4031.csv\n",
"NAICS_data_4896.csv\n",
"NAICS_data_7028.csv\n",
"NAICS_data_6904.csv\n",
"NAICS_data_6700.csv\n",
"NAICS_data_7383.csv\n",
"NAICS_data_6637.csv\n",
"NAICS_data_6335.csv\n",
"NAICS_data_3073.csv\n",
"NAICS_data_2427.csv\n",
"NAICS_data_6181.csv\n",
"NAICS_data_2648.csv\n",
"NAICS_data_3494.csv\n",
"NAICS_data_7138.csv\n",
"NAICS_data_5305.csv\n",
"NAICS_data_8749.csv\n",
"NAICS_data_3731.csv\n",
"NAICS_data_4329.csv\n",
"NAICS_data_2959.csv\n",
"NAICS_data_4618.csv\n",
"NAICS_data_9448.csv\n",
"NAICS_data_7913.csv\n",
"NAICS_data_8403.csv\n",
"NAICS_data_4525.csv\n",
"NAICS_data_9012.csv\n"
]
}
],
"source": [
"import csv\n",
"\n",
"for file_name in file_names:\n",
" if file_name.startswith ('NAICS') and file_name.endswith('.csv'):\n",
" print(file_name)\n",
"\n",
" #(1) Create an empty list named csvRows to store rows beyond the first row of the CSV file\n",
" csvRows = []\n",
"\n",
" #(2) Create a CSV reader object named csv_reader\n",
" exampleFile = open(file_name)\n",
" csv_reader = csv.reader(exampleFile)\n",
"\n",
" #(3) Write a for loop to read each row of the CSV file, and for each row, \n",
" # if it is not the first row, append it to the list csvRows\n",
" for row in csv_reader:\n",
" if (csv_reader.line_num > 1):\n",
" csvRows.append(row)\n",
"\n",
" #(4) create a CSV writer object for a new CSV file with its file name equal to 'headerRemoved'+file_name\n",
" outputFile = open('headerRemoved'+file_name, 'w', newline='')\n",
" outputWriter = csv.writer(outputFile)\n",
"\n",
" #(5) Write a for loop to write each row in csvRows to the new csv file\n",
" for row in csvRows:\n",
" outputWriter.writerow(row)\n",
"\n",
" #(6) Close the new CSV file and the original CSV file\n",
" outputFile.close()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Task 3: Test the above codes, and paste the screenshots of any two newly created files in the cell below: (5 points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<font color='red'>Edit this cell to illustrate the testing results, and paste the screenshots of any two newly created files. </font>\n",
"\n",
"![](./images/part1_task3_pic1.png)\n",
"\n",
"![](./images/part1_task3_pic2.png)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Part 2 Exploring Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.1. Write codes for the following tasks: (16 points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Task 1. Given a list y = [5.4, 8.0, 1.0, 6.3, 2.9, 7.4, 7.9, 9.8, 4.9, 7.4], write a code in the cell below to print the sum, the max value, the min value, the mean value, and the standard deviation of the list."
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Sum: 61.0\n",
"Max: 9.8\n",
"Min: 1.0\n",
"Mean: 6.1000000000000005\n",
"Standard Deviation: 2.627630956668848\n"
]
}
],
"source": [
"import statistics\n",
"\n",
"y = [5.4, 8.0, 1.0, 6.3, 2.9, 7.4, 7.9, 9.8, 4.9, 7.4]\n",
"\n",
"# Calculate the sum\n",
"sum_y = sum(y)\n",
"\n",
"# Calculate the max value\n",
"max_y = max(y)\n",
"\n",
"# Calculate the min value\n",
"min_y = min(y)\n",
"\n",
"# Calculate the mean value\n",
"mean_y = statistics.mean(y)\n",
"\n",
"# Calculate the standard deviation\n",
"std_dev_y = statistics.stdev(y)\n",
"\n",
"print(f\"Sum: {sum_y}\")\n",
"print(f\"Max: {max_y}\")\n",
"print(f\"Min: {min_y}\")\n",
"print(f\"Mean: {mean_y}\")\n",
"print(f\"Standard Deviation: {std_dev_y}\")\n"
]
},
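{
"cell_type": "markdown",
"metadata": {},
"source": [
"The task does not say which standard deviation is wanted. `statistics.stdev` gives the sample version (dividing by n - 1), while `statistics.pstdev` gives the population version (dividing by n). The sketch below, offered only as a cross-check, computes both by hand and prints them next to the library results; the variable names are illustrative.\n",
"\n",
"```python\n",
"import math\n",
"import statistics\n",
"\n",
"y = [5.4, 8.0, 1.0, 6.3, 2.9, 7.4, 7.9, 9.8, 4.9, 7.4]\n",
"n = len(y)\n",
"mean_y = statistics.mean(y)\n",
"\n",
"# Sum of squared deviations from the mean, shared by both formulas.\n",
"squared_dev = sum((v - mean_y) ** 2 for v in y)\n",
"\n",
"sample_sd = math.sqrt(squared_dev / (n - 1))   # same definition as statistics.stdev\n",
"population_sd = math.sqrt(squared_dev / n)     # same definition as statistics.pstdev\n",
"\n",
"print(sample_sd, statistics.stdev(y))\n",
"print(population_sd, statistics.pstdev(y))\n",
"```\n",
"\n",
"For a list treated as a sample of a larger data set, the sample version is the usual choice, and it is always slightly larger than the population version."
]
},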
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Task 2. Given a list y = [5.4, 8.0, 1.0, 6.3, 2.9, 7.4, 7.9, 9.8, 4.9, 7.4], write a code to plot it in a bar chart."
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAh8AAAGdCAYAAACyzRGfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8WgzjOAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAV/klEQVR4nO3de2zV9f348Vdt5wFMqRNThFikJi4g6ORijIC36EgQzcwWN506o9uisSrYxFmGm5sbVN1GyGTW1CyOzaD8sTlZnJuNi+BlRqygxi0S54VGR9jFtF6WGuD8/vhmza8Wp7jPeR16eDySzx/ncz6e98sTkj7zPre6crlcDgCAJAdVewAA4MAiPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVA3VHuCD9uzZE2+++WY0NjZGXV1dtccBAD6Gcrkcb7/9dkyePDkOOui/723sd/Hx5ptvRktLS7XHAAA+gb6+vjjyyCP/6zX7XXw0NjZGxP8NP378+CpPAwB8HAMDA9HS0jL0d/y/2e/i4z8vtYwfP158AMAo83HeMuENpwBAKvEBAKQSHwBAqn2Oj02bNsW5554bkydPjrq6uvjNb34z7P5yuRzf/e53Y/LkyTF27Ng4/fTT48UXXyxqXgBglNvn+Hj33Xfjs5/9bKxZs2av9992222xatWqWLNmTWzevDmOOOKI+NznPhdvv/32/zwsADD67fOnXRYtWhSLFi3a633lcjlWr14dy5cvjy984QsREbF27dqYOHFirFu3Lq644or/bVoAYNQr9D0fr776auzYsSMWLlw4dK5UKsVpp50WTz755F7/m8HBwRgYGBh2AAC1q9D42LFjR0RETJw4cdj5iRMnDt33QZ2dndHU1DR0+HZTAKhtFfm0ywe/YKRcLn/ol44sW7Ys+vv7h46+vr5KjAQA7CcK/YbTI444IiL+bwdk0qRJQ+d37tw5YjfkP0qlUpRKpSLHAAD2Y4XufLS2tsYRRxwRPT09Q+fef//92LhxY8ybN6/IpQCAUWqfdz7eeeedePnll4duv/rqq7F169Y47LDDYsqUKbF06dJYuXJlHHPMMXHMMcfEypUrY9y4cfGVr3yl0MEBgNFpn+PjmWeeiTPOOGPodnt7e0REXHrppfHzn/88vvnNb8a///3vuOqqq+Ktt96Kk046KR5++OGP9St3AEDtqyuXy+VqD/H/GxgYiKampujv7/ertgAwSuzL3+9C33AKwP5vaseD1R5hhNduWVztEUjkh+UAgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIJT4AgFTiAwBIVXh87Nq1K2688cZobW2NsWPHxtFHHx0333xz7Nmzp+ilAIBRqKHoB7z11lvjzjvvjLVr18aMGTPimWeeicsuuyyamppiyZIlRS8HAIwyhcfHn/70p/j85z8fixcvjoiIqVOnxr333hvPPPNM0UsBAKNQ4S+7LFiwIB555JHYtm1bREQ899xz8fjjj8fZZ5+91+sHBwdjYGBg2AEA1K7Cdz5uuOGG6O/vj2nTpkV9fX3s3r07VqxYERdeeOFer+/s7Izvfe97RY9Rc6Z2PFjtEUZ47ZbF1R4BgFGo8J2P9evXxz333BPr1q2LZ599NtauXRs/+tGPYu3atXu9ftmyZdHf3z909PX1FT0SAL
AfKXzn4/rrr4+Ojo644IILIiLiuOOOi9dffz06Ozvj0ksvHXF9qVSKUqlU9BgAwH6q8J2P9957Lw46aPjD1tfX+6gtABARFdj5OPfcc2PFihUxZcqUmDFjRmzZsiVWrVoVl19+edFLAVSV92LBJ1N4fNx+++3x7W9/O6666qrYuXNnTJ48Oa644or4zne+U/RSAMAoVHh8NDY2xurVq2P16tVFPzQAUAP8tgsAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpGqo9AMDUjgerPcIIr92yuNojUCP8+x7JzgcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkEp8AACpxAcAkKqh2gPA/mhqx4PVHmGE125ZXO0RAAph5wMASCU+AIBU4gMASCU+AIBU4gMASCU+AIBUFYmPN954Iy6++OKYMGFCjBs3Lk444YTo7e2txFIAwChT+Pd8vPXWWzF//vw444wz4qGHHorm5ub461//GoceemjRSwEAo1Dh8XHrrbdGS0tL3H333UPnpk6dWvQyAMAoVfjLLhs2bIi5c+fG+eefH83NzTFr1qy46667PvT6wcHBGBgYGHYAALWr8Ph45ZVXoqurK4455pj4wx/+EFdeeWVce+218Ytf/GKv13d2dkZTU9PQ0dLSUvRIAMB+pPD42LNnT8yePTtWrlwZs2bNiiuuuCK+8Y1vRFdX116vX7ZsWfT39w8dfX19RY8EAOxHCo+PSZMmxbHHHjvs3PTp02P79u17vb5UKsX48eOHHQBA7So8PubPnx8vvfTSsHPbtm2Lo446quilAIBRqPD4uO666+Kpp56KlStXxssvvxzr1q2L7u7uaGtrK3opAGAUKjw+TjzxxLj//vvj3nvvjZkzZ8b3v//9WL16dVx00UVFLwUAjEKFf89HRMQ555wT55xzTiUeGgAY5fy2CwCQSnwAAKnEBwCQSnwAAKnEBwCQSnwAAKnEBwCQSnwAAKnEBwCQSnwAAKnEBwCQSnwAAKnEBwCQSnwAAKkaqj1AtqkdD1Z7hBFeu2VxtUcAgDR2PgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVA3VHgAAPo6pHQ9We4QRXrtlcbVHGJXsfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqSoeH52dnVFXVxdLly6t9FIAwChQ0fjYvHlzdHd3x/HHH1/JZQCAUaRi8fHOO+/ERRddFHfddVd8+tOfrtQyAMAoU7H4aGtri8WLF8dZZ51VqSUAgFGooRIPet9998Wzzz4bmzdv/shrBwcHY3BwcOj2wMBAJUYCAPYThe989PX1xZIlS+Kee+6JMWPGfOT1nZ2d0dTUNHS0tLQUPRIAsB8pPD56e3tj586dMWfOnGhoaIiGhobYuHFj/OQnP4mGhobYvXv3sOuXLVsW/f39Q0dfX1/RIwEA+5HCX3Y588wz44UXXhh27rLLLotp06bFDTfcEPX19cPuK5VKUSqVih4DANhPFR4fjY2NMXPmzGHnDjnkkJgwYcKI8wDAgcc3nAIAqSryaZcPevTRRzOWAQBGATsfAEAq8QEApBIfAE
Aq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAEAq8QEApBIfAECqhmoPABRnaseD1R5hhNduWVztEYD9jJ0PACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACBV4fHR2dkZJ554YjQ2NkZzc3Ocd9558dJLLxW9DAAwShUeHxs3boy2trZ46qmnoqenJ3bt2hULFy6Md999t+ilAIBRqKHoB/z9738/7Pbdd98dzc3N0dvbG6eeemrRywEAo0zh8fFB/f39ERFx2GGH7fX+wcHBGBwcHLo9MDBQ6ZEAgCqq6BtOy+VytLe3x4IFC2LmzJl7vaazszOampqGjpaWlkqOBABUWUXj4+qrr47nn38+7r333g+9ZtmyZdHf3z909PX1VXIkAKDKKvayyzXXXBMbNmyITZs2xZFHHvmh15VKpSiVSpUaAwDYzxQeH+VyOa655pq4//7749FHH43W1tailwAARrHC46OtrS3WrVsXDzzwQDQ2NsaOHTsiIqKpqSnGjh1b9HIAwChT+Hs+urq6or+/P04//fSYNGnS0LF+/fqilwIARqGKvOwCAPBh/LYLAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJBKfAAAqcQHAJCqodoDUNumdjxY7RFGeO2WxdUeAeCAZucDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEglPgCAVOIDAEhVsfi44447orW1NcaMGRNz5syJxx57rFJLAQCjSEXiY/369bF06dJYvnx5bNmyJU455ZRYtGhRbN++vRLLAQCjSEXiY9WqVfG1r30tvv71r8f06dNj9erV0dLSEl1dXZVYDgAYRRqKfsD3338/ent7o6OjY9j5hQsXxpNPPjni+sHBwRgcHBy63d/fHxERAwMDRY8WERF7Bt+ryOP+Lz7O/6u5i2PuXObOZe5ctTz3J33Mcrn80ReXC/bGG2+UI6L8xBNPDDu/YsWK8mc+85kR1990003liHA4HA6Hw1EDR19f30e2QuE7H/9RV1c37Ha5XB5xLiJi2bJl0d7ePnR7z5498a9//SsmTJiw1+v3BwMDA9HS0hJ9fX0xfvz4ao9T8zzfuTzfuTzf+TznlVEul+Ptt9+OyZMnf+S1hcfH4YcfHvX19bFjx45h53fu3BkTJ04ccX2pVIpSqTTs3KGHHlr0WBUxfvx4/3ATeb5zeb5zeb7zec6L19TU9LGuK/wNpwcffHDMmTMnenp6hp3v6emJefPmFb0cADDKVORll/b29rjkkkti7ty5cfLJJ0d3d3ds3749rrzyykosBwCMIhWJjy9/+cvxz3/+M26++eb429/+FjNnzozf/e53cdRRR1ViuXSlUiluuummES8XURme71ye71ye73ye8+qrK5c/zmdiAACK4bddAIBU4gMASCU+AIBU4gMASCU+PoE77rgjWltbY8yYMTFnzpx47LHHqj1STers7IwTTzwxGhsbo7m5Oc4777x46a
WXqj3WAaOzszPq6upi6dKl1R6lZr3xxhtx8cUXx4QJE2LcuHFxwgknRG9vb7XHqkm7du2KG2+8MVpbW2Ps2LFx9NFHx8033xx79uyp9mgHJPGxj9avXx9Lly6N5cuXx5YtW+KUU06JRYsWxfbt26s9Ws3ZuHFjtLW1xVNPPRU9PT2xa9euWLhwYbz77rvVHq3mbd68Obq7u+P444+v9ig166233or58+fHpz71qXjooYfiz3/+c/z4xz8eNd/wPNrceuutceedd8aaNWviL3/5S9x2223xwx/+MG6//fZqj3ZA8lHbfXTSSSfF7Nmzo6ura+jc9OnT47zzzovOzs4qTlb7/v73v0dzc3Ns3LgxTj311GqPU7PeeeedmD17dtxxxx3xgx/8IE444YRYvXp1tceqOR0dHfHEE0/YOU1yzjnnxMSJE+NnP/vZ0LkvfvGLMW7cuPjlL39ZxckOTHY+9sH7778fvb29sXDhwmHnFy5cGE8++WSVpjpw9Pf3R0TEYYcdVuVJaltbW1ssXrw4zjrrrGqPUtM2bNgQc+fOjfPPPz+am5tj1qxZcdddd1V7rJq1YMGCeOSRR2Lbtm0REfHcc8/F448/HmeffXaVJzswVexXbWvRP/7xj9i9e/eIH8ibOHHiiB/So1jlcjna29tjwYIFMXPmzGqPU7Puu+++ePbZZ2Pz5s3VHqXmvfLKK9HV1RXt7e3xrW99K55++um49tpro1QqxVe/+tVqj1dzbrjhhujv749p06ZFfX197N69O1asWBEXXnhhtUc7IImPT6Curm7Y7XK5POIcxbr66qvj+eefj8cff7zao9Ssvr6+WLJkSTz88MMxZsyYao9T8/bs2RNz586NlStXRkTErFmz4sUXX4yuri7xUQHr16+Pe+65J9atWxczZsyIrVu3xtKlS2Py5Mlx6aWXVnu8A4742AeHH3541NfXj9jl2Llz54jdEIpzzTXXxIYNG2LTpk1x5JFHVnucmtXb2xs7d+6MOXPmDJ3bvXt3bNq0KdasWRODg4NRX19fxQlry6RJk+LYY48ddm769Onxq1/9qkoT1bbrr78+Ojo64oILLoiIiOOOOy5ef/316OzsFB9V4D0f++Dggw+OOXPmRE9Pz7DzPT09MW/evCpNVbvK5XJcffXV8etf/zr++Mc/Rmtra7VHqmlnnnlmvPDCC7F169ahY+7cuXHRRRfF1q1bhUfB5s+fP+Kj49u2bauZH+Dc37z33ntx0EHD/+TV19f7qG2V2PnYR+3t7XHJJZfE3Llz4+STT47u7u7Yvn17XHnlldUerea0tbXFunXr4oEHHojGxsahHaempqYYO3ZslaerPY2NjSPeT3PIIYfEhAkTvM+mAq677rqYN29erFy5Mr70pS/F008/Hd3d3dHd3V3t0WrSueeeGytWrIgpU6bEjBkzYsuWLbFq1aq4/PLLqz3aganMPvvpT39aPuqoo8oHH3xwefbs2eWNGzdWe6SaFBF7Pe6+++5qj3bAOO2008pLliyp9hg167e//W155syZ5VKpVJ42bVq5u7u72iPVrIGBgfKSJUvKU6ZMKY8ZM6Z89NFHl5cvX14eHBys9mgHJN/zAQCk8p4PACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUokPACCV+AAAUv0/Xn+FsnNuE60AAAAASUVORK5CYII=",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#edit code for task 2\n",
"import matplotlib.pyplot as plt\n",
"\n",
"y = [5.4, 8.0, 1.0, 6.3, 2.9, 7.4, 7.9, 9.8, 4.9, 7.4]\n",
"\n",
"x = range(len(y))\n",
"plt.bar(x, y)\n",
"\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Task 3. Given a list y = [5.4, 8.0, 1.0, 6.3, 2.9, 7.4, 7.9, 9.8, 4.9, 7.4], write a code to plot it in a line chart in red."
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#edit code for task 3\n",
"\n",
"import matplotlib.pyplot as plt\n",
"\n",
"y = [5.4, 8.0, 1.0, 6.3, 2.9, 7.4, 7.9, 9.8, 4.9, 7.4]\n",
"x = range(len(y))\n",
"\n",
"plt.plot(x, y, color='red')\n",
"plt.xlabel('X-axis')\n",
"plt.ylabel('Y-axis')\n",
"plt.title('Line Chart of y')\n",
"\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Task 4. Given two list \n",
"y1 = [5.4, 8.0, 1.0, 6.3, 2.9, 7.4, 7.9, 9.8, 4.9, 7.4] and \n",
"y2 = [5.49, 7.58, 1.5, 6.74, 3.33, 7.17, 7.53, 10.18, 4.63, 6.91], \n",
"\n",
"write a code to plot it in a scatterplot."
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#edit code for task 1\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# Lists y1 and y2\n",
"y1 = [5.4, 8.0, 1.0, 6.3, 2.9, 7.4, 7.9, 9.8, 4.9, 7.4]\n",
"y2 = [5.49, 7.58, 1.5, 6.74, 3.33, 7.17, 7.53, 10.18, 4.63, 6.91]\n",
"\n",
"# Create x-axis values (0 to len(y1)-1)\n",
"x = range(len(y1))\n",
"\n",
"# Plotting the scatterplot\n",
"plt.scatter(x, y1, color='red', label='y1')\n",
"plt.scatter(x, y2, color='green', label='y2')\n",
"\n",
"# Customize the chart\n",
"plt.xlabel('X-axis')\n",
"plt.ylabel('Y-axis')\n",
"plt.title('Scatterplot of y1 and y2')\n",
"plt.legend()\n",
"\n",
"# Display the chart\n",
"plt.show()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.2. Exploring Shipping Rates\n",
"\n",
"Company FE would like to explore the trend of shipping rates based on the past 300-day historical data stored in a text file. The text file has one line that contains 300 float numbers separated by commas which represent shipping rates of 300 days. Accurate values of some shipping rates may be missing, for which -1 are placed in the file.\n",
"\n",
"We need to write a program to help company FE explore the shipping rates by computing the mean, max, min, and standard deviation of the shipping rates, and plotting a line chart for the shipping rates.\n",
"\n",
"<font color = 'red'> *Note that there are multiple files of data to be explored. Thus, the text file is obtained from input. For test files, there are rates1.csv, rates2.csv, rates3.csv available in the report folder.* </font>\n",
"\n",
"To fix each missing shipping rate, we will replace it with 1000 if it is a shipping rate of day 1, and with the shipping rate one day before, otherwise."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Task 1: If you use Excel to do this work, what will be the difficulties? (2 points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<font color='red'>Edit this cell to answer the question above. </font>\n",
"\n",
"Using Excel, it is difficult to clean the data (e.g. -1 means missing data and need to be replaced by 1000)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Task 2: What is the algorithm to automate this work? What are the data structures needed? How are you going to write the code? (5 points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<font color='red'>Edit this cell to answer the question above. </font>\n",
"\n",
"Algorithm:\n",
" - open and load the csv file\n",
" - extract the data, and fix the missing data\n",
" - print the summary information\n",
" - plot the chart for the data\n",
"\n",
"Data structure:\n",
" - use list to store the fixed data\n",
"\n",
"Code:\n",
" - read the filename, \n",
" - open the csv file\n",
" - store the first row or the file in a list\n",
" - for each index t in the list:\n",
" - convert the t-th value of the list to a float number\n",
" - if t-th value is a negative value (i.e. -1)\n",
" - if t == 0, replace the value by 1000,\n",
" - otherwise, replace the value with previous value in the list\n",
" - use statistic module to print out the summary of the data set\n",
" - use matplotlib.pyplot to plot the line chart"
]
},
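{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cleaning rule described above can be sketched as a small function. This is a minimal illustration under stated assumptions, not the required solution: `fix_missing`, its `first_day_default` parameter, and the short `sample` list are hypothetical names and data chosen for the demonstration.\n",
"\n",
"```python\n",
"def fix_missing(raw_values, first_day_default=1000.0):\n",
"    # Hypothetical helper: convert text values to floats and repair missing ones.\n",
"    fixed = []\n",
"    for t, text in enumerate(raw_values):\n",
"        value = float(text)\n",
"        if value < 0:\n",
"            # Day 1 falls back to the default; later days reuse the previous fixed rate.\n",
"            value = first_day_default if t == 0 else fixed[t - 1]\n",
"        fixed.append(value)\n",
"    return fixed\n",
"\n",
"# Made-up sample with a missing first day and two missing days in a row.\n",
"sample = ['-1', '673.47', '744.41', '-1', '-1', '704.76']\n",
"print(fix_missing(sample))\n",
"```\n",
"\n",
"Reading the previous value from the already fixed list, rather than from the raw input, means that several missing days in a row all inherit the last known rate."
]
},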
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Task 3: Follow the steps below to write the code:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Step 1: Read the file using CSV module (10 points)\n",
"\n",
"In the code block below, do the followings:\n",
"(1) Import csv module;\n",
"(2) Get the file name from the input\n",
"(3) Create a CSV reader"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['-1', '673.47', '744.41', '704.76', '718.06', '692.85', '704.05', '692.75', '739.4', '732.81', '756.83', '784.98', '807.2', '793.93', '766.57', '775.2', '788.97', '816.21', '767.83', '815.16', '759.82', '774.18', '795.44', '818.97', '775.37', '744.57', '751.59', '790.03', '742.49', '729.3', '704.74', '738.04', '703.2', '680.6', '681.79', '678.32', '643.37', '665.21', '614.56', '631.88', '645.72', '675.76', '625.81', '647.13', '650.58', '-1', '718.31', '724.1', '689.75', '691.71', '709.43', '674.4', '723.07', '745.85', '718.23', '736.67', '698.81', '683.05', '689.15', '678.74', '653.48', '652.63', '691.68', '675.45', '724.02', '737.07', '729.09', '744.02', '779.52', '760.09', '734.86', '754.33', '748.43', '759.47', '715.26', '743.61', '760.48', '752.54', '788.81', '734.52', '698.19', '708.66', '717.19', '735.28', '779.24', '721.97', '756.19', '706.22', '-1', '753.07', '763.02', '799.75', '781.01', '744.33', '714.74', '738.39', '689.59', '685.7', '663.96', '646.07', '644.37', '643.67', '608.42', '620.56', '588.84', '594.6', '603.99', '582.65', '560.5', '528.91', '580.9', '584.57', '553.33', '539.88', '575.78', '579.15', '628.24', '599.92', '624.69', '649.9', '649.61', '688.76', '727.88', '676.53', '660.71', '683.41', '703.47', '744.18', '746.87', '754.21', '780.05', '787.35', '806.55', '811.64', '852.33', '825.46', '876.28', '828.24', '805.17', '793.56', '772.56', '765.29', '773.88', '752.51', '734.59', '702.27', '739.3', '753.83', '743.29', '719.5', '761.43', '723.41', '742.7', '765.92', '786.33', '768.75', '793.86', '750.99', '727.7', '685.47', '669.85', '659.55', '703.59', '653.25', '711.96', '681.98', '677.87', '706.32', '720.86', '743.91', '761.19', '748.94', '769.29', '744.04', '720.32', '719.23', '754.39', '750.76', '752.79', '744.46', '770.7', '803.88', '771.91', '767.94', '719.75', '702.61', '738.57', '771.11', '766.99', '786.74', '788.61', '780.87', '759.66', '735.75', '706.2', '721.92', '676.99', '687.91', '734.19', '676.64', '663.2', '680.68', '703.68', 
'686.25', '737', '679.4', '659.43', '620.29', '627.75', '661.33', '620.19', '604.92', '629.29', '622.97', '661.69', '654.43', '693.18', '665.88', '710.59', '-1', '706.37', '665.79', '632.08', '660.09', '647.49', '615.36', '602.14', '648.26', '674.7', '651.35', '616.62', '578.14', '571.49', '601.53', '622.59', '586.08', '626.27', '610.34', '631.69', '589.17', '619.81', '645.11', '661.52', '662.13', '686.52', '722.29', '749.14', '709.92', '763.16', '775.74', '813.38', '841.75', '851.32', '838.43', '845.76', '885.49', '882.97', '875.9', '832.18', '792.7', '802.49', '794.12', '840.51', '785.33', '811.42', '792.27', '752.01', '800.42', '783.16', '805.19', '813.51', '841.47', '869.62', '893.71', '850.99', '882.9', '917.84', '890.2', '859.08', '834.28', '813.11', '791.48', '-1', '763.82', '727.2', '709.59', '727.67', '680.91', '706.69', '723.88', '762.86', '743.91', '742.96', '710.89', '724.81', '686.67', '718.21', '704.22', '679.56', '726.62']\n"
]
}
],
"source": [
"#Edit this cell for Step 1:\n",
"#(1) Import csv module;\n",
"import csv\n",
"\n",
"#(2) Get the file name from the input\n",
"file_name = input(\"please input filename of csv file: \")\n",
"# file_name = 'rates3.csv'\n",
"\n",
"#(3) Create a CSV reader; the with block closes the file automatically\n",
"with open(file_name, newline='') as exampleFile:\n",
"    csv_reader = csv.reader(exampleFile)\n",
"\n",
"    #(4) Store the first row of the csv file in a list named rates\n",
"    rates = next(csv_reader)\n",
"\n",
"#print the rates\n",
"print(rates)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Step 2: Extract Data and Fix Missing Data (6 points)\n",
"\n",
"The next part of the program will loop through each shipping rate in the CSV file, fix the data if needed, and append the data to a list.\n",
"\n",
"(0) Define a list variable named rates with an empty initial value;\n",
"(1) Write a for loop to traverse each value of the CSV file;\n",
"(2) For each value, if it is negative, fix it by replacing it with 1000 if it is the shipping rate of day 1, and with the shipping rate of the previous day otherwise;\n",
"(3) Add each (fixed) value to the list rates.\n"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1000, 673.47, 744.41, 704.76, 718.06, 692.85, 704.05, 692.75, 739.4, 732.81, 756.83, 784.98, 807.2, 793.93, 766.57, 775.2, 788.97, 816.21, 767.83, 815.16, 759.82, 774.18, 795.44, 818.97, 775.37, 744.57, 751.59, 790.03, 742.49, 729.3, 704.74, 738.04, 703.2, 680.6, 681.79, 678.32, 643.37, 665.21, 614.56, 631.88, 645.72, 675.76, 625.81, 647.13, 650.58, 1000, 718.31, 724.1, 689.75, 691.71, 709.43, 674.4, 723.07, 745.85, 718.23, 736.67, 698.81, 683.05, 689.15, 678.74, 653.48, 652.63, 691.68, 675.45, 724.02, 737.07, 729.09, 744.02, 779.52, 760.09, 734.86, 754.33, 748.43, 759.47, 715.26, 743.61, 760.48, 752.54, 788.81, 734.52, 698.19, 708.66, 717.19, 735.28, 779.24, 721.97, 756.19, 706.22, 1000, 753.07, 763.02, 799.75, 781.01, 744.33, 714.74, 738.39, 689.59, 685.7, 663.96, 646.07, 644.37, 643.67, 608.42, 620.56, 588.84, 594.6, 603.99, 582.65, 560.5, 528.91, 580.9, 584.57, 553.33, 539.88, 575.78, 579.15, 628.24, 599.92, 624.69, 649.9, 649.61, 688.76, 727.88, 676.53, 660.71, 683.41, 703.47, 744.18, 746.87, 754.21, 780.05, 787.35, 806.55, 811.64, 852.33, 825.46, 876.28, 828.24, 805.17, 793.56, 772.56, 765.29, 773.88, 752.51, 734.59, 702.27, 739.3, 753.83, 743.29, 719.5, 761.43, 723.41, 742.7, 765.92, 786.33, 768.75, 793.86, 750.99, 727.7, 685.47, 669.85, 659.55, 703.59, 653.25, 711.96, 681.98, 677.87, 706.32, 720.86, 743.91, 761.19, 748.94, 769.29, 744.04, 720.32, 719.23, 754.39, 750.76, 752.79, 744.46, 770.7, 803.88, 771.91, 767.94, 719.75, 702.61, 738.57, 771.11, 766.99, 786.74, 788.61, 780.87, 759.66, 735.75, 706.2, 721.92, 676.99, 687.91, 734.19, 676.64, 663.2, 680.68, 703.68, 686.25, 737.0, 679.4, 659.43, 620.29, 627.75, 661.33, 620.19, 604.92, 629.29, 622.97, 661.69, 654.43, 693.18, 665.88, 710.59, 1000, 706.37, 665.79, 632.08, 660.09, 647.49, 615.36, 602.14, 648.26, 674.7, 651.35, 616.62, 578.14, 571.49, 601.53, 622.59, 586.08, 626.27, 610.34, 631.69, 589.17, 619.81, 645.11, 661.52, 662.13, 686.52, 722.29, 749.14, 709.92, 763.16, 775.74, 813.38, 841.75, 851.32, 
838.43, 845.76, 885.49, 882.97, 875.9, 832.18, 792.7, 802.49, 794.12, 840.51, 785.33, 811.42, 792.27, 752.01, 800.42, 783.16, 805.19, 813.51, 841.47, 869.62, 893.71, 850.99, 882.9, 917.84, 890.2, 859.08, 834.28, 813.11, 791.48, 1000, 763.82, 727.2, 709.59, 727.67, 680.91, 706.69, 723.88, 762.86, 743.91, 742.96, 710.89, 724.81, 686.67, 718.21, 704.22, 679.56, 726.62]\n"
]
}
],
"source": [
"#Edit this cell for Step 2:\n",
"#Write a for loop to traverse each index of the list rates, and for each index t:\n",
"for t in range(0,len(rates)):\n",
"    #convert rates[t] to a float number;\n",
"    rates[t] = float(rates[t])\n",
"\n",
"    if rates[t] < 0:\n",
"        #if rates[t] is negative, then fix the value by replacing rates[t] with 1000\n",
"        #if t is zero, and replacing rates[t] with rates[t-1], otherwise.\n",
"        if t == 0:\n",
"            rates[t] = 1000\n",
"        else:\n",
"            rates[t] = rates[t-1]\n",
"\n",
" \n",
"#print the rates\n",
"print(rates)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Step 3: Print Summary Statistics (4 points)\n",
"In the cell below, output the mean, standard deviation, maximum, and minimum of the shipping rates."
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Max: 1000\n",
"Min: 528.91\n",
"Mean: 725.4327\n",
"Standard Deviation: 80.36811707709964\n"
]
}
],
"source": [
"#Edit this cell for Step 3:\n",
"import statistics\n",
"\n",
"# Calculate the max value\n",
"max_rates = max(rates)\n",
"\n",
"# Calculate the min value\n",
"min_rates = min(rates)\n",
"\n",
"# Calculate the mean value\n",
"mean_rates = statistics.mean(rates)\n",
"\n",
"# Calculate the standard deviation\n",
"std_dev_rates = statistics.stdev(rates)\n",
"\n",
"print(f\"Max: {max_rates}\")\n",
"print(f\"Min: {min_rates}\")\n",
"print(f\"Mean: {mean_rates}\")\n",
"print(f\"Standard Deviation: {std_dev_rates}\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Step 4: Plot Line Chart (4 points)\n",
"In the cell below, plot a line chart of the shipping rates with the x-axis labeled 'day' and the y-axis labeled 'rate'."
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#Edit this cell for Step 4:\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# Number the days from 1 so that the first rate is plotted at day 1\n",
"x = range(1, len(rates) + 1)\n",
"\n",
"plt.plot(x, rates, color='red')\n",
"plt.xlabel('day')\n",
"plt.ylabel('rate')\n",
"plt.title('Line Chart of rates')\n",
"\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Task 4: Test the above code with rates1.csv, rates2.csv, and rates3.csv (3 points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<font color='red'>Edit this cell to illustrate the testing results, and paste the screenshot of the line plot for rates1.csv. </font>\n",
"\n",
"![](./images/step4_task4_rate1.png)\n",
"\n",
"```text\n",
"Max: 1000\n",
"Min: 398.26\n",
"Mean: 652.0710666666666\n",
"Standard Deviation: 99.32730165657784\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<font color='red'>Edit this cell to illustrate the testing results, and paste the screenshot of the line plot for rates2.csv. </font>\n",
"\n",
"![](./images/step4_task4_rate2.png)\n",
"\n",
"```text\n",
"Max: 1076.33\n",
"Min: 498.94\n",
"Mean: 845.1132\n",
"Standard Deviation: 138.54090769194138\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<font color='red'>Edit this cell to illustrate the testing results, and paste the screenshot of the line plot for rates3.csv. </font>\n",
"\n",
"![](./images/step4_task4_rate3.png)\n",
"\n",
"```text\n",
"Max: 1000\n",
"Min: 528.91\n",
"Mean: 725.4327\n",
"Standard Deviation: 80.36811707709964\n",
"```\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}