{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "e7d963fc",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Working with DataFrames"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "ffb6ef0a",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 1. Pandas Library"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a080c24c",
|
||
"metadata": {},
|
||
"source": [
|
||
"pandas is one of the most widely used libraries for data manipulation. We use its Series and DataFrame data structures extensively because they are far more powerful and convenient for manipulating data than Python's built-in lists and dictionaries.\n",
|
||
"\n",
|
||
"Another very popular library is NumPy. pandas is built on top of it, so we usually work with pandas directly."
|
||
]
|
||
},
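The relationship between pandas and NumPy can be seen in a minimal sketch. The variable names below are illustrative and not part of the original notebook; the sketch assumes only that a numeric Series can hand back its values as a NumPy array and that arithmetic applies element by element.

```python
import numpy as np
import pandas as pd

# Illustrative numeric Series (values are made up for the example).
prices = pd.Series([1.0, 2.0, 3.0])

# The underlying values are available as a NumPy array.
values = prices.to_numpy()

# Arithmetic on a Series is applied element-wise, as with NumPy arrays.
doubled = prices * 2

print(type(values))
print(doubled.tolist())
```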
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"id": "122da828",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import pandas as pd"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "af8c9270",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 2. Pandas Series"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "7c82119d",
|
||
"metadata": {},
|
||
"source": [
|
||
"A Series is very similar to a list, and we can easily convert a list into a simple Series. Like a list, a Series has an index; by default it is the integers 0, 1, 2, and so on."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"id": "a9bfcdbb",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"stocks = [\"AAPL\", \"BABA\", \"DIDI\", \"MSFT\", \"AMZN\", \"ADBE\", \"TSLA\", \"MS\", \"V\", \"MA\", \"GS\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"id": "c9026b35",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"stocks_series = pd.Series(stocks)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"id": "66a0bbd3",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 AAPL\n",
|
||
"1 BABA\n",
|
||
"2 DIDI\n",
|
||
"3 MSFT\n",
|
||
"4 AMZN\n",
|
||
"5 ADBE\n",
|
||
"6 TSLA\n",
|
||
"7 MS\n",
|
||
"8 V\n",
|
||
"9 MA\n",
|
||
"10 GS\n",
|
||
"dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"stocks_series"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "22f5a6b3",
|
||
"metadata": {},
|
||
"source": [
|
||
"Getting values using the index"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"id": "db7b2e69",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"1 BABA\n",
|
||
"2 DIDI\n",
|
||
"dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"stocks_series[1:3]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"id": "31dd2927",
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"'AAPL'"
|
||
]
|
||
},
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"stocks_series[0]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"id": "935307a6",
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2 DIDI\n",
|
||
"3 MSFT\n",
|
||
"4 AMZN\n",
|
||
"5 ADBE\n",
|
||
"dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"stocks_series[2:6]"
|
||
]
|
||
},
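The lookups above rely on the default integer index. As a minimal sketch (the variable names are illustrative, and the list is a shortened copy of the tickers above), `.iloc` makes positional access explicit, which stays unambiguous even when the index is no longer the default range of integers.

```python
import pandas as pd

# Shortened copy of the ticker list used above.
tickers = pd.Series(["AAPL", "BABA", "DIDI", "MSFT"])

# Single element by position.
first = tickers.iloc[0]

# Slice by position; the end position is excluded, as with a list.
middle = tickers.iloc[1:3]

print(first)
print(middle.tolist())
```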
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "88c61e98",
|
||
"metadata": {},
|
||
"source": [
|
||
"The difference between a list and a Series is that the index does not have to be an integer. With a non-integer index, a Series looks more like a dictionary, and we can create one from a dictionary."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"id": "59d345f2",
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [],
"source": [
"sales = {'Central Branch': 10000, 'TST Branch': 2000, 'Mongkok Branch': 3000}\n",
"sales_series = pd.Series(sales)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"id": "64b9ea46",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Central Branch 10000\n",
|
||
"TST Branch 2000\n",
|
||
"Mongkok Branch 3000\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"sales_series"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "b4e8c74c",
|
||
"metadata": {},
|
||
"source": [
|
||
"Getting a number using the index"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"id": "7e79920a",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"10000"
|
||
]
|
||
},
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"sales_series[\"Central Branch\"]"
|
||
]
|
||
},
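A Series built from a dictionary can be read by label or by position. The following is a minimal sketch that reuses the branch figures from this notebook; the names `central`, `first_by_position`, and `total` are illustrative only.

```python
import pandas as pd

sales = {"Central Branch": 10000, "TST Branch": 2000, "Mongkok Branch": 3000}
sales_series = pd.Series(sales)

# Label-based lookup, similar to a dictionary key.
central = sales_series.loc["Central Branch"]

# Position-based lookup still works on a labelled Series.
first_by_position = sales_series.iloc[0]

# Aggregations operate on all values at once.
total = sales_series.sum()

print(central, first_by_position, total)
```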
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"id": "cbaa8383",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"sales = {'Central Branch' : 10000,\n",
|
||
" 'TST Branch' : 2000,\n",
|
||
" 'Mongkok Branch' : 3000}"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "388d2ece",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 3. Pandas DataFrame"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "c756f438",
|
||
"metadata": {},
|
||
"source": [
|
||
"You can think of a Series as one column of data in an Excel spreadsheet. A DataFrame holds multiple Series, so you can think of it as the data of a whole spreadsheet."
|
||
]
|
||
},
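To make the spreadsheet analogy concrete, here is a minimal sketch that combines two Series into one DataFrame. The `staff` figures and all variable names are hypothetical and are used purely for illustration.

```python
import pandas as pd

# Hypothetical figures for illustration only.
sales = pd.Series({"Central Branch": 10000, "TST Branch": 2000})
staff = pd.Series({"Central Branch": 12, "TST Branch": 4})

# Each Series becomes one column; rows are aligned on the shared index.
branches = pd.DataFrame({"sales": sales, "staff": staff})

# Selecting a single column gives back a Series.
one_column = branches["sales"]

print(branches)
```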
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "12aa8a6f",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 3.1 Create a DataFrame from CSV"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"id": "0a88f1c9",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"aapl = pd.read_csv(\"AAPL.csv\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"id": "20064351",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"aapl"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"id": "a6ae2efc",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"aapl_proper_index = pd.read_csv(\"AAPL.csv\", parse_dates=True, index_col='Date')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"id": "92625a87",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Open</th>\n",
|
||
" <th>High</th>\n",
|
||
" <th>Low</th>\n",
|
||
" <th>Close</th>\n",
|
||
" <th>Adj Close</th>\n",
|
||
" <th>Volume</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-28</th>\n",
|
||
" <td>61.855000</td>\n",
|
||
" <td>62.312500</td>\n",
|
||
" <td>61.680000</td>\n",
|
||
" <td>62.262501</td>\n",
|
||
" <td>61.650810</td>\n",
|
||
" <td>96572800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-29</th>\n",
|
||
" <td>62.242500</td>\n",
|
||
" <td>62.437500</td>\n",
|
||
" <td>60.642502</td>\n",
|
||
" <td>60.822498</td>\n",
|
||
" <td>60.224953</td>\n",
|
||
" <td>142839600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-30</th>\n",
|
||
" <td>61.189999</td>\n",
|
||
" <td>61.325001</td>\n",
|
||
" <td>60.302502</td>\n",
|
||
" <td>60.814999</td>\n",
|
||
" <td>60.217525</td>\n",
|
||
" <td>124522000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-31</th>\n",
|
||
" <td>61.810001</td>\n",
|
||
" <td>62.292500</td>\n",
|
||
" <td>59.314999</td>\n",
|
||
" <td>62.189999</td>\n",
|
||
" <td>61.579021</td>\n",
|
||
" <td>139162000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-01</th>\n",
|
||
" <td>62.384998</td>\n",
|
||
" <td>63.982498</td>\n",
|
||
" <td>62.290001</td>\n",
|
||
" <td>63.955002</td>\n",
|
||
" <td>63.326683</td>\n",
|
||
" <td>151125200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-21</th>\n",
|
||
" <td>116.669998</td>\n",
|
||
" <td>118.709999</td>\n",
|
||
" <td>116.449997</td>\n",
|
||
" <td>116.870003</td>\n",
|
||
" <td>116.870003</td>\n",
|
||
" <td>89946000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-22</th>\n",
|
||
" <td>117.449997</td>\n",
|
||
" <td>118.040001</td>\n",
|
||
" <td>114.589996</td>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>101988000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-23</th>\n",
|
||
" <td>116.389999</td>\n",
|
||
" <td>116.550003</td>\n",
|
||
" <td>114.279999</td>\n",
|
||
" <td>115.040001</td>\n",
|
||
" <td>115.040001</td>\n",
|
||
" <td>82572600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-26</th>\n",
|
||
" <td>114.010002</td>\n",
|
||
" <td>116.550003</td>\n",
|
||
" <td>112.879997</td>\n",
|
||
" <td>115.050003</td>\n",
|
||
" <td>115.050003</td>\n",
|
||
" <td>111850700</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-27</th>\n",
|
||
" <td>115.489998</td>\n",
|
||
" <td>117.279999</td>\n",
|
||
" <td>114.540001</td>\n",
|
||
" <td>116.599998</td>\n",
|
||
" <td>116.599998</td>\n",
|
||
" <td>91927700</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>253 rows × 6 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Open High Low Close Adj Close \\\n",
|
||
"Date \n",
|
||
"2019-10-28 61.855000 62.312500 61.680000 62.262501 61.650810 \n",
|
||
"2019-10-29 62.242500 62.437500 60.642502 60.822498 60.224953 \n",
|
||
"2019-10-30 61.189999 61.325001 60.302502 60.814999 60.217525 \n",
|
||
"2019-10-31 61.810001 62.292500 59.314999 62.189999 61.579021 \n",
|
||
"2019-11-01 62.384998 63.982498 62.290001 63.955002 63.326683 \n",
|
||
"... ... ... ... ... ... \n",
|
||
"2020-10-21 116.669998 118.709999 116.449997 116.870003 116.870003 \n",
|
||
"2020-10-22 117.449997 118.040001 114.589996 115.750000 115.750000 \n",
|
||
"2020-10-23 116.389999 116.550003 114.279999 115.040001 115.040001 \n",
|
||
"2020-10-26 114.010002 116.550003 112.879997 115.050003 115.050003 \n",
|
||
"2020-10-27 115.489998 117.279999 114.540001 116.599998 116.599998 \n",
|
||
"\n",
|
||
" Volume \n",
|
||
"Date \n",
|
||
"2019-10-28 96572800 \n",
|
||
"2019-10-29 142839600 \n",
|
||
"2019-10-30 124522000 \n",
|
||
"2019-10-31 139162000 \n",
|
||
"2019-11-01 151125200 \n",
|
||
"... ... \n",
|
||
"2020-10-21 89946000 \n",
|
||
"2020-10-22 101988000 \n",
|
||
"2020-10-23 82572600 \n",
|
||
"2020-10-26 111850700 \n",
|
||
"2020-10-27 91927700 \n",
|
||
"\n",
|
||
"[253 rows x 6 columns]"
|
||
]
|
||
},
|
||
"execution_count": 15,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index"
|
||
]
|
||
},
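The effect of `parse_dates=True` together with `index_col='Date'` can be tried without the CSV file. The sketch below is an assumption-laden stand-in: it feeds the first two rows shown in the output above to `pd.read_csv` through an in-memory buffer instead of reading `AAPL.csv`, and the variable names are illustrative.

```python
import io

import pandas as pd

# First two rows of the table above, inlined so no file is needed.
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2019-10-28,61.855000,62.312500,61.680000,62.262501,61.650810,96572800
2019-10-29,62.242500,62.437500,60.642502,60.822498,60.224953,142839600
"""

# index_col moves the Date column into the index; parse_dates=True
# converts that index from strings to datetimes.
prices = pd.read_csv(io.StringIO(csv_text), parse_dates=True, index_col="Date")

is_datetime_index = isinstance(prices.index, pd.DatetimeIndex)

# Rows can then be looked up by date.
close_on_28th = prices.loc[pd.Timestamp("2019-10-28"), "Close"]

print(is_datetime_index, close_on_28th)
```

With a datetime index, rows can be selected by date rather than by row number, which is usually more convenient for price histories.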
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "7a3a1988",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 3.2 From Quandl"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 18,
|
||
"id": "da0ee6dc",
|
||
"metadata": {
|
||
"scrolled": false
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"import quandl"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"id": "3d0ac03e",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
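"# Replace the key below with your own Quandl API key.\n",
"quandl.ApiConfig.api_key = 'x9M_pZutNNPnha1WDdjZ'\n",
"# Daily HKEX data for stock code 00001 over one year.\n",
"ck = quandl.get('HKEX/00001', start_date='2020-10-20', end_date='2021-10-20')"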
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"id": "ebda1d4c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Nominal Price</th>\n",
|
||
" <th>Net Change</th>\n",
|
||
" <th>Change (%)</th>\n",
|
||
" <th>Bid</th>\n",
|
||
" <th>Ask</th>\n",
|
||
" <th>P/E(x)</th>\n",
|
||
" <th>High</th>\n",
|
||
" <th>Low</th>\n",
|
||
" <th>Previous Close</th>\n",
|
||
" <th>Share Volume (000)</th>\n",
|
||
" <th>Turnover (000)</th>\n",
|
||
" <th>Lot Size</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-20</th>\n",
|
||
" <td>46.05</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>46.05</td>\n",
|
||
" <td>46.10</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>46.35</td>\n",
|
||
" <td>45.85</td>\n",
|
||
" <td>46.35</td>\n",
|
||
" <td>4193.0</td>\n",
|
||
" <td>192921.0</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-21</th>\n",
|
||
" <td>46.15</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>46.15</td>\n",
|
||
" <td>46.20</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>46.50</td>\n",
|
||
" <td>45.95</td>\n",
|
||
" <td>46.05</td>\n",
|
||
" <td>4830.0</td>\n",
|
||
" <td>223077.0</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-22</th>\n",
|
||
" <td>46.10</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>46.10</td>\n",
|
||
" <td>46.15</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>46.35</td>\n",
|
||
" <td>45.90</td>\n",
|
||
" <td>46.15</td>\n",
|
||
" <td>4902.0</td>\n",
|
||
" <td>226000.0</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-23</th>\n",
|
||
" <td>46.40</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>46.40</td>\n",
|
||
" <td>46.45</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>46.50</td>\n",
|
||
" <td>45.80</td>\n",
|
||
" <td>46.10</td>\n",
|
||
" <td>3815.0</td>\n",
|
||
" <td>176451.0</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-27</th>\n",
|
||
" <td>46.75</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>46.75</td>\n",
|
||
" <td>46.80</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>47.15</td>\n",
|
||
" <td>46.50</td>\n",
|
||
" <td>46.40</td>\n",
|
||
" <td>12095.0</td>\n",
|
||
" <td>566845.0</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2021-10-12</th>\n",
|
||
" <td>52.60</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>52.60</td>\n",
|
||
" <td>52.65</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>53.10</td>\n",
|
||
" <td>52.35</td>\n",
|
||
" <td>52.95</td>\n",
|
||
" <td>2712.0</td>\n",
|
||
" <td>142802.0</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2021-10-15</th>\n",
|
||
" <td>52.50</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>52.50</td>\n",
|
||
" <td>52.55</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>52.95</td>\n",
|
||
" <td>52.00</td>\n",
|
||
" <td>52.60</td>\n",
|
||
" <td>5067.0</td>\n",
|
||
" <td>266129.0</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2021-10-18</th>\n",
|
||
" <td>52.70</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>52.70</td>\n",
|
||
" <td>52.75</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>53.00</td>\n",
|
||
" <td>52.30</td>\n",
|
||
" <td>52.50</td>\n",
|
||
" <td>4036.0</td>\n",
|
||
" <td>212393.0</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2021-10-19</th>\n",
|
||
" <td>53.30</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>53.25</td>\n",
|
||
" <td>53.30</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>53.50</td>\n",
|
||
" <td>52.95</td>\n",
|
||
" <td>52.70</td>\n",
|
||
" <td>2484.0</td>\n",
|
||
" <td>132423.0</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2021-10-20</th>\n",
|
||
" <td>53.25</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>53.20</td>\n",
|
||
" <td>53.25</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>53.40</td>\n",
|
||
" <td>52.80</td>\n",
|
||
" <td>53.30</td>\n",
|
||
" <td>3649.0</td>\n",
|
||
" <td>193972.0</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>247 rows × 12 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Nominal Price Net Change Change (%) Bid Ask P/E(x) High \\\n",
|
||
"Date \n",
|
||
"2020-10-20 46.05 None None 46.05 46.10 None 46.35 \n",
|
||
"2020-10-21 46.15 None None 46.15 46.20 None 46.50 \n",
|
||
"2020-10-22 46.10 None None 46.10 46.15 None 46.35 \n",
|
||
"2020-10-23 46.40 None None 46.40 46.45 None 46.50 \n",
|
||
"2020-10-27 46.75 None None 46.75 46.80 None 47.15 \n",
|
||
"... ... ... ... ... ... ... ... \n",
|
||
"2021-10-12 52.60 None None 52.60 52.65 None 53.10 \n",
|
||
"2021-10-15 52.50 None None 52.50 52.55 None 52.95 \n",
|
||
"2021-10-18 52.70 None None 52.70 52.75 None 53.00 \n",
|
||
"2021-10-19 53.30 None None 53.25 53.30 None 53.50 \n",
|
||
"2021-10-20 53.25 None None 53.20 53.25 None 53.40 \n",
|
||
"\n",
|
||
" Low Previous Close Share Volume (000) Turnover (000) Lot Size \n",
|
||
"Date \n",
|
||
"2020-10-20 45.85 46.35 4193.0 192921.0 None \n",
|
||
"2020-10-21 45.95 46.05 4830.0 223077.0 None \n",
|
||
"2020-10-22 45.90 46.15 4902.0 226000.0 None \n",
|
||
"2020-10-23 45.80 46.10 3815.0 176451.0 None \n",
|
||
"2020-10-27 46.50 46.40 12095.0 566845.0 None \n",
|
||
"... ... ... ... ... ... \n",
|
||
"2021-10-12 52.35 52.95 2712.0 142802.0 None \n",
|
||
"2021-10-15 52.00 52.60 5067.0 266129.0 None \n",
|
||
"2021-10-18 52.30 52.50 4036.0 212393.0 None \n",
|
||
"2021-10-19 52.95 52.70 2484.0 132423.0 None \n",
|
||
"2021-10-20 52.80 53.30 3649.0 193972.0 None \n",
|
||
"\n",
|
||
"[247 rows x 12 columns]"
|
||
]
|
||
},
|
||
"execution_count": 20,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"ck"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "db245a3b",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 3.3 From Series"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"id": "c389d843",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"costs = {'Central Branch' : 300000,\n",
|
||
" 'TST Branch' : 50000,\n",
|
||
" 'Mongkok Branch' : 20000}"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 22,
|
||
"id": "b6936c74",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"ename": "NameError",
|
||
"evalue": "name 'sales' is not defined",
|
||
"output_type": "error",
|
||
"traceback": [
|
||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
||
"Cell \u001b[0;32mIn [22], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m branch_summary \u001b[38;5;241m=\u001b[39m pd\u001b[38;5;241m.\u001b[39mDataFrame({\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msales\u001b[39m\u001b[38;5;124m\"\u001b[39m: \u001b[43msales\u001b[49m, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcosts\u001b[39m\u001b[38;5;124m\"\u001b[39m: costs})\n",
|
||
"\u001b[0;31mNameError\u001b[0m: name 'sales' is not defined"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"branch_summary = pd.DataFrame({\"sales\": sales, \"costs\": costs})"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"id": "9107a2e3",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sales</th>\n",
|
||
" <th>costs</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>Central Branch</th>\n",
|
||
" <td>10000</td>\n",
|
||
" <td>300000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>TST Branch</th>\n",
|
||
" <td>2000</td>\n",
|
||
" <td>50000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Mongkok Branch</th>\n",
|
||
" <td>3000</td>\n",
|
||
" <td>20000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sales costs\n",
|
||
"Central Branch 10000 300000\n",
|
||
"TST Branch 2000 50000\n",
|
||
"Mongkok Branch 3000 20000"
|
||
]
|
||
},
|
||
"execution_count": 21,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"branch_summary"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "5bab7a42",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 3.4 Getting data from dataframe (getting rows with date)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"id": "6a84288d",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"ename": "NameError",
|
||
"evalue": "name 'aapl_proper_index' is not defined",
|
||
"output_type": "error",
|
||
"traceback": [
|
||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
||
"Cell \u001b[0;32mIn [23], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43maapl_proper_index\u001b[49m\u001b[38;5;241m.\u001b[39mloc[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m2019-10-30\u001b[39m\u001b[38;5;124m\"\u001b[39m]\n",
|
||
"\u001b[0;31mNameError\u001b[0m: name 'aapl_proper_index' is not defined"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index.loc[\"2019-10-30\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"id": "60c748c1",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Open</th>\n",
|
||
" <th>High</th>\n",
|
||
" <th>Low</th>\n",
|
||
" <th>Close</th>\n",
|
||
" <th>Adj Close</th>\n",
|
||
" <th>Volume</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-30</th>\n",
|
||
" <td>61.189999</td>\n",
|
||
" <td>61.325001</td>\n",
|
||
" <td>60.302502</td>\n",
|
||
" <td>60.814999</td>\n",
|
||
" <td>60.217525</td>\n",
|
||
" <td>124522000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-31</th>\n",
|
||
" <td>61.810001</td>\n",
|
||
" <td>62.292500</td>\n",
|
||
" <td>59.314999</td>\n",
|
||
" <td>62.189999</td>\n",
|
||
" <td>61.579021</td>\n",
|
||
" <td>139162000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-01</th>\n",
|
||
" <td>62.384998</td>\n",
|
||
" <td>63.982498</td>\n",
|
||
" <td>62.290001</td>\n",
|
||
" <td>63.955002</td>\n",
|
||
" <td>63.326683</td>\n",
|
||
" <td>151125200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-04</th>\n",
|
||
" <td>64.332497</td>\n",
|
||
" <td>64.462502</td>\n",
|
||
" <td>63.845001</td>\n",
|
||
" <td>64.375000</td>\n",
|
||
" <td>63.742554</td>\n",
|
||
" <td>103272000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-05</th>\n",
|
||
" <td>64.262497</td>\n",
|
||
" <td>64.547501</td>\n",
|
||
" <td>64.080002</td>\n",
|
||
" <td>64.282501</td>\n",
|
||
" <td>63.650970</td>\n",
|
||
" <td>79897600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-06</th>\n",
|
||
" <td>64.192497</td>\n",
|
||
" <td>64.372498</td>\n",
|
||
" <td>63.842499</td>\n",
|
||
" <td>64.309998</td>\n",
|
||
" <td>63.678192</td>\n",
|
||
" <td>75864400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-07</th>\n",
|
||
" <td>64.684998</td>\n",
|
||
" <td>65.087502</td>\n",
|
||
" <td>64.527496</td>\n",
|
||
" <td>64.857498</td>\n",
|
||
" <td>64.413116</td>\n",
|
||
" <td>94940400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-08</th>\n",
|
||
" <td>64.672501</td>\n",
|
||
" <td>65.110001</td>\n",
|
||
" <td>64.212502</td>\n",
|
||
" <td>65.035004</td>\n",
|
||
" <td>64.589409</td>\n",
|
||
" <td>69986400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-11</th>\n",
|
||
" <td>64.574997</td>\n",
|
||
" <td>65.617500</td>\n",
|
||
" <td>64.570000</td>\n",
|
||
" <td>65.550003</td>\n",
|
||
" <td>65.100876</td>\n",
|
||
" <td>81821200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-12</th>\n",
|
||
" <td>65.387497</td>\n",
|
||
" <td>65.697502</td>\n",
|
||
" <td>65.230003</td>\n",
|
||
" <td>65.489998</td>\n",
|
||
" <td>65.041283</td>\n",
|
||
" <td>87388800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-13</th>\n",
|
||
" <td>65.282501</td>\n",
|
||
" <td>66.195000</td>\n",
|
||
" <td>65.267502</td>\n",
|
||
" <td>66.117500</td>\n",
|
||
" <td>65.664490</td>\n",
|
||
" <td>102734400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-14</th>\n",
|
||
" <td>65.937500</td>\n",
|
||
" <td>66.220001</td>\n",
|
||
" <td>65.525002</td>\n",
|
||
" <td>65.660004</td>\n",
|
||
" <td>65.210121</td>\n",
|
||
" <td>89182800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-15</th>\n",
|
||
" <td>65.919998</td>\n",
|
||
" <td>66.445000</td>\n",
|
||
" <td>65.752502</td>\n",
|
||
" <td>66.440002</td>\n",
|
||
" <td>65.984779</td>\n",
|
||
" <td>100206400</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Open High Low Close Adj Close Volume\n",
|
||
"Date \n",
|
||
"2019-10-30 61.189999 61.325001 60.302502 60.814999 60.217525 124522000\n",
|
||
"2019-10-31 61.810001 62.292500 59.314999 62.189999 61.579021 139162000\n",
|
||
"2019-11-01 62.384998 63.982498 62.290001 63.955002 63.326683 151125200\n",
|
||
"2019-11-04 64.332497 64.462502 63.845001 64.375000 63.742554 103272000\n",
|
||
"2019-11-05 64.262497 64.547501 64.080002 64.282501 63.650970 79897600\n",
|
||
"2019-11-06 64.192497 64.372498 63.842499 64.309998 63.678192 75864400\n",
|
||
"2019-11-07 64.684998 65.087502 64.527496 64.857498 64.413116 94940400\n",
|
||
"2019-11-08 64.672501 65.110001 64.212502 65.035004 64.589409 69986400\n",
|
||
"2019-11-11 64.574997 65.617500 64.570000 65.550003 65.100876 81821200\n",
|
||
"2019-11-12 65.387497 65.697502 65.230003 65.489998 65.041283 87388800\n",
|
||
"2019-11-13 65.282501 66.195000 65.267502 66.117500 65.664490 102734400\n",
|
||
"2019-11-14 65.937500 66.220001 65.525002 65.660004 65.210121 89182800\n",
|
||
"2019-11-15 65.919998 66.445000 65.752502 66.440002 65.984779 100206400"
|
||
]
|
||
},
|
||
"execution_count": 23,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index.loc[\"2019-10-30\":\"2019-11-15\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 24,
|
||
"id": "b0c747d8",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Open</th>\n",
|
||
" <th>High</th>\n",
|
||
" <th>Low</th>\n",
|
||
" <th>Close</th>\n",
|
||
" <th>Adj Close</th>\n",
|
||
" <th>Volume</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-01</th>\n",
|
||
" <td>62.384998</td>\n",
|
||
" <td>63.982498</td>\n",
|
||
" <td>62.290001</td>\n",
|
||
" <td>63.955002</td>\n",
|
||
" <td>63.326683</td>\n",
|
||
" <td>151125200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-04</th>\n",
|
||
" <td>64.332497</td>\n",
|
||
" <td>64.462502</td>\n",
|
||
" <td>63.845001</td>\n",
|
||
" <td>64.375000</td>\n",
|
||
" <td>63.742554</td>\n",
|
||
" <td>103272000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-05</th>\n",
|
||
" <td>64.262497</td>\n",
|
||
" <td>64.547501</td>\n",
|
||
" <td>64.080002</td>\n",
|
||
" <td>64.282501</td>\n",
|
||
" <td>63.650970</td>\n",
|
||
" <td>79897600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-06</th>\n",
|
||
" <td>64.192497</td>\n",
|
||
" <td>64.372498</td>\n",
|
||
" <td>63.842499</td>\n",
|
||
" <td>64.309998</td>\n",
|
||
" <td>63.678192</td>\n",
|
||
" <td>75864400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-07</th>\n",
|
||
" <td>64.684998</td>\n",
|
||
" <td>65.087502</td>\n",
|
||
" <td>64.527496</td>\n",
|
||
" <td>64.857498</td>\n",
|
||
" <td>64.413116</td>\n",
|
||
" <td>94940400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-08</th>\n",
|
||
" <td>64.672501</td>\n",
|
||
" <td>65.110001</td>\n",
|
||
" <td>64.212502</td>\n",
|
||
" <td>65.035004</td>\n",
|
||
" <td>64.589409</td>\n",
|
||
" <td>69986400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-11</th>\n",
|
||
" <td>64.574997</td>\n",
|
||
" <td>65.617500</td>\n",
|
||
" <td>64.570000</td>\n",
|
||
" <td>65.550003</td>\n",
|
||
" <td>65.100876</td>\n",
|
||
" <td>81821200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-12</th>\n",
|
||
" <td>65.387497</td>\n",
|
||
" <td>65.697502</td>\n",
|
||
" <td>65.230003</td>\n",
|
||
" <td>65.489998</td>\n",
|
||
" <td>65.041283</td>\n",
|
||
" <td>87388800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-13</th>\n",
|
||
" <td>65.282501</td>\n",
|
||
" <td>66.195000</td>\n",
|
||
" <td>65.267502</td>\n",
|
||
" <td>66.117500</td>\n",
|
||
" <td>65.664490</td>\n",
|
||
" <td>102734400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-14</th>\n",
|
||
" <td>65.937500</td>\n",
|
||
" <td>66.220001</td>\n",
|
||
" <td>65.525002</td>\n",
|
||
" <td>65.660004</td>\n",
|
||
" <td>65.210121</td>\n",
|
||
" <td>89182800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-15</th>\n",
|
||
" <td>65.919998</td>\n",
|
||
" <td>66.445000</td>\n",
|
||
" <td>65.752502</td>\n",
|
||
" <td>66.440002</td>\n",
|
||
" <td>65.984779</td>\n",
|
||
" <td>100206400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-18</th>\n",
|
||
" <td>66.449997</td>\n",
|
||
" <td>66.857498</td>\n",
|
||
" <td>66.057503</td>\n",
|
||
" <td>66.775002</td>\n",
|
||
" <td>66.317490</td>\n",
|
||
" <td>86703200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-19</th>\n",
|
||
" <td>66.974998</td>\n",
|
||
" <td>67.000000</td>\n",
|
||
" <td>66.347504</td>\n",
|
||
" <td>66.572502</td>\n",
|
||
" <td>66.116371</td>\n",
|
||
" <td>76167200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-20</th>\n",
|
||
" <td>66.385002</td>\n",
|
||
" <td>66.519997</td>\n",
|
||
" <td>65.099998</td>\n",
|
||
" <td>65.797501</td>\n",
|
||
" <td>65.346687</td>\n",
|
||
" <td>106234400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-21</th>\n",
|
||
" <td>65.922501</td>\n",
|
||
" <td>66.002502</td>\n",
|
||
" <td>65.294998</td>\n",
|
||
" <td>65.502502</td>\n",
|
||
" <td>65.053703</td>\n",
|
||
" <td>121395200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-22</th>\n",
|
||
" <td>65.647499</td>\n",
|
||
" <td>65.794998</td>\n",
|
||
" <td>65.209999</td>\n",
|
||
" <td>65.445000</td>\n",
|
||
" <td>64.996597</td>\n",
|
||
" <td>65325200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-25</th>\n",
|
||
" <td>65.677498</td>\n",
|
||
" <td>66.610001</td>\n",
|
||
" <td>65.629997</td>\n",
|
||
" <td>66.592499</td>\n",
|
||
" <td>66.136230</td>\n",
|
||
" <td>84020400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-26</th>\n",
|
||
" <td>66.735001</td>\n",
|
||
" <td>66.790001</td>\n",
|
||
" <td>65.625000</td>\n",
|
||
" <td>66.072502</td>\n",
|
||
" <td>65.619789</td>\n",
|
||
" <td>105207600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-27</th>\n",
|
||
" <td>66.394997</td>\n",
|
||
" <td>66.995003</td>\n",
|
||
" <td>66.327499</td>\n",
|
||
" <td>66.959999</td>\n",
|
||
" <td>66.501213</td>\n",
|
||
" <td>65235600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-29</th>\n",
|
||
" <td>66.650002</td>\n",
|
||
" <td>67.000000</td>\n",
|
||
" <td>66.474998</td>\n",
|
||
" <td>66.812500</td>\n",
|
||
" <td>66.354729</td>\n",
|
||
" <td>46617600</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Open High Low Close Adj Close Volume\n",
|
||
"Date \n",
|
||
"2019-11-01 62.384998 63.982498 62.290001 63.955002 63.326683 151125200\n",
|
||
"2019-11-04 64.332497 64.462502 63.845001 64.375000 63.742554 103272000\n",
|
||
"2019-11-05 64.262497 64.547501 64.080002 64.282501 63.650970 79897600\n",
|
||
"2019-11-06 64.192497 64.372498 63.842499 64.309998 63.678192 75864400\n",
|
||
"2019-11-07 64.684998 65.087502 64.527496 64.857498 64.413116 94940400\n",
|
||
"2019-11-08 64.672501 65.110001 64.212502 65.035004 64.589409 69986400\n",
|
||
"2019-11-11 64.574997 65.617500 64.570000 65.550003 65.100876 81821200\n",
|
||
"2019-11-12 65.387497 65.697502 65.230003 65.489998 65.041283 87388800\n",
|
||
"2019-11-13 65.282501 66.195000 65.267502 66.117500 65.664490 102734400\n",
|
||
"2019-11-14 65.937500 66.220001 65.525002 65.660004 65.210121 89182800\n",
|
||
"2019-11-15 65.919998 66.445000 65.752502 66.440002 65.984779 100206400\n",
|
||
"2019-11-18 66.449997 66.857498 66.057503 66.775002 66.317490 86703200\n",
|
||
"2019-11-19 66.974998 67.000000 66.347504 66.572502 66.116371 76167200\n",
|
||
"2019-11-20 66.385002 66.519997 65.099998 65.797501 65.346687 106234400\n",
|
||
"2019-11-21 65.922501 66.002502 65.294998 65.502502 65.053703 121395200\n",
|
||
"2019-11-22 65.647499 65.794998 65.209999 65.445000 64.996597 65325200\n",
|
||
"2019-11-25 65.677498 66.610001 65.629997 66.592499 66.136230 84020400\n",
|
||
"2019-11-26 66.735001 66.790001 65.625000 66.072502 65.619789 105207600\n",
|
||
"2019-11-27 66.394997 66.995003 66.327499 66.959999 66.501213 65235600\n",
|
||
"2019-11-29 66.650002 67.000000 66.474998 66.812500 66.354729 46617600"
|
||
]
|
||
},
|
||
"execution_count": 24,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index.loc[\"2019-11\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "de65e36f",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 3.5 Getting data from dataframe (get a series)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 25,
|
||
"id": "ff87318b",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Date\n",
|
||
"2019-11-01 63.955002\n",
|
||
"2019-11-04 64.375000\n",
|
||
"2019-11-05 64.282501\n",
|
||
"2019-11-06 64.309998\n",
|
||
"2019-11-07 64.857498\n",
|
||
"2019-11-08 65.035004\n",
|
||
"2019-11-11 65.550003\n",
|
||
"2019-11-12 65.489998\n",
|
||
"2019-11-13 66.117500\n",
|
||
"2019-11-14 65.660004\n",
|
||
"2019-11-15 66.440002\n",
|
||
"2019-11-18 66.775002\n",
|
||
"2019-11-19 66.572502\n",
|
||
"2019-11-20 65.797501\n",
|
||
"2019-11-21 65.502502\n",
|
||
"2019-11-22 65.445000\n",
|
||
"2019-11-25 66.592499\n",
|
||
"2019-11-26 66.072502\n",
|
||
"2019-11-27 66.959999\n",
|
||
"2019-11-29 66.812500\n",
|
||
"Name: Close, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index.loc[\"2019-11\"][\"Close\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "1b48da3d",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 3.6 Getting data from dataframe (get multiple column from a dataframe)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 26,
|
||
"id": "a39c7d23",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Open</th>\n",
|
||
" <th>Close</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-01</th>\n",
|
||
" <td>62.384998</td>\n",
|
||
" <td>63.955002</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-04</th>\n",
|
||
" <td>64.332497</td>\n",
|
||
" <td>64.375000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-05</th>\n",
|
||
" <td>64.262497</td>\n",
|
||
" <td>64.282501</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-06</th>\n",
|
||
" <td>64.192497</td>\n",
|
||
" <td>64.309998</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-07</th>\n",
|
||
" <td>64.684998</td>\n",
|
||
" <td>64.857498</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-08</th>\n",
|
||
" <td>64.672501</td>\n",
|
||
" <td>65.035004</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-11</th>\n",
|
||
" <td>64.574997</td>\n",
|
||
" <td>65.550003</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-12</th>\n",
|
||
" <td>65.387497</td>\n",
|
||
" <td>65.489998</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-13</th>\n",
|
||
" <td>65.282501</td>\n",
|
||
" <td>66.117500</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-14</th>\n",
|
||
" <td>65.937500</td>\n",
|
||
" <td>65.660004</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-15</th>\n",
|
||
" <td>65.919998</td>\n",
|
||
" <td>66.440002</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-18</th>\n",
|
||
" <td>66.449997</td>\n",
|
||
" <td>66.775002</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-19</th>\n",
|
||
" <td>66.974998</td>\n",
|
||
" <td>66.572502</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-20</th>\n",
|
||
" <td>66.385002</td>\n",
|
||
" <td>65.797501</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-21</th>\n",
|
||
" <td>65.922501</td>\n",
|
||
" <td>65.502502</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-22</th>\n",
|
||
" <td>65.647499</td>\n",
|
||
" <td>65.445000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-25</th>\n",
|
||
" <td>65.677498</td>\n",
|
||
" <td>66.592499</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-26</th>\n",
|
||
" <td>66.735001</td>\n",
|
||
" <td>66.072502</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-27</th>\n",
|
||
" <td>66.394997</td>\n",
|
||
" <td>66.959999</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-29</th>\n",
|
||
" <td>66.650002</td>\n",
|
||
" <td>66.812500</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Open Close\n",
|
||
"Date \n",
|
||
"2019-11-01 62.384998 63.955002\n",
|
||
"2019-11-04 64.332497 64.375000\n",
|
||
"2019-11-05 64.262497 64.282501\n",
|
||
"2019-11-06 64.192497 64.309998\n",
|
||
"2019-11-07 64.684998 64.857498\n",
|
||
"2019-11-08 64.672501 65.035004\n",
|
||
"2019-11-11 64.574997 65.550003\n",
|
||
"2019-11-12 65.387497 65.489998\n",
|
||
"2019-11-13 65.282501 66.117500\n",
|
||
"2019-11-14 65.937500 65.660004\n",
|
||
"2019-11-15 65.919998 66.440002\n",
|
||
"2019-11-18 66.449997 66.775002\n",
|
||
"2019-11-19 66.974998 66.572502\n",
|
||
"2019-11-20 66.385002 65.797501\n",
|
||
"2019-11-21 65.922501 65.502502\n",
|
||
"2019-11-22 65.647499 65.445000\n",
|
||
"2019-11-25 65.677498 66.592499\n",
|
||
"2019-11-26 66.735001 66.072502\n",
|
||
"2019-11-27 66.394997 66.959999\n",
|
||
"2019-11-29 66.650002 66.812500"
|
||
]
|
||
},
|
||
"execution_count": 26,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index.loc[\"2019-11\"][[\"Open\",\"Close\"]]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "7fe98d18",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 3.7 Getting data from dataframe (that's not a date/integer)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 27,
|
||
"id": "a3e09fd0",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"sales 10000\n",
|
||
"costs 300000\n",
|
||
"Name: Central Branch, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 27,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"branch_summary.loc[\"Central Branch\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 28,
|
||
"id": "114ba40f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Central Branch 10000\n",
|
||
"TST Branch 2000\n",
|
||
"Mongkok Branch 3000\n",
|
||
"Name: sales, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 28,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"branch_summary[\"sales\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "15222ed8",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 3.8 Getting data from dataframe (using implicit index)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 29,
|
||
"id": "72c1e9ba",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Index(['Central Branch', 'TST Branch', 'Mongkok Branch'], dtype='object')"
|
||
]
|
||
},
|
||
"execution_count": 29,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"branch_summary.index"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 30,
|
||
"id": "e34b9495",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DatetimeIndex(['2019-10-28', '2019-10-29', '2019-10-30', '2019-10-31',\n",
|
||
" '2019-11-01', '2019-11-04', '2019-11-05', '2019-11-06',\n",
|
||
" '2019-11-07', '2019-11-08',\n",
|
||
" ...\n",
|
||
" '2020-10-14', '2020-10-15', '2020-10-16', '2020-10-19',\n",
|
||
" '2020-10-20', '2020-10-21', '2020-10-22', '2020-10-23',\n",
|
||
" '2020-10-26', '2020-10-27'],\n",
|
||
" dtype='datetime64[ns]', name='Date', length=253, freq=None)"
|
||
]
|
||
},
|
||
"execution_count": 30,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index.index"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 31,
|
||
"id": "a32d711d",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"RangeIndex(start=0, stop=253, step=1)"
|
||
]
|
||
},
|
||
"execution_count": 31,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl.index"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 32,
|
||
"id": "0836e844",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Open</th>\n",
|
||
" <th>High</th>\n",
|
||
" <th>Low</th>\n",
|
||
" <th>Close</th>\n",
|
||
" <th>Adj Close</th>\n",
|
||
" <th>Volume</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-28</th>\n",
|
||
" <td>61.855000</td>\n",
|
||
" <td>62.312500</td>\n",
|
||
" <td>61.680000</td>\n",
|
||
" <td>62.262501</td>\n",
|
||
" <td>61.650810</td>\n",
|
||
" <td>96572800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-29</th>\n",
|
||
" <td>62.242500</td>\n",
|
||
" <td>62.437500</td>\n",
|
||
" <td>60.642502</td>\n",
|
||
" <td>60.822498</td>\n",
|
||
" <td>60.224953</td>\n",
|
||
" <td>142839600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-30</th>\n",
|
||
" <td>61.189999</td>\n",
|
||
" <td>61.325001</td>\n",
|
||
" <td>60.302502</td>\n",
|
||
" <td>60.814999</td>\n",
|
||
" <td>60.217525</td>\n",
|
||
" <td>124522000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-31</th>\n",
|
||
" <td>61.810001</td>\n",
|
||
" <td>62.292500</td>\n",
|
||
" <td>59.314999</td>\n",
|
||
" <td>62.189999</td>\n",
|
||
" <td>61.579021</td>\n",
|
||
" <td>139162000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-01</th>\n",
|
||
" <td>62.384998</td>\n",
|
||
" <td>63.982498</td>\n",
|
||
" <td>62.290001</td>\n",
|
||
" <td>63.955002</td>\n",
|
||
" <td>63.326683</td>\n",
|
||
" <td>151125200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-04</th>\n",
|
||
" <td>64.332497</td>\n",
|
||
" <td>64.462502</td>\n",
|
||
" <td>63.845001</td>\n",
|
||
" <td>64.375000</td>\n",
|
||
" <td>63.742554</td>\n",
|
||
" <td>103272000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-05</th>\n",
|
||
" <td>64.262497</td>\n",
|
||
" <td>64.547501</td>\n",
|
||
" <td>64.080002</td>\n",
|
||
" <td>64.282501</td>\n",
|
||
" <td>63.650970</td>\n",
|
||
" <td>79897600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-06</th>\n",
|
||
" <td>64.192497</td>\n",
|
||
" <td>64.372498</td>\n",
|
||
" <td>63.842499</td>\n",
|
||
" <td>64.309998</td>\n",
|
||
" <td>63.678192</td>\n",
|
||
" <td>75864400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-07</th>\n",
|
||
" <td>64.684998</td>\n",
|
||
" <td>65.087502</td>\n",
|
||
" <td>64.527496</td>\n",
|
||
" <td>64.857498</td>\n",
|
||
" <td>64.413116</td>\n",
|
||
" <td>94940400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-08</th>\n",
|
||
" <td>64.672501</td>\n",
|
||
" <td>65.110001</td>\n",
|
||
" <td>64.212502</td>\n",
|
||
" <td>65.035004</td>\n",
|
||
" <td>64.589409</td>\n",
|
||
" <td>69986400</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Open High Low Close Adj Close Volume\n",
|
||
"Date \n",
|
||
"2019-10-28 61.855000 62.312500 61.680000 62.262501 61.650810 96572800\n",
|
||
"2019-10-29 62.242500 62.437500 60.642502 60.822498 60.224953 142839600\n",
|
||
"2019-10-30 61.189999 61.325001 60.302502 60.814999 60.217525 124522000\n",
|
||
"2019-10-31 61.810001 62.292500 59.314999 62.189999 61.579021 139162000\n",
|
||
"2019-11-01 62.384998 63.982498 62.290001 63.955002 63.326683 151125200\n",
|
||
"2019-11-04 64.332497 64.462502 63.845001 64.375000 63.742554 103272000\n",
|
||
"2019-11-05 64.262497 64.547501 64.080002 64.282501 63.650970 79897600\n",
|
||
"2019-11-06 64.192497 64.372498 63.842499 64.309998 63.678192 75864400\n",
|
||
"2019-11-07 64.684998 65.087502 64.527496 64.857498 64.413116 94940400\n",
|
||
"2019-11-08 64.672501 65.110001 64.212502 65.035004 64.589409 69986400"
|
||
]
|
||
},
|
||
"execution_count": 32,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index.iloc[0:10]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d7a78ecf",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 4. Filtering"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "7706e9dd",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 4.1 Single condition"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 33,
|
||
"id": "ba29f518",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Date\n",
|
||
"2019-10-28 False\n",
|
||
"2019-10-29 False\n",
|
||
"2019-10-30 False\n",
|
||
"2019-10-31 False\n",
|
||
"2019-11-01 False\n",
|
||
" ... \n",
|
||
"2020-10-21 True\n",
|
||
"2020-10-22 True\n",
|
||
"2020-10-23 True\n",
|
||
"2020-10-26 True\n",
|
||
"2020-10-27 True\n",
|
||
"Name: Open, Length: 253, dtype: bool"
|
||
]
|
||
},
|
||
"execution_count": 33,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[\"Open\"] > 100"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 34,
|
||
"id": "22ac3ad1",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Open</th>\n",
|
||
" <th>High</th>\n",
|
||
" <th>Low</th>\n",
|
||
" <th>Close</th>\n",
|
||
" <th>Adj Close</th>\n",
|
||
" <th>Volume</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2020-07-31</th>\n",
|
||
" <td>102.885002</td>\n",
|
||
" <td>106.415001</td>\n",
|
||
" <td>100.824997</td>\n",
|
||
" <td>106.260002</td>\n",
|
||
" <td>106.068756</td>\n",
|
||
" <td>374336800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-03</th>\n",
|
||
" <td>108.199997</td>\n",
|
||
" <td>111.637497</td>\n",
|
||
" <td>107.892502</td>\n",
|
||
" <td>108.937500</td>\n",
|
||
" <td>108.741440</td>\n",
|
||
" <td>308151200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-04</th>\n",
|
||
" <td>109.132500</td>\n",
|
||
" <td>110.790001</td>\n",
|
||
" <td>108.387497</td>\n",
|
||
" <td>109.665001</td>\n",
|
||
" <td>109.467628</td>\n",
|
||
" <td>173071600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-05</th>\n",
|
||
" <td>109.377502</td>\n",
|
||
" <td>110.392502</td>\n",
|
||
" <td>108.897499</td>\n",
|
||
" <td>110.062500</td>\n",
|
||
" <td>109.864410</td>\n",
|
||
" <td>121992000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-06</th>\n",
|
||
" <td>110.404999</td>\n",
|
||
" <td>114.412498</td>\n",
|
||
" <td>109.797501</td>\n",
|
||
" <td>113.902496</td>\n",
|
||
" <td>113.697502</td>\n",
|
||
" <td>202428800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-21</th>\n",
|
||
" <td>116.669998</td>\n",
|
||
" <td>118.709999</td>\n",
|
||
" <td>116.449997</td>\n",
|
||
" <td>116.870003</td>\n",
|
||
" <td>116.870003</td>\n",
|
||
" <td>89946000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-22</th>\n",
|
||
" <td>117.449997</td>\n",
|
||
" <td>118.040001</td>\n",
|
||
" <td>114.589996</td>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>101988000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-23</th>\n",
|
||
" <td>116.389999</td>\n",
|
||
" <td>116.550003</td>\n",
|
||
" <td>114.279999</td>\n",
|
||
" <td>115.040001</td>\n",
|
||
" <td>115.040001</td>\n",
|
||
" <td>82572600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-26</th>\n",
|
||
" <td>114.010002</td>\n",
|
||
" <td>116.550003</td>\n",
|
||
" <td>112.879997</td>\n",
|
||
" <td>115.050003</td>\n",
|
||
" <td>115.050003</td>\n",
|
||
" <td>111850700</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-27</th>\n",
|
||
" <td>115.489998</td>\n",
|
||
" <td>117.279999</td>\n",
|
||
" <td>114.540001</td>\n",
|
||
" <td>116.599998</td>\n",
|
||
" <td>116.599998</td>\n",
|
||
" <td>91927700</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>62 rows × 6 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Open High Low Close Adj Close \\\n",
|
||
"Date \n",
|
||
"2020-07-31 102.885002 106.415001 100.824997 106.260002 106.068756 \n",
|
||
"2020-08-03 108.199997 111.637497 107.892502 108.937500 108.741440 \n",
|
||
"2020-08-04 109.132500 110.790001 108.387497 109.665001 109.467628 \n",
|
||
"2020-08-05 109.377502 110.392502 108.897499 110.062500 109.864410 \n",
|
||
"2020-08-06 110.404999 114.412498 109.797501 113.902496 113.697502 \n",
|
||
"... ... ... ... ... ... \n",
|
||
"2020-10-21 116.669998 118.709999 116.449997 116.870003 116.870003 \n",
|
||
"2020-10-22 117.449997 118.040001 114.589996 115.750000 115.750000 \n",
|
||
"2020-10-23 116.389999 116.550003 114.279999 115.040001 115.040001 \n",
|
||
"2020-10-26 114.010002 116.550003 112.879997 115.050003 115.050003 \n",
|
||
"2020-10-27 115.489998 117.279999 114.540001 116.599998 116.599998 \n",
|
||
"\n",
|
||
" Volume \n",
|
||
"Date \n",
|
||
"2020-07-31 374336800 \n",
|
||
"2020-08-03 308151200 \n",
|
||
"2020-08-04 173071600 \n",
|
||
"2020-08-05 121992000 \n",
|
||
"2020-08-06 202428800 \n",
|
||
"... ... \n",
|
||
"2020-10-21 89946000 \n",
|
||
"2020-10-22 101988000 \n",
|
||
"2020-10-23 82572600 \n",
|
||
"2020-10-26 111850700 \n",
|
||
"2020-10-27 91927700 \n",
|
||
"\n",
|
||
"[62 rows x 6 columns]"
|
||
]
|
||
},
|
||
"execution_count": 34,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[aapl_proper_index[\"Open\"] > 100]"
|
||
]
|
||
},
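{
"cell_type": "markdown",
"id": "4a1f0c01",
"metadata": {},
"source": [
"The two cells above work together: the comparison returns a boolean Series with one `True`/`False` per row, and indexing the DataFrame with that Series keeps only the `True` rows. The next cell is a minimal sketch of the same idea on a tiny made-up frame; `demo`, `mask` and the values are hypothetical and not taken from the AAPL data."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4a1f0c02",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy data, used only to illustrate boolean-mask filtering\n",
"demo = pd.DataFrame(\n",
"    {\"Open\": [98.0, 101.5, 104.0], \"Volume\": [90, 120, 80]},\n",
"    index=pd.to_datetime([\"2020-07-29\", \"2020-07-30\", \"2020-07-31\"]),\n",
")\n",
"\n",
"mask = demo[\"Open\"] > 100  # boolean Series, one value per row\n",
"demo.loc[mask]             # same rows as demo[mask], spelled explicitly"
]
},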
|
||
{
"cell_type": "markdown",
"id": "d0ff9063",
"metadata": {},
"source": [
"### 4.2 Multiple conditions"
]
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 35,
|
||
"id": "1c5574e2",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Date\n",
|
||
"2019-10-28 False\n",
|
||
"2019-10-29 False\n",
|
||
"2019-10-30 False\n",
|
||
"2019-10-31 False\n",
|
||
"2019-11-01 False\n",
|
||
" ... \n",
|
||
"2020-10-21 False\n",
|
||
"2020-10-22 True\n",
|
||
"2020-10-23 False\n",
|
||
"2020-10-26 True\n",
|
||
"2020-10-27 False\n",
|
||
"Length: 253, dtype: bool"
|
||
]
|
||
},
|
||
"execution_count": 35,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
"source": [
"(aapl_proper_index[\"Open\"] > 100) & (aapl_proper_index[\"Volume\"] > 100_000_000)"
]
|
||
},
|
||
{
"cell_type": "code",
"execution_count": 36,
"id": "6ab72ebf",
"metadata": {},
"outputs": [],
"source": [
"# Keep the combined mask in a variable so it can be reused below.\n",
"# Each comparison needs its own parentheses because & binds tighter than >.\n",
"cond = (aapl_proper_index[\"Open\"] > 100) & (aapl_proper_index[\"Volume\"] > 100_000_000)"
]
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 37,
|
||
"id": "cfe5d612",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Open</th>\n",
|
||
" <th>High</th>\n",
|
||
" <th>Low</th>\n",
|
||
" <th>Close</th>\n",
|
||
" <th>Adj Close</th>\n",
|
||
" <th>Volume</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2020-07-31</th>\n",
|
||
" <td>102.885002</td>\n",
|
||
" <td>106.415001</td>\n",
|
||
" <td>100.824997</td>\n",
|
||
" <td>106.260002</td>\n",
|
||
" <td>106.068756</td>\n",
|
||
" <td>374336800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-03</th>\n",
|
||
" <td>108.199997</td>\n",
|
||
" <td>111.637497</td>\n",
|
||
" <td>107.892502</td>\n",
|
||
" <td>108.937500</td>\n",
|
||
" <td>108.741440</td>\n",
|
||
" <td>308151200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-04</th>\n",
|
||
" <td>109.132500</td>\n",
|
||
" <td>110.790001</td>\n",
|
||
" <td>108.387497</td>\n",
|
||
" <td>109.665001</td>\n",
|
||
" <td>109.467628</td>\n",
|
||
" <td>173071600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-05</th>\n",
|
||
" <td>109.377502</td>\n",
|
||
" <td>110.392502</td>\n",
|
||
" <td>108.897499</td>\n",
|
||
" <td>110.062500</td>\n",
|
||
" <td>109.864410</td>\n",
|
||
" <td>121992000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-06</th>\n",
|
||
" <td>110.404999</td>\n",
|
||
" <td>114.412498</td>\n",
|
||
" <td>109.797501</td>\n",
|
||
" <td>113.902496</td>\n",
|
||
" <td>113.697502</td>\n",
|
||
" <td>202428800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-07</th>\n",
|
||
" <td>113.205002</td>\n",
|
||
" <td>113.675003</td>\n",
|
||
" <td>110.292503</td>\n",
|
||
" <td>111.112503</td>\n",
|
||
" <td>111.112503</td>\n",
|
||
" <td>198045600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-10</th>\n",
|
||
" <td>112.599998</td>\n",
|
||
" <td>113.775002</td>\n",
|
||
" <td>110.000000</td>\n",
|
||
" <td>112.727501</td>\n",
|
||
" <td>112.727501</td>\n",
|
||
" <td>212403600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-11</th>\n",
|
||
" <td>111.970001</td>\n",
|
||
" <td>112.482498</td>\n",
|
||
" <td>109.107498</td>\n",
|
||
" <td>109.375000</td>\n",
|
||
" <td>109.375000</td>\n",
|
||
" <td>187902400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-12</th>\n",
|
||
" <td>110.497498</td>\n",
|
||
" <td>113.275002</td>\n",
|
||
" <td>110.297501</td>\n",
|
||
" <td>113.010002</td>\n",
|
||
" <td>113.010002</td>\n",
|
||
" <td>165944800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-13</th>\n",
|
||
" <td>114.430000</td>\n",
|
||
" <td>116.042503</td>\n",
|
||
" <td>113.927498</td>\n",
|
||
" <td>115.010002</td>\n",
|
||
" <td>115.010002</td>\n",
|
||
" <td>210082000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-14</th>\n",
|
||
" <td>114.830002</td>\n",
|
||
" <td>115.000000</td>\n",
|
||
" <td>113.044998</td>\n",
|
||
" <td>114.907501</td>\n",
|
||
" <td>114.907501</td>\n",
|
||
" <td>165565200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-17</th>\n",
|
||
" <td>116.062500</td>\n",
|
||
" <td>116.087502</td>\n",
|
||
" <td>113.962502</td>\n",
|
||
" <td>114.607498</td>\n",
|
||
" <td>114.607498</td>\n",
|
||
" <td>119561600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-18</th>\n",
|
||
" <td>114.352501</td>\n",
|
||
" <td>116.000000</td>\n",
|
||
" <td>114.007500</td>\n",
|
||
" <td>115.562500</td>\n",
|
||
" <td>115.562500</td>\n",
|
||
" <td>105633600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-19</th>\n",
|
||
" <td>115.982498</td>\n",
|
||
" <td>117.162498</td>\n",
|
||
" <td>115.610001</td>\n",
|
||
" <td>115.707497</td>\n",
|
||
" <td>115.707497</td>\n",
|
||
" <td>145538000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-20</th>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>118.392502</td>\n",
|
||
" <td>115.732498</td>\n",
|
||
" <td>118.275002</td>\n",
|
||
" <td>118.275002</td>\n",
|
||
" <td>126907200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-21</th>\n",
|
||
" <td>119.262497</td>\n",
|
||
" <td>124.867500</td>\n",
|
||
" <td>119.250000</td>\n",
|
||
" <td>124.370003</td>\n",
|
||
" <td>124.370003</td>\n",
|
||
" <td>338054800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-24</th>\n",
|
||
" <td>128.697495</td>\n",
|
||
" <td>128.785004</td>\n",
|
||
" <td>123.937500</td>\n",
|
||
" <td>125.857498</td>\n",
|
||
" <td>125.857498</td>\n",
|
||
" <td>345937600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-25</th>\n",
|
||
" <td>124.697502</td>\n",
|
||
" <td>125.180000</td>\n",
|
||
" <td>123.052498</td>\n",
|
||
" <td>124.824997</td>\n",
|
||
" <td>124.824997</td>\n",
|
||
" <td>211495600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-26</th>\n",
|
||
" <td>126.180000</td>\n",
|
||
" <td>126.992500</td>\n",
|
||
" <td>125.082497</td>\n",
|
||
" <td>126.522499</td>\n",
|
||
" <td>126.522499</td>\n",
|
||
" <td>163022400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-27</th>\n",
|
||
" <td>127.142502</td>\n",
|
||
" <td>127.485001</td>\n",
|
||
" <td>123.832497</td>\n",
|
||
" <td>125.010002</td>\n",
|
||
" <td>125.010002</td>\n",
|
||
" <td>155552400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-28</th>\n",
|
||
" <td>126.012497</td>\n",
|
||
" <td>126.442497</td>\n",
|
||
" <td>124.577499</td>\n",
|
||
" <td>124.807503</td>\n",
|
||
" <td>124.807503</td>\n",
|
||
" <td>187630000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-31</th>\n",
|
||
" <td>127.580002</td>\n",
|
||
" <td>131.000000</td>\n",
|
||
" <td>126.000000</td>\n",
|
||
" <td>129.039993</td>\n",
|
||
" <td>129.039993</td>\n",
|
||
" <td>225702700</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-01</th>\n",
|
||
" <td>132.759995</td>\n",
|
||
" <td>134.800003</td>\n",
|
||
" <td>130.529999</td>\n",
|
||
" <td>134.179993</td>\n",
|
||
" <td>134.179993</td>\n",
|
||
" <td>152470100</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-02</th>\n",
|
||
" <td>137.589996</td>\n",
|
||
" <td>137.979996</td>\n",
|
||
" <td>127.000000</td>\n",
|
||
" <td>131.399994</td>\n",
|
||
" <td>131.399994</td>\n",
|
||
" <td>200119000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-03</th>\n",
|
||
" <td>126.910004</td>\n",
|
||
" <td>128.839996</td>\n",
|
||
" <td>120.500000</td>\n",
|
||
" <td>120.879997</td>\n",
|
||
" <td>120.879997</td>\n",
|
||
" <td>257599600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-04</th>\n",
|
||
" <td>120.070000</td>\n",
|
||
" <td>123.699997</td>\n",
|
||
" <td>110.889999</td>\n",
|
||
" <td>120.959999</td>\n",
|
||
" <td>120.959999</td>\n",
|
||
" <td>332607200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-08</th>\n",
|
||
" <td>113.949997</td>\n",
|
||
" <td>118.989998</td>\n",
|
||
" <td>112.680000</td>\n",
|
||
" <td>112.820000</td>\n",
|
||
" <td>112.820000</td>\n",
|
||
" <td>231366600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-09</th>\n",
|
||
" <td>117.260002</td>\n",
|
||
" <td>119.139999</td>\n",
|
||
" <td>115.260002</td>\n",
|
||
" <td>117.320000</td>\n",
|
||
" <td>117.320000</td>\n",
|
||
" <td>176940500</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-10</th>\n",
|
||
" <td>120.360001</td>\n",
|
||
" <td>120.500000</td>\n",
|
||
" <td>112.500000</td>\n",
|
||
" <td>113.489998</td>\n",
|
||
" <td>113.489998</td>\n",
|
||
" <td>182274400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-11</th>\n",
|
||
" <td>114.570000</td>\n",
|
||
" <td>115.230003</td>\n",
|
||
" <td>110.000000</td>\n",
|
||
" <td>112.000000</td>\n",
|
||
" <td>112.000000</td>\n",
|
||
" <td>180860300</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-14</th>\n",
|
||
" <td>114.720001</td>\n",
|
||
" <td>115.930000</td>\n",
|
||
" <td>112.800003</td>\n",
|
||
" <td>115.360001</td>\n",
|
||
" <td>115.360001</td>\n",
|
||
" <td>140150100</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-15</th>\n",
|
||
" <td>118.330002</td>\n",
|
||
" <td>118.830002</td>\n",
|
||
" <td>113.610001</td>\n",
|
||
" <td>115.540001</td>\n",
|
||
" <td>115.540001</td>\n",
|
||
" <td>184642000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-16</th>\n",
|
||
" <td>115.230003</td>\n",
|
||
" <td>116.000000</td>\n",
|
||
" <td>112.040001</td>\n",
|
||
" <td>112.129997</td>\n",
|
||
" <td>112.129997</td>\n",
|
||
" <td>154679000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-17</th>\n",
|
||
" <td>109.720001</td>\n",
|
||
" <td>112.199997</td>\n",
|
||
" <td>108.709999</td>\n",
|
||
" <td>110.339996</td>\n",
|
||
" <td>110.339996</td>\n",
|
||
" <td>178011000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-18</th>\n",
|
||
" <td>110.400002</td>\n",
|
||
" <td>110.879997</td>\n",
|
||
" <td>106.089996</td>\n",
|
||
" <td>106.839996</td>\n",
|
||
" <td>106.839996</td>\n",
|
||
" <td>287104900</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-21</th>\n",
|
||
" <td>104.540001</td>\n",
|
||
" <td>110.190002</td>\n",
|
||
" <td>103.099998</td>\n",
|
||
" <td>110.080002</td>\n",
|
||
" <td>110.080002</td>\n",
|
||
" <td>195713800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-22</th>\n",
|
||
" <td>112.680000</td>\n",
|
||
" <td>112.860001</td>\n",
|
||
" <td>109.160004</td>\n",
|
||
" <td>111.809998</td>\n",
|
||
" <td>111.809998</td>\n",
|
||
" <td>183055400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-23</th>\n",
|
||
" <td>111.620003</td>\n",
|
||
" <td>112.110001</td>\n",
|
||
" <td>106.769997</td>\n",
|
||
" <td>107.120003</td>\n",
|
||
" <td>107.120003</td>\n",
|
||
" <td>150718700</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-24</th>\n",
|
||
" <td>105.169998</td>\n",
|
||
" <td>110.250000</td>\n",
|
||
" <td>105.000000</td>\n",
|
||
" <td>108.220001</td>\n",
|
||
" <td>108.220001</td>\n",
|
||
" <td>167743300</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-25</th>\n",
|
||
" <td>108.430000</td>\n",
|
||
" <td>112.440002</td>\n",
|
||
" <td>107.669998</td>\n",
|
||
" <td>112.279999</td>\n",
|
||
" <td>112.279999</td>\n",
|
||
" <td>149981400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-28</th>\n",
|
||
" <td>115.010002</td>\n",
|
||
" <td>115.320000</td>\n",
|
||
" <td>112.779999</td>\n",
|
||
" <td>114.959999</td>\n",
|
||
" <td>114.959999</td>\n",
|
||
" <td>137672400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-09-30</th>\n",
|
||
" <td>113.790001</td>\n",
|
||
" <td>117.260002</td>\n",
|
||
" <td>113.620003</td>\n",
|
||
" <td>115.809998</td>\n",
|
||
" <td>115.809998</td>\n",
|
||
" <td>142675200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-01</th>\n",
|
||
" <td>117.639999</td>\n",
|
||
" <td>117.720001</td>\n",
|
||
" <td>115.830002</td>\n",
|
||
" <td>116.790001</td>\n",
|
||
" <td>116.790001</td>\n",
|
||
" <td>116120400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-02</th>\n",
|
||
" <td>112.889999</td>\n",
|
||
" <td>115.370003</td>\n",
|
||
" <td>112.220001</td>\n",
|
||
" <td>113.019997</td>\n",
|
||
" <td>113.019997</td>\n",
|
||
" <td>144712000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-05</th>\n",
|
||
" <td>113.910004</td>\n",
|
||
" <td>116.650002</td>\n",
|
||
" <td>113.550003</td>\n",
|
||
" <td>116.500000</td>\n",
|
||
" <td>116.500000</td>\n",
|
||
" <td>106243800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-06</th>\n",
|
||
" <td>115.699997</td>\n",
|
||
" <td>116.120003</td>\n",
|
||
" <td>112.250000</td>\n",
|
||
" <td>113.160004</td>\n",
|
||
" <td>113.160004</td>\n",
|
||
" <td>161498200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-09</th>\n",
|
||
" <td>115.279999</td>\n",
|
||
" <td>117.000000</td>\n",
|
||
" <td>114.919998</td>\n",
|
||
" <td>116.970001</td>\n",
|
||
" <td>116.970001</td>\n",
|
||
" <td>100506900</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-12</th>\n",
|
||
" <td>120.059998</td>\n",
|
||
" <td>125.180000</td>\n",
|
||
" <td>119.279999</td>\n",
|
||
" <td>124.400002</td>\n",
|
||
" <td>124.400002</td>\n",
|
||
" <td>240226800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-13</th>\n",
|
||
" <td>125.269997</td>\n",
|
||
" <td>125.389999</td>\n",
|
||
" <td>119.650002</td>\n",
|
||
" <td>121.099998</td>\n",
|
||
" <td>121.099998</td>\n",
|
||
" <td>262330500</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-14</th>\n",
|
||
" <td>121.000000</td>\n",
|
||
" <td>123.029999</td>\n",
|
||
" <td>119.620003</td>\n",
|
||
" <td>121.190002</td>\n",
|
||
" <td>121.190002</td>\n",
|
||
" <td>151062300</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-15</th>\n",
|
||
" <td>118.720001</td>\n",
|
||
" <td>121.199997</td>\n",
|
||
" <td>118.150002</td>\n",
|
||
" <td>120.709999</td>\n",
|
||
" <td>120.709999</td>\n",
|
||
" <td>112559200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-16</th>\n",
|
||
" <td>121.279999</td>\n",
|
||
" <td>121.550003</td>\n",
|
||
" <td>118.809998</td>\n",
|
||
" <td>119.019997</td>\n",
|
||
" <td>119.019997</td>\n",
|
||
" <td>115393800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-19</th>\n",
|
||
" <td>119.959999</td>\n",
|
||
" <td>120.419998</td>\n",
|
||
" <td>115.660004</td>\n",
|
||
" <td>115.980003</td>\n",
|
||
" <td>115.980003</td>\n",
|
||
" <td>120639300</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-20</th>\n",
|
||
" <td>116.199997</td>\n",
|
||
" <td>118.980003</td>\n",
|
||
" <td>115.629997</td>\n",
|
||
" <td>117.510002</td>\n",
|
||
" <td>117.510002</td>\n",
|
||
" <td>124423700</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-22</th>\n",
|
||
" <td>117.449997</td>\n",
|
||
" <td>118.040001</td>\n",
|
||
" <td>114.589996</td>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>101988000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-26</th>\n",
|
||
" <td>114.010002</td>\n",
|
||
" <td>116.550003</td>\n",
|
||
" <td>112.879997</td>\n",
|
||
" <td>115.050003</td>\n",
|
||
" <td>115.050003</td>\n",
|
||
" <td>111850700</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Open High Low Close Adj Close \\\n",
|
||
"Date \n",
|
||
"2020-07-31 102.885002 106.415001 100.824997 106.260002 106.068756 \n",
|
||
"2020-08-03 108.199997 111.637497 107.892502 108.937500 108.741440 \n",
|
||
"2020-08-04 109.132500 110.790001 108.387497 109.665001 109.467628 \n",
|
||
"2020-08-05 109.377502 110.392502 108.897499 110.062500 109.864410 \n",
|
||
"2020-08-06 110.404999 114.412498 109.797501 113.902496 113.697502 \n",
|
||
"2020-08-07 113.205002 113.675003 110.292503 111.112503 111.112503 \n",
|
||
"2020-08-10 112.599998 113.775002 110.000000 112.727501 112.727501 \n",
|
||
"2020-08-11 111.970001 112.482498 109.107498 109.375000 109.375000 \n",
|
||
"2020-08-12 110.497498 113.275002 110.297501 113.010002 113.010002 \n",
|
||
"2020-08-13 114.430000 116.042503 113.927498 115.010002 115.010002 \n",
|
||
"2020-08-14 114.830002 115.000000 113.044998 114.907501 114.907501 \n",
|
||
"2020-08-17 116.062500 116.087502 113.962502 114.607498 114.607498 \n",
|
||
"2020-08-18 114.352501 116.000000 114.007500 115.562500 115.562500 \n",
|
||
"2020-08-19 115.982498 117.162498 115.610001 115.707497 115.707497 \n",
|
||
"2020-08-20 115.750000 118.392502 115.732498 118.275002 118.275002 \n",
|
||
"2020-08-21 119.262497 124.867500 119.250000 124.370003 124.370003 \n",
|
||
"2020-08-24 128.697495 128.785004 123.937500 125.857498 125.857498 \n",
|
||
"2020-08-25 124.697502 125.180000 123.052498 124.824997 124.824997 \n",
|
||
"2020-08-26 126.180000 126.992500 125.082497 126.522499 126.522499 \n",
|
||
"2020-08-27 127.142502 127.485001 123.832497 125.010002 125.010002 \n",
|
||
"2020-08-28 126.012497 126.442497 124.577499 124.807503 124.807503 \n",
|
||
"2020-08-31 127.580002 131.000000 126.000000 129.039993 129.039993 \n",
|
||
"2020-09-01 132.759995 134.800003 130.529999 134.179993 134.179993 \n",
|
||
"2020-09-02 137.589996 137.979996 127.000000 131.399994 131.399994 \n",
|
||
"2020-09-03 126.910004 128.839996 120.500000 120.879997 120.879997 \n",
|
||
"2020-09-04 120.070000 123.699997 110.889999 120.959999 120.959999 \n",
|
||
"2020-09-08 113.949997 118.989998 112.680000 112.820000 112.820000 \n",
|
||
"2020-09-09 117.260002 119.139999 115.260002 117.320000 117.320000 \n",
|
||
"2020-09-10 120.360001 120.500000 112.500000 113.489998 113.489998 \n",
|
||
"2020-09-11 114.570000 115.230003 110.000000 112.000000 112.000000 \n",
|
||
"2020-09-14 114.720001 115.930000 112.800003 115.360001 115.360001 \n",
|
||
"2020-09-15 118.330002 118.830002 113.610001 115.540001 115.540001 \n",
|
||
"2020-09-16 115.230003 116.000000 112.040001 112.129997 112.129997 \n",
|
||
"2020-09-17 109.720001 112.199997 108.709999 110.339996 110.339996 \n",
|
||
"2020-09-18 110.400002 110.879997 106.089996 106.839996 106.839996 \n",
|
||
"2020-09-21 104.540001 110.190002 103.099998 110.080002 110.080002 \n",
|
||
"2020-09-22 112.680000 112.860001 109.160004 111.809998 111.809998 \n",
|
||
"2020-09-23 111.620003 112.110001 106.769997 107.120003 107.120003 \n",
|
||
"2020-09-24 105.169998 110.250000 105.000000 108.220001 108.220001 \n",
|
||
"2020-09-25 108.430000 112.440002 107.669998 112.279999 112.279999 \n",
|
||
"2020-09-28 115.010002 115.320000 112.779999 114.959999 114.959999 \n",
|
||
"2020-09-30 113.790001 117.260002 113.620003 115.809998 115.809998 \n",
|
||
"2020-10-01 117.639999 117.720001 115.830002 116.790001 116.790001 \n",
|
||
"2020-10-02 112.889999 115.370003 112.220001 113.019997 113.019997 \n",
|
||
"2020-10-05 113.910004 116.650002 113.550003 116.500000 116.500000 \n",
|
||
"2020-10-06 115.699997 116.120003 112.250000 113.160004 113.160004 \n",
|
||
"2020-10-09 115.279999 117.000000 114.919998 116.970001 116.970001 \n",
|
||
"2020-10-12 120.059998 125.180000 119.279999 124.400002 124.400002 \n",
|
||
"2020-10-13 125.269997 125.389999 119.650002 121.099998 121.099998 \n",
|
||
"2020-10-14 121.000000 123.029999 119.620003 121.190002 121.190002 \n",
|
||
"2020-10-15 118.720001 121.199997 118.150002 120.709999 120.709999 \n",
|
||
"2020-10-16 121.279999 121.550003 118.809998 119.019997 119.019997 \n",
|
||
"2020-10-19 119.959999 120.419998 115.660004 115.980003 115.980003 \n",
|
||
"2020-10-20 116.199997 118.980003 115.629997 117.510002 117.510002 \n",
|
||
"2020-10-22 117.449997 118.040001 114.589996 115.750000 115.750000 \n",
|
||
"2020-10-26 114.010002 116.550003 112.879997 115.050003 115.050003 \n",
|
||
"\n",
|
||
" Volume \n",
|
||
"Date \n",
|
||
"2020-07-31 374336800 \n",
|
||
"2020-08-03 308151200 \n",
|
||
"2020-08-04 173071600 \n",
|
||
"2020-08-05 121992000 \n",
|
||
"2020-08-06 202428800 \n",
|
||
"2020-08-07 198045600 \n",
|
||
"2020-08-10 212403600 \n",
|
||
"2020-08-11 187902400 \n",
|
||
"2020-08-12 165944800 \n",
|
||
"2020-08-13 210082000 \n",
|
||
"2020-08-14 165565200 \n",
|
||
"2020-08-17 119561600 \n",
|
||
"2020-08-18 105633600 \n",
|
||
"2020-08-19 145538000 \n",
|
||
"2020-08-20 126907200 \n",
|
||
"2020-08-21 338054800 \n",
|
||
"2020-08-24 345937600 \n",
|
||
"2020-08-25 211495600 \n",
|
||
"2020-08-26 163022400 \n",
|
||
"2020-08-27 155552400 \n",
|
||
"2020-08-28 187630000 \n",
|
||
"2020-08-31 225702700 \n",
|
||
"2020-09-01 152470100 \n",
|
||
"2020-09-02 200119000 \n",
|
||
"2020-09-03 257599600 \n",
|
||
"2020-09-04 332607200 \n",
|
||
"2020-09-08 231366600 \n",
|
||
"2020-09-09 176940500 \n",
|
||
"2020-09-10 182274400 \n",
|
||
"2020-09-11 180860300 \n",
|
||
"2020-09-14 140150100 \n",
|
||
"2020-09-15 184642000 \n",
|
||
"2020-09-16 154679000 \n",
|
||
"2020-09-17 178011000 \n",
|
||
"2020-09-18 287104900 \n",
|
||
"2020-09-21 195713800 \n",
|
||
"2020-09-22 183055400 \n",
|
||
"2020-09-23 150718700 \n",
|
||
"2020-09-24 167743300 \n",
|
||
"2020-09-25 149981400 \n",
|
||
"2020-09-28 137672400 \n",
|
||
"2020-09-30 142675200 \n",
|
||
"2020-10-01 116120400 \n",
|
||
"2020-10-02 144712000 \n",
|
||
"2020-10-05 106243800 \n",
|
||
"2020-10-06 161498200 \n",
|
||
"2020-10-09 100506900 \n",
|
||
"2020-10-12 240226800 \n",
|
||
"2020-10-13 262330500 \n",
|
||
"2020-10-14 151062300 \n",
|
||
"2020-10-15 112559200 \n",
|
||
"2020-10-16 115393800 \n",
|
||
"2020-10-19 120639300 \n",
|
||
"2020-10-20 124423700 \n",
|
||
"2020-10-22 101988000 \n",
|
||
"2020-10-26 111850700 "
|
||
]
|
||
},
|
||
"execution_count": 37,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[cond]"
|
||
]
|
||
},
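{
"cell_type": "markdown",
"id": "4a1f0c03",
"metadata": {},
"source": [
"In the condition above, each comparison sits in its own parentheses because `&` binds more tightly than `>`; without them the expression is parsed differently and typically fails with a `ValueError`. `&` means AND, `|` means OR and `~` means NOT, each applied row by row (Python's `and`/`or`/`not` do not work on a Series). The next cell is a small sketch on a made-up frame; `demo` and its values are hypothetical and not taken from the AAPL data."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4a1f0c04",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy data, used only to illustrate combining conditions\n",
"demo = pd.DataFrame({\"Open\": [98.0, 101.5, 104.0], \"Volume\": [90, 120, 80]})\n",
"\n",
"both = (demo[\"Open\"] > 100) & (demo[\"Volume\"] > 100)    # AND: both must hold\n",
"either = (demo[\"Open\"] > 100) | (demo[\"Volume\"] > 100)  # OR: at least one holds\n",
"neither = ~either                                       # NOT: invert a mask\n",
"\n",
"demo[both]"
]
},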
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "649b6f86",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Side notes"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "599252b7",
|
||
"metadata": {},
|
||
"source": [
|
||
"Showing the top "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 38,
|
||
"id": "3b0d0293",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Open</th>\n",
|
||
" <th>High</th>\n",
|
||
" <th>Low</th>\n",
|
||
" <th>Close</th>\n",
|
||
" <th>Adj Close</th>\n",
|
||
" <th>Volume</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2020-07-31</th>\n",
|
||
" <td>102.885002</td>\n",
|
||
" <td>106.415001</td>\n",
|
||
" <td>100.824997</td>\n",
|
||
" <td>106.260002</td>\n",
|
||
" <td>106.068756</td>\n",
|
||
" <td>374336800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-03</th>\n",
|
||
" <td>108.199997</td>\n",
|
||
" <td>111.637497</td>\n",
|
||
" <td>107.892502</td>\n",
|
||
" <td>108.937500</td>\n",
|
||
" <td>108.741440</td>\n",
|
||
" <td>308151200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-04</th>\n",
|
||
" <td>109.132500</td>\n",
|
||
" <td>110.790001</td>\n",
|
||
" <td>108.387497</td>\n",
|
||
" <td>109.665001</td>\n",
|
||
" <td>109.467628</td>\n",
|
||
" <td>173071600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-05</th>\n",
|
||
" <td>109.377502</td>\n",
|
||
" <td>110.392502</td>\n",
|
||
" <td>108.897499</td>\n",
|
||
" <td>110.062500</td>\n",
|
||
" <td>109.864410</td>\n",
|
||
" <td>121992000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-06</th>\n",
|
||
" <td>110.404999</td>\n",
|
||
" <td>114.412498</td>\n",
|
||
" <td>109.797501</td>\n",
|
||
" <td>113.902496</td>\n",
|
||
" <td>113.697502</td>\n",
|
||
" <td>202428800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-07</th>\n",
|
||
" <td>113.205002</td>\n",
|
||
" <td>113.675003</td>\n",
|
||
" <td>110.292503</td>\n",
|
||
" <td>111.112503</td>\n",
|
||
" <td>111.112503</td>\n",
|
||
" <td>198045600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-10</th>\n",
|
||
" <td>112.599998</td>\n",
|
||
" <td>113.775002</td>\n",
|
||
" <td>110.000000</td>\n",
|
||
" <td>112.727501</td>\n",
|
||
" <td>112.727501</td>\n",
|
||
" <td>212403600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-11</th>\n",
|
||
" <td>111.970001</td>\n",
|
||
" <td>112.482498</td>\n",
|
||
" <td>109.107498</td>\n",
|
||
" <td>109.375000</td>\n",
|
||
" <td>109.375000</td>\n",
|
||
" <td>187902400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-12</th>\n",
|
||
" <td>110.497498</td>\n",
|
||
" <td>113.275002</td>\n",
|
||
" <td>110.297501</td>\n",
|
||
" <td>113.010002</td>\n",
|
||
" <td>113.010002</td>\n",
|
||
" <td>165944800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-13</th>\n",
|
||
" <td>114.430000</td>\n",
|
||
" <td>116.042503</td>\n",
|
||
" <td>113.927498</td>\n",
|
||
" <td>115.010002</td>\n",
|
||
" <td>115.010002</td>\n",
|
||
" <td>210082000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-14</th>\n",
|
||
" <td>114.830002</td>\n",
|
||
" <td>115.000000</td>\n",
|
||
" <td>113.044998</td>\n",
|
||
" <td>114.907501</td>\n",
|
||
" <td>114.907501</td>\n",
|
||
" <td>165565200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-17</th>\n",
|
||
" <td>116.062500</td>\n",
|
||
" <td>116.087502</td>\n",
|
||
" <td>113.962502</td>\n",
|
||
" <td>114.607498</td>\n",
|
||
" <td>114.607498</td>\n",
|
||
" <td>119561600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-18</th>\n",
|
||
" <td>114.352501</td>\n",
|
||
" <td>116.000000</td>\n",
|
||
" <td>114.007500</td>\n",
|
||
" <td>115.562500</td>\n",
|
||
" <td>115.562500</td>\n",
|
||
" <td>105633600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-19</th>\n",
|
||
" <td>115.982498</td>\n",
|
||
" <td>117.162498</td>\n",
|
||
" <td>115.610001</td>\n",
|
||
" <td>115.707497</td>\n",
|
||
" <td>115.707497</td>\n",
|
||
" <td>145538000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-20</th>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>118.392502</td>\n",
|
||
" <td>115.732498</td>\n",
|
||
" <td>118.275002</td>\n",
|
||
" <td>118.275002</td>\n",
|
||
" <td>126907200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-21</th>\n",
|
||
" <td>119.262497</td>\n",
|
||
" <td>124.867500</td>\n",
|
||
" <td>119.250000</td>\n",
|
||
" <td>124.370003</td>\n",
|
||
" <td>124.370003</td>\n",
|
||
" <td>338054800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-24</th>\n",
|
||
" <td>128.697495</td>\n",
|
||
" <td>128.785004</td>\n",
|
||
" <td>123.937500</td>\n",
|
||
" <td>125.857498</td>\n",
|
||
" <td>125.857498</td>\n",
|
||
" <td>345937600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-25</th>\n",
|
||
" <td>124.697502</td>\n",
|
||
" <td>125.180000</td>\n",
|
||
" <td>123.052498</td>\n",
|
||
" <td>124.824997</td>\n",
|
||
" <td>124.824997</td>\n",
|
||
" <td>211495600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-26</th>\n",
|
||
" <td>126.180000</td>\n",
|
||
" <td>126.992500</td>\n",
|
||
" <td>125.082497</td>\n",
|
||
" <td>126.522499</td>\n",
|
||
" <td>126.522499</td>\n",
|
||
" <td>163022400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-27</th>\n",
|
||
" <td>127.142502</td>\n",
|
||
" <td>127.485001</td>\n",
|
||
" <td>123.832497</td>\n",
|
||
" <td>125.010002</td>\n",
|
||
" <td>125.010002</td>\n",
|
||
" <td>155552400</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Open High Low Close Adj Close \\\n",
|
||
"Date \n",
|
||
"2020-07-31 102.885002 106.415001 100.824997 106.260002 106.068756 \n",
|
||
"2020-08-03 108.199997 111.637497 107.892502 108.937500 108.741440 \n",
|
||
"2020-08-04 109.132500 110.790001 108.387497 109.665001 109.467628 \n",
|
||
"2020-08-05 109.377502 110.392502 108.897499 110.062500 109.864410 \n",
|
||
"2020-08-06 110.404999 114.412498 109.797501 113.902496 113.697502 \n",
|
||
"2020-08-07 113.205002 113.675003 110.292503 111.112503 111.112503 \n",
|
||
"2020-08-10 112.599998 113.775002 110.000000 112.727501 112.727501 \n",
|
||
"2020-08-11 111.970001 112.482498 109.107498 109.375000 109.375000 \n",
|
||
"2020-08-12 110.497498 113.275002 110.297501 113.010002 113.010002 \n",
|
||
"2020-08-13 114.430000 116.042503 113.927498 115.010002 115.010002 \n",
|
||
"2020-08-14 114.830002 115.000000 113.044998 114.907501 114.907501 \n",
|
||
"2020-08-17 116.062500 116.087502 113.962502 114.607498 114.607498 \n",
|
||
"2020-08-18 114.352501 116.000000 114.007500 115.562500 115.562500 \n",
|
||
"2020-08-19 115.982498 117.162498 115.610001 115.707497 115.707497 \n",
|
||
"2020-08-20 115.750000 118.392502 115.732498 118.275002 118.275002 \n",
|
||
"2020-08-21 119.262497 124.867500 119.250000 124.370003 124.370003 \n",
|
||
"2020-08-24 128.697495 128.785004 123.937500 125.857498 125.857498 \n",
|
||
"2020-08-25 124.697502 125.180000 123.052498 124.824997 124.824997 \n",
|
||
"2020-08-26 126.180000 126.992500 125.082497 126.522499 126.522499 \n",
|
||
"2020-08-27 127.142502 127.485001 123.832497 125.010002 125.010002 \n",
|
||
"\n",
|
||
" Volume \n",
|
||
"Date \n",
|
||
"2020-07-31 374336800 \n",
|
||
"2020-08-03 308151200 \n",
|
||
"2020-08-04 173071600 \n",
|
||
"2020-08-05 121992000 \n",
|
||
"2020-08-06 202428800 \n",
|
||
"2020-08-07 198045600 \n",
|
||
"2020-08-10 212403600 \n",
|
||
"2020-08-11 187902400 \n",
|
||
"2020-08-12 165944800 \n",
|
||
"2020-08-13 210082000 \n",
|
||
"2020-08-14 165565200 \n",
|
||
"2020-08-17 119561600 \n",
|
||
"2020-08-18 105633600 \n",
|
||
"2020-08-19 145538000 \n",
|
||
"2020-08-20 126907200 \n",
|
||
"2020-08-21 338054800 \n",
|
||
"2020-08-24 345937600 \n",
|
||
"2020-08-25 211495600 \n",
|
||
"2020-08-26 163022400 \n",
|
||
"2020-08-27 155552400 "
|
||
]
|
||
},
|
||
"execution_count": 38,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[cond].head(20)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 39,
|
||
"id": "d4bd5fc0",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Open</th>\n",
|
||
" <th>High</th>\n",
|
||
" <th>Low</th>\n",
|
||
" <th>Close</th>\n",
|
||
" <th>Adj Close</th>\n",
|
||
" <th>Volume</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-16</th>\n",
|
||
" <td>121.279999</td>\n",
|
||
" <td>121.550003</td>\n",
|
||
" <td>118.809998</td>\n",
|
||
" <td>119.019997</td>\n",
|
||
" <td>119.019997</td>\n",
|
||
" <td>115393800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-19</th>\n",
|
||
" <td>119.959999</td>\n",
|
||
" <td>120.419998</td>\n",
|
||
" <td>115.660004</td>\n",
|
||
" <td>115.980003</td>\n",
|
||
" <td>115.980003</td>\n",
|
||
" <td>120639300</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-20</th>\n",
|
||
" <td>116.199997</td>\n",
|
||
" <td>118.980003</td>\n",
|
||
" <td>115.629997</td>\n",
|
||
" <td>117.510002</td>\n",
|
||
" <td>117.510002</td>\n",
|
||
" <td>124423700</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-22</th>\n",
|
||
" <td>117.449997</td>\n",
|
||
" <td>118.040001</td>\n",
|
||
" <td>114.589996</td>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>101988000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-26</th>\n",
|
||
" <td>114.010002</td>\n",
|
||
" <td>116.550003</td>\n",
|
||
" <td>112.879997</td>\n",
|
||
" <td>115.050003</td>\n",
|
||
" <td>115.050003</td>\n",
|
||
" <td>111850700</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Open High Low Close Adj Close \\\n",
|
||
"Date \n",
|
||
"2020-10-16 121.279999 121.550003 118.809998 119.019997 119.019997 \n",
|
||
"2020-10-19 119.959999 120.419998 115.660004 115.980003 115.980003 \n",
|
||
"2020-10-20 116.199997 118.980003 115.629997 117.510002 117.510002 \n",
|
||
"2020-10-22 117.449997 118.040001 114.589996 115.750000 115.750000 \n",
|
||
"2020-10-26 114.010002 116.550003 112.879997 115.050003 115.050003 \n",
|
||
"\n",
|
||
" Volume \n",
|
||
"Date \n",
|
||
"2020-10-16 115393800 \n",
|
||
"2020-10-19 120639300 \n",
|
||
"2020-10-20 124423700 \n",
|
||
"2020-10-22 101988000 \n",
|
||
"2020-10-26 111850700 "
|
||
]
|
||
},
|
||
"execution_count": 39,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[cond].tail(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "e7fcccd2",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 4.3 query"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 40,
|
||
"id": "f866b899",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Open</th>\n",
|
||
" <th>High</th>\n",
|
||
" <th>Low</th>\n",
|
||
" <th>Close</th>\n",
|
||
" <th>Adj Close</th>\n",
|
||
" <th>Volume</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2020-07-31</th>\n",
|
||
" <td>102.885002</td>\n",
|
||
" <td>106.415001</td>\n",
|
||
" <td>100.824997</td>\n",
|
||
" <td>106.260002</td>\n",
|
||
" <td>106.068756</td>\n",
|
||
" <td>374336800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-03</th>\n",
|
||
" <td>108.199997</td>\n",
|
||
" <td>111.637497</td>\n",
|
||
" <td>107.892502</td>\n",
|
||
" <td>108.937500</td>\n",
|
||
" <td>108.741440</td>\n",
|
||
" <td>308151200</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-04</th>\n",
|
||
" <td>109.132500</td>\n",
|
||
" <td>110.790001</td>\n",
|
||
" <td>108.387497</td>\n",
|
||
" <td>109.665001</td>\n",
|
||
" <td>109.467628</td>\n",
|
||
" <td>173071600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-05</th>\n",
|
||
" <td>109.377502</td>\n",
|
||
" <td>110.392502</td>\n",
|
||
" <td>108.897499</td>\n",
|
||
" <td>110.062500</td>\n",
|
||
" <td>109.864410</td>\n",
|
||
" <td>121992000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-06</th>\n",
|
||
" <td>110.404999</td>\n",
|
||
" <td>114.412498</td>\n",
|
||
" <td>109.797501</td>\n",
|
||
" <td>113.902496</td>\n",
|
||
" <td>113.697502</td>\n",
|
||
" <td>202428800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-07</th>\n",
|
||
" <td>113.205002</td>\n",
|
||
" <td>113.675003</td>\n",
|
||
" <td>110.292503</td>\n",
|
||
" <td>111.112503</td>\n",
|
||
" <td>111.112503</td>\n",
|
||
" <td>198045600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-10</th>\n",
|
||
" <td>112.599998</td>\n",
|
||
" <td>113.775002</td>\n",
|
||
" <td>110.000000</td>\n",
|
||
" <td>112.727501</td>\n",
|
||
" <td>112.727501</td>\n",
|
||
" <td>212403600</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-11</th>\n",
|
||
" <td>111.970001</td>\n",
|
||
" <td>112.482498</td>\n",
|
||
" <td>109.107498</td>\n",
|
||
" <td>109.375000</td>\n",
|
||
" <td>109.375000</td>\n",
|
||
" <td>187902400</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-12</th>\n",
|
||
" <td>110.497498</td>\n",
|
||
" <td>113.275002</td>\n",
|
||
" <td>110.297501</td>\n",
|
||
" <td>113.010002</td>\n",
|
||
" <td>113.010002</td>\n",
|
||
" <td>165944800</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-08-13</th>\n",
|
||
" <td>114.430000</td>\n",
|
||
" <td>116.042503</td>\n",
|
||
" <td>113.927498</td>\n",
|
||
" <td>115.010002</td>\n",
|
||
" <td>115.010002</td>\n",
|
||
" <td>210082000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Open High Low Close Adj Close \\\n",
|
||
"Date \n",
|
||
"2020-07-31 102.885002 106.415001 100.824997 106.260002 106.068756 \n",
|
||
"2020-08-03 108.199997 111.637497 107.892502 108.937500 108.741440 \n",
|
||
"2020-08-04 109.132500 110.790001 108.387497 109.665001 109.467628 \n",
|
||
"2020-08-05 109.377502 110.392502 108.897499 110.062500 109.864410 \n",
|
||
"2020-08-06 110.404999 114.412498 109.797501 113.902496 113.697502 \n",
|
||
"2020-08-07 113.205002 113.675003 110.292503 111.112503 111.112503 \n",
|
||
"2020-08-10 112.599998 113.775002 110.000000 112.727501 112.727501 \n",
|
||
"2020-08-11 111.970001 112.482498 109.107498 109.375000 109.375000 \n",
|
||
"2020-08-12 110.497498 113.275002 110.297501 113.010002 113.010002 \n",
|
||
"2020-08-13 114.430000 116.042503 113.927498 115.010002 115.010002 \n",
|
||
"\n",
|
||
" Volume \n",
|
||
"Date \n",
|
||
"2020-07-31 374336800 \n",
|
||
"2020-08-03 308151200 \n",
|
||
"2020-08-04 173071600 \n",
|
||
"2020-08-05 121992000 \n",
|
||
"2020-08-06 202428800 \n",
|
||
"2020-08-07 198045600 \n",
|
||
"2020-08-10 212403600 \n",
|
||
"2020-08-11 187902400 \n",
|
||
"2020-08-12 165944800 \n",
|
||
"2020-08-13 210082000 "
|
||
]
|
||
},
|
||
"execution_count": 40,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[cond].query(\"Open > 100 and Volume > 110000000\").head(10)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "932bccc7",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 5. New columns"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "023d022d",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 5.1 Density Example"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 41,
|
||
"id": "83ebf03c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>area</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>California</th>\n",
|
||
" <td>38332521</td>\n",
|
||
" <td>423967</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Texas</th>\n",
|
||
" <td>26448193</td>\n",
|
||
" <td>695662</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>New York</th>\n",
|
||
" <td>19651127</td>\n",
|
||
" <td>141297</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Florida</th>\n",
|
||
" <td>19552860</td>\n",
|
||
" <td>170312</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Illinois</th>\n",
|
||
" <td>12882135</td>\n",
|
||
" <td>149995</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" population area\n",
|
||
"California 38332521 423967\n",
|
||
"Texas 26448193 695662\n",
|
||
"New York 19651127 141297\n",
|
||
"Florida 19552860 170312\n",
|
||
"Illinois 12882135 149995"
|
||
]
|
||
},
|
||
"execution_count": 41,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"\n",
|
||
"population = pd.Series({'California': 38332521,\n",
|
||
" 'Texas': 26448193,\n",
|
||
" 'New York': 19651127,\n",
|
||
" 'Florida': 19552860,\n",
|
||
" 'Illinois': 12882135}\n",
|
||
")\n",
|
||
"\n",
|
||
"area = pd.Series({'California': 423967, \n",
|
||
" 'Texas': 695662, \n",
|
||
" 'New York': 141297,\n",
|
||
" 'Florida': 170312, \n",
|
||
" 'Illinois': 149995})\n",
|
||
"\n",
|
||
"states = pd.DataFrame( {'population': population,'area': area} )\n",
|
||
"states"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 42,
|
||
"id": "278f4672",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"states[\"density\"] = states[\"population\"] / states[\"area\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 43,
|
||
"id": "005b3ac5",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>area</th>\n",
|
||
" <th>density</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>California</th>\n",
|
||
" <td>38332521</td>\n",
|
||
" <td>423967</td>\n",
|
||
" <td>90.413926</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Texas</th>\n",
|
||
" <td>26448193</td>\n",
|
||
" <td>695662</td>\n",
|
||
" <td>38.018740</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>New York</th>\n",
|
||
" <td>19651127</td>\n",
|
||
" <td>141297</td>\n",
|
||
" <td>139.076746</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Florida</th>\n",
|
||
" <td>19552860</td>\n",
|
||
" <td>170312</td>\n",
|
||
" <td>114.806121</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Illinois</th>\n",
|
||
" <td>12882135</td>\n",
|
||
" <td>149995</td>\n",
|
||
" <td>85.883763</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" population area density\n",
|
||
"California 38332521 423967 90.413926\n",
|
||
"Texas 26448193 695662 38.018740\n",
|
||
"New York 19651127 141297 139.076746\n",
|
||
"Florida 19552860 170312 114.806121\n",
|
||
"Illinois 12882135 149995 85.883763"
|
||
]
|
||
},
|
||
"execution_count": 43,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"states"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "8526db19",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 5.2 Stocks example"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 44,
|
||
"id": "59008d49",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"aapl_proper_index[\"Percent Changes\"] = aapl_proper_index[\"Close\"].pct_change()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 45,
|
||
"id": "3115a2c6",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Open</th>\n",
|
||
" <th>High</th>\n",
|
||
" <th>Low</th>\n",
|
||
" <th>Close</th>\n",
|
||
" <th>Adj Close</th>\n",
|
||
" <th>Volume</th>\n",
|
||
" <th>Percent Changes</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-28</th>\n",
|
||
" <td>61.855000</td>\n",
|
||
" <td>62.312500</td>\n",
|
||
" <td>61.680000</td>\n",
|
||
" <td>62.262501</td>\n",
|
||
" <td>61.650810</td>\n",
|
||
" <td>96572800</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-29</th>\n",
|
||
" <td>62.242500</td>\n",
|
||
" <td>62.437500</td>\n",
|
||
" <td>60.642502</td>\n",
|
||
" <td>60.822498</td>\n",
|
||
" <td>60.224953</td>\n",
|
||
" <td>142839600</td>\n",
|
||
" <td>-0.023128</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-30</th>\n",
|
||
" <td>61.189999</td>\n",
|
||
" <td>61.325001</td>\n",
|
||
" <td>60.302502</td>\n",
|
||
" <td>60.814999</td>\n",
|
||
" <td>60.217525</td>\n",
|
||
" <td>124522000</td>\n",
|
||
" <td>-0.000123</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-10-31</th>\n",
|
||
" <td>61.810001</td>\n",
|
||
" <td>62.292500</td>\n",
|
||
" <td>59.314999</td>\n",
|
||
" <td>62.189999</td>\n",
|
||
" <td>61.579021</td>\n",
|
||
" <td>139162000</td>\n",
|
||
" <td>0.022610</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2019-11-01</th>\n",
|
||
" <td>62.384998</td>\n",
|
||
" <td>63.982498</td>\n",
|
||
" <td>62.290001</td>\n",
|
||
" <td>63.955002</td>\n",
|
||
" <td>63.326683</td>\n",
|
||
" <td>151125200</td>\n",
|
||
" <td>0.028381</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-21</th>\n",
|
||
" <td>116.669998</td>\n",
|
||
" <td>118.709999</td>\n",
|
||
" <td>116.449997</td>\n",
|
||
" <td>116.870003</td>\n",
|
||
" <td>116.870003</td>\n",
|
||
" <td>89946000</td>\n",
|
||
" <td>-0.005446</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-22</th>\n",
|
||
" <td>117.449997</td>\n",
|
||
" <td>118.040001</td>\n",
|
||
" <td>114.589996</td>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>115.750000</td>\n",
|
||
" <td>101988000</td>\n",
|
||
" <td>-0.009583</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-23</th>\n",
|
||
" <td>116.389999</td>\n",
|
||
" <td>116.550003</td>\n",
|
||
" <td>114.279999</td>\n",
|
||
" <td>115.040001</td>\n",
|
||
" <td>115.040001</td>\n",
|
||
" <td>82572600</td>\n",
|
||
" <td>-0.006134</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-26</th>\n",
|
||
" <td>114.010002</td>\n",
|
||
" <td>116.550003</td>\n",
|
||
" <td>112.879997</td>\n",
|
||
" <td>115.050003</td>\n",
|
||
" <td>115.050003</td>\n",
|
||
" <td>111850700</td>\n",
|
||
" <td>0.000087</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2020-10-27</th>\n",
|
||
" <td>115.489998</td>\n",
|
||
" <td>117.279999</td>\n",
|
||
" <td>114.540001</td>\n",
|
||
" <td>116.599998</td>\n",
|
||
" <td>116.599998</td>\n",
|
||
" <td>91927700</td>\n",
|
||
" <td>0.013472</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>253 rows × 7 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Open High Low Close Adj Close \\\n",
|
||
"Date \n",
|
||
"2019-10-28 61.855000 62.312500 61.680000 62.262501 61.650810 \n",
|
||
"2019-10-29 62.242500 62.437500 60.642502 60.822498 60.224953 \n",
|
||
"2019-10-30 61.189999 61.325001 60.302502 60.814999 60.217525 \n",
|
||
"2019-10-31 61.810001 62.292500 59.314999 62.189999 61.579021 \n",
|
||
"2019-11-01 62.384998 63.982498 62.290001 63.955002 63.326683 \n",
|
||
"... ... ... ... ... ... \n",
|
||
"2020-10-21 116.669998 118.709999 116.449997 116.870003 116.870003 \n",
|
||
"2020-10-22 117.449997 118.040001 114.589996 115.750000 115.750000 \n",
|
||
"2020-10-23 116.389999 116.550003 114.279999 115.040001 115.040001 \n",
|
||
"2020-10-26 114.010002 116.550003 112.879997 115.050003 115.050003 \n",
|
||
"2020-10-27 115.489998 117.279999 114.540001 116.599998 116.599998 \n",
|
||
"\n",
|
||
" Volume Percent Changes \n",
|
||
"Date \n",
|
||
"2019-10-28 96572800 NaN \n",
|
||
"2019-10-29 142839600 -0.023128 \n",
|
||
"2019-10-30 124522000 -0.000123 \n",
|
||
"2019-10-31 139162000 0.022610 \n",
|
||
"2019-11-01 151125200 0.028381 \n",
|
||
"... ... ... \n",
|
||
"2020-10-21 89946000 -0.005446 \n",
|
||
"2020-10-22 101988000 -0.009583 \n",
|
||
"2020-10-23 82572600 -0.006134 \n",
|
||
"2020-10-26 111850700 0.000087 \n",
|
||
"2020-10-27 91927700 0.013472 \n",
|
||
"\n",
|
||
"[253 rows x 7 columns]"
|
||
]
|
||
},
|
||
"execution_count": 45,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "3279df01",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 6. Aggregation"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "9dec0caa",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 6.1 Basic operations"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 55,
|
||
"id": "876e6b85",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0.0028956964634767705"
|
||
]
|
||
},
|
||
"execution_count": 55,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[\"Percent Changes\"].mean()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 51,
|
||
"id": "27c153e5",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0.11980826040056836"
|
||
]
|
||
},
|
||
"execution_count": 51,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[\"Percent Changes\"].max()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 52,
|
||
"id": "23e031ff",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"-0.12864694751232164"
|
||
]
|
||
},
|
||
"execution_count": 52,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[\"Percent Changes\"].min()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 56,
|
||
"id": "92849318",
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0.0024045071214521263"
|
||
]
|
||
},
|
||
"execution_count": 56,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[\"Percent Changes\"].median()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 53,
|
||
"id": "e6f50326",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0.01985402460478214"
|
||
]
|
||
},
|
||
"execution_count": 53,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[aapl_proper_index[\"Percent Changes\"] > 0][\"Percent Changes\"].mean()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 57,
|
||
"id": "a3d0e96b",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"-0.01881547236798305"
|
||
]
|
||
},
|
||
"execution_count": 57,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"aapl_proper_index[aapl_proper_index[\"Percent Changes\"] < 0][\"Percent Changes\"].mean()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "fd7781d1",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 6.2 Grouping"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "64ec7f60",
|
||
"metadata": {},
|
||
"source": [
|
||
"We use planets discovery data as an example for grouping"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 96,
|
||
"id": "26b8b374",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"(1035, 6)"
|
||
]
|
||
},
|
||
"execution_count": 96,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"import seaborn as sns\n",
|
||
"planets = sns.load_dataset('planets')\n",
|
||
"planets.shape"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 97,
|
||
"id": "3c8426a5",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>method</th>\n",
|
||
" <th>number</th>\n",
|
||
" <th>orbital_period</th>\n",
|
||
" <th>mass</th>\n",
|
||
" <th>distance</th>\n",
|
||
" <th>year</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Radial Velocity</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>269.300000</td>\n",
|
||
" <td>7.10</td>\n",
|
||
" <td>77.40</td>\n",
|
||
" <td>2006</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Radial Velocity</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>874.774000</td>\n",
|
||
" <td>2.21</td>\n",
|
||
" <td>56.95</td>\n",
|
||
" <td>2008</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Radial Velocity</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>763.000000</td>\n",
|
||
" <td>2.60</td>\n",
|
||
" <td>19.84</td>\n",
|
||
" <td>2011</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Radial Velocity</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>326.030000</td>\n",
|
||
" <td>19.40</td>\n",
|
||
" <td>110.62</td>\n",
|
||
" <td>2007</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>Radial Velocity</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>516.220000</td>\n",
|
||
" <td>10.50</td>\n",
|
||
" <td>119.47</td>\n",
|
||
" <td>2009</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1030</th>\n",
|
||
" <td>Transit</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>3.941507</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>172.00</td>\n",
|
||
" <td>2006</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1031</th>\n",
|
||
" <td>Transit</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>2.615864</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>148.00</td>\n",
|
||
" <td>2007</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1032</th>\n",
|
||
" <td>Transit</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>3.191524</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>174.00</td>\n",
|
||
" <td>2007</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1033</th>\n",
|
||
" <td>Transit</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>4.125083</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>293.00</td>\n",
|
||
" <td>2008</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1034</th>\n",
|
||
" <td>Transit</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>4.187757</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>260.00</td>\n",
|
||
" <td>2008</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1035 rows × 6 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" method number orbital_period mass distance year\n",
|
||
"0 Radial Velocity 1 269.300000 7.10 77.40 2006\n",
|
||
"1 Radial Velocity 1 874.774000 2.21 56.95 2008\n",
|
||
"2 Radial Velocity 1 763.000000 2.60 19.84 2011\n",
|
||
"3 Radial Velocity 1 326.030000 19.40 110.62 2007\n",
|
||
"4 Radial Velocity 1 516.220000 10.50 119.47 2009\n",
|
||
"... ... ... ... ... ... ...\n",
|
||
"1030 Transit 1 3.941507 NaN 172.00 2006\n",
|
||
"1031 Transit 1 2.615864 NaN 148.00 2007\n",
|
||
"1032 Transit 1 3.191524 NaN 174.00 2007\n",
|
||
"1033 Transit 1 4.125083 NaN 293.00 2008\n",
|
||
"1034 Transit 1 4.187757 NaN 260.00 2008\n",
|
||
"\n",
|
||
"[1035 rows x 6 columns]"
|
||
]
|
||
},
|
||
"execution_count": 97,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"planets"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 98,
|
||
"id": "9872086c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"method\n",
|
||
"Astrometry 631.180000\n",
|
||
"Eclipse Timing Variations 4343.500000\n",
|
||
"Imaging 27500.000000\n",
|
||
"Microlensing 3300.000000\n",
|
||
"Orbital Brightness Modulation 0.342887\n",
|
||
"Pulsar Timing 66.541900\n",
|
||
"Pulsation Timing Variations 1170.000000\n",
|
||
"Radial Velocity 360.200000\n",
|
||
"Transit 5.714932\n",
|
||
"Transit Timing Variations 57.011000\n",
|
||
"Name: orbital_period, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 98,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"planets.groupby('method')['orbital_period'].median()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 101,
|
||
"id": "8399a771",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>count</th>\n",
|
||
" <th>mean</th>\n",
|
||
" <th>std</th>\n",
|
||
" <th>min</th>\n",
|
||
" <th>25%</th>\n",
|
||
" <th>50%</th>\n",
|
||
" <th>75%</th>\n",
|
||
" <th>max</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>method</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>Astrometry</th>\n",
|
||
" <td>2.0</td>\n",
|
||
" <td>631.180000</td>\n",
|
||
" <td>544.217663</td>\n",
|
||
" <td>246.360000</td>\n",
|
||
" <td>438.770000</td>\n",
|
||
" <td>631.180000</td>\n",
|
||
" <td>823.590000</td>\n",
|
||
" <td>1016.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Eclipse Timing Variations</th>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>4751.644444</td>\n",
|
||
" <td>2499.130945</td>\n",
|
||
" <td>1916.250000</td>\n",
|
||
" <td>2900.000000</td>\n",
|
||
" <td>4343.500000</td>\n",
|
||
" <td>5767.000000</td>\n",
|
||
" <td>10220.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Imaging</th>\n",
|
||
" <td>12.0</td>\n",
|
||
" <td>118247.737500</td>\n",
|
||
" <td>213978.177277</td>\n",
|
||
" <td>4639.150000</td>\n",
|
||
" <td>8343.900000</td>\n",
|
||
" <td>27500.000000</td>\n",
|
||
" <td>94250.000000</td>\n",
|
||
" <td>730000.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Microlensing</th>\n",
|
||
" <td>7.0</td>\n",
|
||
" <td>3153.571429</td>\n",
|
||
" <td>1113.166333</td>\n",
|
||
" <td>1825.000000</td>\n",
|
||
" <td>2375.000000</td>\n",
|
||
" <td>3300.000000</td>\n",
|
||
" <td>3550.000000</td>\n",
|
||
" <td>5100.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Orbital Brightness Modulation</th>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>0.709307</td>\n",
|
||
" <td>0.725493</td>\n",
|
||
" <td>0.240104</td>\n",
|
||
" <td>0.291496</td>\n",
|
||
" <td>0.342887</td>\n",
|
||
" <td>0.943908</td>\n",
|
||
" <td>1.544929</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Pulsar Timing</th>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>7343.021201</td>\n",
|
||
" <td>16313.265573</td>\n",
|
||
" <td>0.090706</td>\n",
|
||
" <td>25.262000</td>\n",
|
||
" <td>66.541900</td>\n",
|
||
" <td>98.211400</td>\n",
|
||
" <td>36525.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Pulsation Timing Variations</th>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>1170.000000</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>1170.000000</td>\n",
|
||
" <td>1170.000000</td>\n",
|
||
" <td>1170.000000</td>\n",
|
||
" <td>1170.000000</td>\n",
|
||
" <td>1170.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Radial Velocity</th>\n",
|
||
" <td>553.0</td>\n",
|
||
" <td>823.354680</td>\n",
|
||
" <td>1454.926210</td>\n",
|
||
" <td>0.736540</td>\n",
|
||
" <td>38.021000</td>\n",
|
||
" <td>360.200000</td>\n",
|
||
" <td>982.000000</td>\n",
|
||
" <td>17337.500000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Transit</th>\n",
|
||
" <td>397.0</td>\n",
|
||
" <td>21.102073</td>\n",
|
||
" <td>46.185893</td>\n",
|
||
" <td>0.355000</td>\n",
|
||
" <td>3.160630</td>\n",
|
||
" <td>5.714932</td>\n",
|
||
" <td>16.145700</td>\n",
|
||
" <td>331.600590</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Transit Timing Variations</th>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>79.783500</td>\n",
|
||
" <td>71.599884</td>\n",
|
||
" <td>22.339500</td>\n",
|
||
" <td>39.675250</td>\n",
|
||
" <td>57.011000</td>\n",
|
||
" <td>108.505500</td>\n",
|
||
" <td>160.000000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" count mean std \\\n",
|
||
"method \n",
|
||
"Astrometry 2.0 631.180000 544.217663 \n",
|
||
"Eclipse Timing Variations 9.0 4751.644444 2499.130945 \n",
|
||
"Imaging 12.0 118247.737500 213978.177277 \n",
|
||
"Microlensing 7.0 3153.571429 1113.166333 \n",
|
||
"Orbital Brightness Modulation 3.0 0.709307 0.725493 \n",
|
||
"Pulsar Timing 5.0 7343.021201 16313.265573 \n",
|
||
"Pulsation Timing Variations 1.0 1170.000000 NaN \n",
|
||
"Radial Velocity 553.0 823.354680 1454.926210 \n",
|
||
"Transit 397.0 21.102073 46.185893 \n",
|
||
"Transit Timing Variations 3.0 79.783500 71.599884 \n",
|
||
"\n",
|
||
" min 25% 50% \\\n",
|
||
"method \n",
|
||
"Astrometry 246.360000 438.770000 631.180000 \n",
|
||
"Eclipse Timing Variations 1916.250000 2900.000000 4343.500000 \n",
|
||
"Imaging 4639.150000 8343.900000 27500.000000 \n",
|
||
"Microlensing 1825.000000 2375.000000 3300.000000 \n",
|
||
"Orbital Brightness Modulation 0.240104 0.291496 0.342887 \n",
|
||
"Pulsar Timing 0.090706 25.262000 66.541900 \n",
|
||
"Pulsation Timing Variations 1170.000000 1170.000000 1170.000000 \n",
|
||
"Radial Velocity 0.736540 38.021000 360.200000 \n",
|
||
"Transit 0.355000 3.160630 5.714932 \n",
|
||
"Transit Timing Variations 22.339500 39.675250 57.011000 \n",
|
||
"\n",
|
||
" 75% max \n",
|
||
"method \n",
|
||
"Astrometry 823.590000 1016.000000 \n",
|
||
"Eclipse Timing Variations 5767.000000 10220.000000 \n",
|
||
"Imaging 94250.000000 730000.000000 \n",
|
||
"Microlensing 3550.000000 5100.000000 \n",
|
||
"Orbital Brightness Modulation 0.943908 1.544929 \n",
|
||
"Pulsar Timing 98.211400 36525.000000 \n",
|
||
"Pulsation Timing Variations 1170.000000 1170.000000 \n",
|
||
"Radial Velocity 982.000000 17337.500000 \n",
|
||
"Transit 16.145700 331.600590 \n",
|
||
"Transit Timing Variations 108.505500 160.000000 "
|
||
]
|
||
},
|
||
"execution_count": 101,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"planets.groupby('method')['orbital_period'].describe()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 100,
|
||
"id": "4cd98382",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"method\n",
|
||
"Astrometry 2\n",
|
||
"Eclipse Timing Variations 9\n",
|
||
"Imaging 38\n",
|
||
"Microlensing 23\n",
|
||
"Orbital Brightness Modulation 3\n",
|
||
"Pulsar Timing 5\n",
|
||
"Pulsation Timing Variations 1\n",
|
||
"Radial Velocity 553\n",
|
||
"Transit 397\n",
|
||
"Transit Timing Variations 4\n",
|
||
"Name: number, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 100,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"planets.groupby('method')[\"number\"].count()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "1430d995",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 7. Joining Data"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "c3abca64",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 7.1 Merge (or join)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 106,
|
||
"id": "de4cab14",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>employee</th>\n",
|
||
" <th>group</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Bob</td>\n",
|
||
" <td>Accounting</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Jake</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Lisa</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Sue</td>\n",
|
||
" <td>HR</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" employee group\n",
|
||
"0 Bob Accounting\n",
|
||
"1 Jake Engineering\n",
|
||
"2 Lisa Engineering\n",
|
||
"3 Sue HR"
|
||
]
|
||
},
|
||
"execution_count": 106,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"department = pd.DataFrame({'employee': ['Bob', 'Jake', 'Lisa', 'Sue'],\n",
|
||
" 'group': ['Accounting', 'Engineering', 'Engineering', 'HR']})\n",
|
||
"\n",
|
||
"department"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 107,
|
||
"id": "6301c1f2",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>employee</th>\n",
|
||
" <th>hire_date</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Lisa</td>\n",
|
||
" <td>2004</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Bob</td>\n",
|
||
" <td>2008</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Jake</td>\n",
|
||
" <td>2012</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Sue</td>\n",
|
||
" <td>2014</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" employee hire_date\n",
|
||
"0 Lisa 2004\n",
|
||
"1 Bob 2008\n",
|
||
"2 Jake 2012\n",
|
||
"3 Sue 2014"
|
||
]
|
||
},
|
||
"execution_count": 107,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"hire_date = pd.DataFrame({'employee': ['Lisa', 'Bob', 'Jake', 'Sue'],\n",
|
||
" 'hire_date': [2004, 2008, 2012, 2014]})\n",
|
||
"\n",
|
||
"hire_date"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 111,
|
||
"id": "7a4f923b",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>employee</th>\n",
|
||
" <th>group</th>\n",
|
||
" <th>hire_date</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Bob</td>\n",
|
||
" <td>Accounting</td>\n",
|
||
" <td>2008</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Jake</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>2012</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Lisa</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>2004</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Sue</td>\n",
|
||
" <td>HR</td>\n",
|
||
" <td>2014</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" employee group hire_date\n",
|
||
"0 Bob Accounting 2008\n",
|
||
"1 Jake Engineering 2012\n",
|
||
"2 Lisa Engineering 2004\n",
|
||
"3 Sue HR 2014"
|
||
]
|
||
},
|
||
"execution_count": 111,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"employee = pd.merge(department, hire_date)\n",
|
||
"employee"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 112,
|
||
"id": "0bab79da",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>employee</th>\n",
|
||
" <th>group</th>\n",
|
||
" <th>hire_date</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Bob</td>\n",
|
||
" <td>Accounting</td>\n",
|
||
" <td>2008</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Jake</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>2012</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Lisa</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>2004</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Sue</td>\n",
|
||
" <td>HR</td>\n",
|
||
" <td>2014</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" employee group hire_date\n",
|
||
"0 Bob Accounting 2008\n",
|
||
"1 Jake Engineering 2012\n",
|
||
"2 Lisa Engineering 2004\n",
|
||
"3 Sue HR 2014"
|
||
]
|
||
},
|
||
"execution_count": 112,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"employee = pd.merge(department, hire_date, on=\"employee\")\n",
|
||
"employee"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 113,
|
||
"id": "29d1d5e0",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>name</th>\n",
|
||
" <th>salary</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Bob</td>\n",
|
||
" <td>70000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Jake</td>\n",
|
||
" <td>80000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Lisa</td>\n",
|
||
" <td>120000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Sue</td>\n",
|
||
" <td>90000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" name salary\n",
|
||
"0 Bob 70000\n",
|
||
"1 Jake 80000\n",
|
||
"2 Lisa 120000\n",
|
||
"3 Sue 90000"
|
||
]
|
||
},
|
||
"execution_count": 113,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"salary = pd.DataFrame({'name': ['Bob', 'Jake', 'Lisa', 'Sue'],\n",
|
||
" 'salary': [70000, 80000, 120000, 90000]})\n",
|
||
"salary"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 118,
|
||
"id": "200d57d0",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>employee</th>\n",
|
||
" <th>group</th>\n",
|
||
" <th>name</th>\n",
|
||
" <th>salary</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Bob</td>\n",
|
||
" <td>Accounting</td>\n",
|
||
" <td>Bob</td>\n",
|
||
" <td>70000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Jake</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>Jake</td>\n",
|
||
" <td>80000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Lisa</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>Lisa</td>\n",
|
||
" <td>120000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Sue</td>\n",
|
||
" <td>HR</td>\n",
|
||
" <td>Sue</td>\n",
|
||
" <td>90000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" employee group name salary\n",
|
||
"0 Bob Accounting Bob 70000\n",
|
||
"1 Jake Engineering Jake 80000\n",
|
||
"2 Lisa Engineering Lisa 120000\n",
|
||
"3 Sue HR Sue 90000"
|
||
]
|
||
},
|
||
"execution_count": 118,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"employee = pd.merge(department, salary, left_on=\"employee\", right_on=\"name\")\n",
|
||
"employee"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 119,
|
||
"id": "cd98a441",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>employee</th>\n",
|
||
" <th>group</th>\n",
|
||
" <th>salary</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Bob</td>\n",
|
||
" <td>Accounting</td>\n",
|
||
" <td>70000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Jake</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>80000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Lisa</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>120000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Sue</td>\n",
|
||
" <td>HR</td>\n",
|
||
" <td>90000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" employee group salary\n",
|
||
"0 Bob Accounting 70000\n",
|
||
"1 Jake Engineering 80000\n",
|
||
"2 Lisa Engineering 120000\n",
|
||
"3 Sue HR 90000"
|
||
]
|
||
},
|
||
"execution_count": 119,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"employee = employee.drop('name',axis=1)\n",
|
||
"employee"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "3762163c",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 7.2 one to many merging"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 121,
|
||
"id": "b1aa1315",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>group</th>\n",
|
||
" <th>supervisor</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Accounting</td>\n",
|
||
" <td>Carly</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>Guido</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>HR</td>\n",
|
||
" <td>Steve</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" group supervisor\n",
|
||
"0 Accounting Carly\n",
|
||
"1 Engineering Guido\n",
|
||
"2 HR Steve"
|
||
]
|
||
},
|
||
"execution_count": 121,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"supervisor = pd.DataFrame({'group': ['Accounting', 'Engineering', 'HR'],\n",
|
||
" 'supervisor': ['Carly', 'Guido', 'Steve']})\n",
|
||
"supervisor"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 126,
|
||
"id": "b01c9052",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>employee</th>\n",
|
||
" <th>group</th>\n",
|
||
" <th>salary</th>\n",
|
||
" <th>supervisor</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Bob</td>\n",
|
||
" <td>Accounting</td>\n",
|
||
" <td>70000</td>\n",
|
||
" <td>Carly</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Jake</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>80000</td>\n",
|
||
" <td>Guido</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Lisa</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>120000</td>\n",
|
||
" <td>Guido</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Sue</td>\n",
|
||
" <td>HR</td>\n",
|
||
" <td>90000</td>\n",
|
||
" <td>Steve</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" employee group salary supervisor\n",
|
||
"0 Bob Accounting 70000 Carly\n",
|
||
"1 Jake Engineering 80000 Guido\n",
|
||
"2 Lisa Engineering 120000 Guido\n",
|
||
"3 Sue HR 90000 Steve"
|
||
]
|
||
},
|
||
"execution_count": 126,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.merge(employee,supervisor)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "211fc106",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 7.3 Many to Many merging"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 128,
|
||
"id": "87fb2052",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"skills = pd.DataFrame({'group': ['Accounting', 'Accounting','Engineering', \n",
|
||
" 'Engineering', 'HR', 'HR'],\n",
|
||
" 'skills': ['math', 'spreadsheets', 'coding', \n",
|
||
" 'linux','spreadsheets', 'organization']})"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 129,
|
||
"id": "e1d7ebc4",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>employee</th>\n",
|
||
" <th>group</th>\n",
|
||
" <th>salary</th>\n",
|
||
" <th>skills</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Bob</td>\n",
|
||
" <td>Accounting</td>\n",
|
||
" <td>70000</td>\n",
|
||
" <td>math</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Bob</td>\n",
|
||
" <td>Accounting</td>\n",
|
||
" <td>70000</td>\n",
|
||
" <td>spreadsheets</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Jake</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>80000</td>\n",
|
||
" <td>coding</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Jake</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>80000</td>\n",
|
||
" <td>linux</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>Lisa</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>120000</td>\n",
|
||
" <td>coding</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>Lisa</td>\n",
|
||
" <td>Engineering</td>\n",
|
||
" <td>120000</td>\n",
|
||
" <td>linux</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>Sue</td>\n",
|
||
" <td>HR</td>\n",
|
||
" <td>90000</td>\n",
|
||
" <td>spreadsheets</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>Sue</td>\n",
|
||
" <td>HR</td>\n",
|
||
" <td>90000</td>\n",
|
||
" <td>organization</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" employee group salary skills\n",
|
||
"0 Bob Accounting 70000 math\n",
|
||
"1 Bob Accounting 70000 spreadsheets\n",
|
||
"2 Jake Engineering 80000 coding\n",
|
||
"3 Jake Engineering 80000 linux\n",
|
||
"4 Lisa Engineering 120000 coding\n",
|
||
"5 Lisa Engineering 120000 linux\n",
|
||
"6 Sue HR 90000 spreadsheets\n",
|
||
"7 Sue HR 90000 organization"
|
||
]
|
||
},
|
||
"execution_count": 129,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.merge(employee,skills)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "ec89a9e8",
|
||
"metadata": {},
|
||
"source": [
|
||
"It's a very strange set of data. Make sure you know how to use it for many-to-many merging"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "1d976a18",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 7.4 Inner Join / Outer Join / Left Join / Right Join"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 132,
|
||
"id": "1c7edc09",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>name</th>\n",
|
||
" <th>food</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Peter</td>\n",
|
||
" <td>fish</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Paul</td>\n",
|
||
" <td>beans</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Mary</td>\n",
|
||
" <td>bread</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" name food\n",
|
||
"0 Peter fish\n",
|
||
"1 Paul beans\n",
|
||
"2 Mary bread"
|
||
]
|
||
},
|
||
"execution_count": 132,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"fav_food = pd.DataFrame({'name': ['Peter', 'Paul', 'Mary'],\n",
|
||
" 'food': ['fish', 'beans', 'bread']},\n",
|
||
" columns=['name', 'food'])\n",
|
||
"fav_food"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 134,
|
||
"id": "74aa124b",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>name</th>\n",
|
||
" <th>drink</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Mary</td>\n",
|
||
" <td>wine</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Joseph</td>\n",
|
||
" <td>beer</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" name drink\n",
|
||
"0 Mary wine\n",
|
||
"1 Joseph beer"
|
||
]
|
||
},
|
||
"execution_count": 134,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"fav_drink = pd.DataFrame({'name': ['Mary', 'Joseph'],\n",
|
||
" 'drink': ['wine', 'beer']},\n",
|
||
" columns=['name', 'drink'])\n",
|
||
"fav_drink"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 135,
|
||
"id": "48886a2f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>name</th>\n",
|
||
" <th>food</th>\n",
|
||
" <th>drink</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Mary</td>\n",
|
||
" <td>bread</td>\n",
|
||
" <td>wine</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" name food drink\n",
|
||
"0 Mary bread wine"
|
||
]
|
||
},
|
||
"execution_count": 135,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.merge(fav_food, fav_drink)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 136,
|
||
"id": "d5e4c0e9",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>name</th>\n",
|
||
" <th>food</th>\n",
|
||
" <th>drink</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Peter</td>\n",
|
||
" <td>fish</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Paul</td>\n",
|
||
" <td>beans</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Mary</td>\n",
|
||
" <td>bread</td>\n",
|
||
" <td>wine</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Joseph</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>beer</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" name food drink\n",
|
||
"0 Peter fish NaN\n",
|
||
"1 Paul beans NaN\n",
|
||
"2 Mary bread wine\n",
|
||
"3 Joseph NaN beer"
|
||
]
|
||
},
|
||
"execution_count": 136,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.merge(fav_food, fav_drink, how=\"outer\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 137,
|
||
"id": "84ab9843",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>name</th>\n",
|
||
" <th>food</th>\n",
|
||
" <th>drink</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Peter</td>\n",
|
||
" <td>fish</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Paul</td>\n",
|
||
" <td>beans</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Mary</td>\n",
|
||
" <td>bread</td>\n",
|
||
" <td>wine</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" name food drink\n",
|
||
"0 Peter fish NaN\n",
|
||
"1 Paul beans NaN\n",
|
||
"2 Mary bread wine"
|
||
]
|
||
},
|
||
"execution_count": 137,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.merge(fav_food, fav_drink, how=\"left\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 138,
|
||
"id": "3d11be90",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>name</th>\n",
|
||
" <th>food</th>\n",
|
||
" <th>drink</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Mary</td>\n",
|
||
" <td>bread</td>\n",
|
||
" <td>wine</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Joseph</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>beer</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" name food drink\n",
|
||
"0 Mary bread wine\n",
|
||
"1 Joseph NaN beer"
|
||
]
|
||
},
|
||
"execution_count": 138,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.merge(fav_food, fav_drink, how=\"right\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d8e16b5b",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 8. Handling Missing Data"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 71,
|
||
"id": "e782f9ea",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"hibor = pd.read_csv(\"hibor.csv\", parse_dates=True, index_col='date')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 72,
|
||
"id": "04401df8",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>overnight</th>\n",
|
||
" <th>1 week</th>\n",
|
||
" <th>2 weeks</th>\n",
|
||
" <th>1 months</th>\n",
|
||
" <th>2 months</th>\n",
|
||
" <th>3 months</th>\n",
|
||
" <th>6 months</th>\n",
|
||
" <th>12 months</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-01</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-02</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-03</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-04</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11893</td>\n",
|
||
" <td>0.15679</td>\n",
|
||
" <td>0.31571</td>\n",
|
||
" <td>0.71429</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-05</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11000</td>\n",
|
||
" <td>0.15000</td>\n",
|
||
" <td>0.29929</td>\n",
|
||
" <td>0.68929</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-06</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04900</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.08000</td>\n",
|
||
" <td>0.11000</td>\n",
|
||
" <td>0.14000</td>\n",
|
||
" <td>0.28000</td>\n",
|
||
" <td>0.66857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-07</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-08</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-09</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-10</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-11</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06036</td>\n",
|
||
" <td>0.09000</td>\n",
|
||
" <td>0.12000</td>\n",
|
||
" <td>0.24000</td>\n",
|
||
" <td>0.57000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" overnight 1 week 2 weeks 1 months 2 months 3 months \\\n",
|
||
"date \n",
|
||
"2010-01-01 NaN NaN NaN NaN NaN NaN \n",
|
||
"2010-01-02 NaN NaN NaN NaN NaN NaN \n",
|
||
"2010-01-03 NaN NaN NaN NaN NaN NaN \n",
|
||
"2010-01-04 0.03 0.04971 0.05000 0.07964 0.11893 0.15679 \n",
|
||
"2010-01-05 0.03 0.04971 0.05000 0.07964 0.11000 0.15000 \n",
|
||
"2010-01-06 0.03 0.04900 0.04971 0.08000 0.11000 0.14000 \n",
|
||
"2010-01-07 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-08 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-09 NaN NaN NaN NaN NaN NaN \n",
|
||
"2010-01-10 NaN NaN NaN NaN NaN NaN \n",
|
||
"2010-01-11 0.03 0.04971 0.04971 0.06036 0.09000 0.12000 \n",
|
||
"\n",
|
||
" 6 months 12 months \n",
|
||
"date \n",
|
||
"2010-01-01 NaN NaN \n",
|
||
"2010-01-02 NaN NaN \n",
|
||
"2010-01-03 NaN NaN \n",
|
||
"2010-01-04 0.31571 0.71429 \n",
|
||
"2010-01-05 0.29929 0.68929 \n",
|
||
"2010-01-06 0.28000 0.66857 \n",
|
||
"2010-01-07 0.26000 0.62857 \n",
|
||
"2010-01-08 0.26000 0.62857 \n",
|
||
"2010-01-09 NaN NaN \n",
|
||
"2010-01-10 NaN NaN \n",
|
||
"2010-01-11 0.24000 0.57000 "
|
||
]
|
||
},
|
||
"execution_count": 72,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"hibor"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a880cff4",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 8.1 Check missing data"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 77,
|
||
"id": "6fc86cdc",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>overnight</th>\n",
|
||
" <th>1 week</th>\n",
|
||
" <th>2 weeks</th>\n",
|
||
" <th>1 months</th>\n",
|
||
" <th>2 months</th>\n",
|
||
" <th>3 months</th>\n",
|
||
" <th>6 months</th>\n",
|
||
" <th>12 months</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-01</th>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-02</th>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-03</th>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-04</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-05</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-06</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-07</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-08</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-09</th>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-10</th>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-11</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" overnight 1 week 2 weeks 1 months 2 months 3 months \\\n",
|
||
"date \n",
|
||
"2010-01-01 True True True True True True \n",
|
||
"2010-01-02 True True True True True True \n",
|
||
"2010-01-03 True True True True True True \n",
|
||
"2010-01-04 False False False False False False \n",
|
||
"2010-01-05 False False False False False False \n",
|
||
"2010-01-06 False False False False False False \n",
|
||
"2010-01-07 False False False False False False \n",
|
||
"2010-01-08 False False False False False False \n",
|
||
"2010-01-09 True True True True True True \n",
|
||
"2010-01-10 True True True True True True \n",
|
||
"2010-01-11 False False False False False False \n",
|
||
"\n",
|
||
" 6 months 12 months \n",
|
||
"date \n",
|
||
"2010-01-01 True True \n",
|
||
"2010-01-02 True True \n",
|
||
"2010-01-03 True True \n",
|
||
"2010-01-04 False False \n",
|
||
"2010-01-05 False False \n",
|
||
"2010-01-06 False False \n",
|
||
"2010-01-07 False False \n",
|
||
"2010-01-08 False False \n",
|
||
"2010-01-09 True True \n",
|
||
"2010-01-10 True True \n",
|
||
"2010-01-11 False False "
|
||
]
|
||
},
|
||
"execution_count": 77,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"hibor.isnull()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 79,
|
||
"id": "84d0ac6f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"True"
|
||
]
|
||
},
|
||
"execution_count": 79,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"hibor.isnull().values.any()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 80,
|
||
"id": "fc5a7223",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"True"
|
||
]
|
||
},
|
||
"execution_count": 80,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"hibor[\"overnight\"].isnull().values.any()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 84,
|
||
"id": "87d7e12f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"False 6\n",
|
||
"True 5\n",
|
||
"Name: overnight, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 84,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"hibor[\"overnight\"].isnull().value_counts()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 85,
|
||
"id": "d3acd78f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"date\n",
|
||
"2010-01-01 NaN\n",
|
||
"2010-01-02 NaN\n",
|
||
"2010-01-03 NaN\n",
|
||
"2010-01-09 NaN\n",
|
||
"2010-01-10 NaN\n",
|
||
"Name: overnight, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 85,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"hibor[\"overnight\"][hibor[\"overnight\"].isnull()]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "27237840",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 8.2 Drop Data"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 87,
|
||
"id": "f550b706",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>overnight</th>\n",
|
||
" <th>1 week</th>\n",
|
||
" <th>2 weeks</th>\n",
|
||
" <th>1 months</th>\n",
|
||
" <th>2 months</th>\n",
|
||
" <th>3 months</th>\n",
|
||
" <th>6 months</th>\n",
|
||
" <th>12 months</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-04</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11893</td>\n",
|
||
" <td>0.15679</td>\n",
|
||
" <td>0.31571</td>\n",
|
||
" <td>0.71429</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-05</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11000</td>\n",
|
||
" <td>0.15000</td>\n",
|
||
" <td>0.29929</td>\n",
|
||
" <td>0.68929</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-06</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04900</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.08000</td>\n",
|
||
" <td>0.11000</td>\n",
|
||
" <td>0.14000</td>\n",
|
||
" <td>0.28000</td>\n",
|
||
" <td>0.66857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-07</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-08</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-11</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06036</td>\n",
|
||
" <td>0.09000</td>\n",
|
||
" <td>0.12000</td>\n",
|
||
" <td>0.24000</td>\n",
|
||
" <td>0.57000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" overnight 1 week 2 weeks 1 months 2 months 3 months \\\n",
|
||
"date \n",
|
||
"2010-01-04 0.03 0.04971 0.05000 0.07964 0.11893 0.15679 \n",
|
||
"2010-01-05 0.03 0.04971 0.05000 0.07964 0.11000 0.15000 \n",
|
||
"2010-01-06 0.03 0.04900 0.04971 0.08000 0.11000 0.14000 \n",
|
||
"2010-01-07 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-08 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-11 0.03 0.04971 0.04971 0.06036 0.09000 0.12000 \n",
|
||
"\n",
|
||
" 6 months 12 months \n",
|
||
"date \n",
|
||
"2010-01-04 0.31571 0.71429 \n",
|
||
"2010-01-05 0.29929 0.68929 \n",
|
||
"2010-01-06 0.28000 0.66857 \n",
|
||
"2010-01-07 0.26000 0.62857 \n",
|
||
"2010-01-08 0.26000 0.62857 \n",
|
||
"2010-01-11 0.24000 0.57000 "
|
||
]
|
||
},
|
||
"execution_count": 87,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"hibor.dropna()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "3f0d7ba1",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 8.3 Fill with specific values"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "4033679b",
|
||
"metadata": {},
|
||
"source": [
|
||
"Notes: Just show as an example. Does not make sense in this scenario"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 90,
|
||
"id": "0ac6a858",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>overnight</th>\n",
|
||
" <th>1 week</th>\n",
|
||
" <th>2 weeks</th>\n",
|
||
" <th>1 months</th>\n",
|
||
" <th>2 months</th>\n",
|
||
" <th>3 months</th>\n",
|
||
" <th>6 months</th>\n",
|
||
" <th>12 months</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-01</th>\n",
|
||
" <td>0.00</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-02</th>\n",
|
||
" <td>0.00</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-03</th>\n",
|
||
" <td>0.00</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-04</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11893</td>\n",
|
||
" <td>0.15679</td>\n",
|
||
" <td>0.31571</td>\n",
|
||
" <td>0.71429</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-05</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11000</td>\n",
|
||
" <td>0.15000</td>\n",
|
||
" <td>0.29929</td>\n",
|
||
" <td>0.68929</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-06</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04900</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.08000</td>\n",
|
||
" <td>0.11000</td>\n",
|
||
" <td>0.14000</td>\n",
|
||
" <td>0.28000</td>\n",
|
||
" <td>0.66857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-07</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-08</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-09</th>\n",
|
||
" <td>0.00</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-10</th>\n",
|
||
" <td>0.00</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" <td>0.00000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-11</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06036</td>\n",
|
||
" <td>0.09000</td>\n",
|
||
" <td>0.12000</td>\n",
|
||
" <td>0.24000</td>\n",
|
||
" <td>0.57000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" overnight 1 week 2 weeks 1 months 2 months 3 months \\\n",
|
||
"date \n",
|
||
"2010-01-01 0.00 0.00000 0.00000 0.00000 0.00000 0.00000 \n",
|
||
"2010-01-02 0.00 0.00000 0.00000 0.00000 0.00000 0.00000 \n",
|
||
"2010-01-03 0.00 0.00000 0.00000 0.00000 0.00000 0.00000 \n",
|
||
"2010-01-04 0.03 0.04971 0.05000 0.07964 0.11893 0.15679 \n",
|
||
"2010-01-05 0.03 0.04971 0.05000 0.07964 0.11000 0.15000 \n",
|
||
"2010-01-06 0.03 0.04900 0.04971 0.08000 0.11000 0.14000 \n",
|
||
"2010-01-07 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-08 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-09 0.00 0.00000 0.00000 0.00000 0.00000 0.00000 \n",
|
||
"2010-01-10 0.00 0.00000 0.00000 0.00000 0.00000 0.00000 \n",
|
||
"2010-01-11 0.03 0.04971 0.04971 0.06036 0.09000 0.12000 \n",
|
||
"\n",
|
||
" 6 months 12 months \n",
|
||
"date \n",
|
||
"2010-01-01 0.00000 0.00000 \n",
|
||
"2010-01-02 0.00000 0.00000 \n",
|
||
"2010-01-03 0.00000 0.00000 \n",
|
||
"2010-01-04 0.31571 0.71429 \n",
|
||
"2010-01-05 0.29929 0.68929 \n",
|
||
"2010-01-06 0.28000 0.66857 \n",
|
||
"2010-01-07 0.26000 0.62857 \n",
|
||
"2010-01-08 0.26000 0.62857 \n",
|
||
"2010-01-09 0.00000 0.00000 \n",
|
||
"2010-01-10 0.00000 0.00000 \n",
|
||
"2010-01-11 0.24000 0.57000 "
|
||
]
|
||
},
|
||
"execution_count": 90,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"hibor.fillna(0)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d03e428d",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 8.4 Fill with previous values (i.e. forward fill)\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 92,
|
||
"id": "0576a4c1",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>overnight</th>\n",
|
||
" <th>1 week</th>\n",
|
||
" <th>2 weeks</th>\n",
|
||
" <th>1 months</th>\n",
|
||
" <th>2 months</th>\n",
|
||
" <th>3 months</th>\n",
|
||
" <th>6 months</th>\n",
|
||
" <th>12 months</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-01</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-02</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-03</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-04</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11893</td>\n",
|
||
" <td>0.15679</td>\n",
|
||
" <td>0.31571</td>\n",
|
||
" <td>0.71429</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-05</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11000</td>\n",
|
||
" <td>0.15000</td>\n",
|
||
" <td>0.29929</td>\n",
|
||
" <td>0.68929</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-06</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04900</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.08000</td>\n",
|
||
" <td>0.11000</td>\n",
|
||
" <td>0.14000</td>\n",
|
||
" <td>0.28000</td>\n",
|
||
" <td>0.66857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-07</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-08</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-09</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-10</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-11</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06036</td>\n",
|
||
" <td>0.09000</td>\n",
|
||
" <td>0.12000</td>\n",
|
||
" <td>0.24000</td>\n",
|
||
" <td>0.57000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" overnight 1 week 2 weeks 1 months 2 months 3 months \\\n",
|
||
"date \n",
|
||
"2010-01-01 NaN NaN NaN NaN NaN NaN \n",
|
||
"2010-01-02 NaN NaN NaN NaN NaN NaN \n",
|
||
"2010-01-03 NaN NaN NaN NaN NaN NaN \n",
|
||
"2010-01-04 0.03 0.04971 0.05000 0.07964 0.11893 0.15679 \n",
|
||
"2010-01-05 0.03 0.04971 0.05000 0.07964 0.11000 0.15000 \n",
|
||
"2010-01-06 0.03 0.04900 0.04971 0.08000 0.11000 0.14000 \n",
|
||
"2010-01-07 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-08 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-09 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-10 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-11 0.03 0.04971 0.04971 0.06036 0.09000 0.12000 \n",
|
||
"\n",
|
||
" 6 months 12 months \n",
|
||
"date \n",
|
||
"2010-01-01 NaN NaN \n",
|
||
"2010-01-02 NaN NaN \n",
|
||
"2010-01-03 NaN NaN \n",
|
||
"2010-01-04 0.31571 0.71429 \n",
|
||
"2010-01-05 0.29929 0.68929 \n",
|
||
"2010-01-06 0.28000 0.66857 \n",
|
||
"2010-01-07 0.26000 0.62857 \n",
|
||
"2010-01-08 0.26000 0.62857 \n",
|
||
"2010-01-09 0.26000 0.62857 \n",
|
||
"2010-01-10 0.26000 0.62857 \n",
|
||
"2010-01-11 0.24000 0.57000 "
|
||
]
|
||
},
|
||
"execution_count": 92,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"hibor.fillna(method='ffill')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d916df12",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 8.5 Fill with next values (i.e. back fill)\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "aff4fd55",
|
||
"metadata": {},
|
||
"source": [
|
||
"Remark: may not make sense in this example"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 95,
|
||
"id": "19fdac25",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>overnight</th>\n",
|
||
" <th>1 week</th>\n",
|
||
" <th>2 weeks</th>\n",
|
||
" <th>1 months</th>\n",
|
||
" <th>2 months</th>\n",
|
||
" <th>3 months</th>\n",
|
||
" <th>6 months</th>\n",
|
||
" <th>12 months</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>date</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-01</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11893</td>\n",
|
||
" <td>0.15679</td>\n",
|
||
" <td>0.31571</td>\n",
|
||
" <td>0.71429</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-02</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11893</td>\n",
|
||
" <td>0.15679</td>\n",
|
||
" <td>0.31571</td>\n",
|
||
" <td>0.71429</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-03</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11893</td>\n",
|
||
" <td>0.15679</td>\n",
|
||
" <td>0.31571</td>\n",
|
||
" <td>0.71429</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-04</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11893</td>\n",
|
||
" <td>0.15679</td>\n",
|
||
" <td>0.31571</td>\n",
|
||
" <td>0.71429</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-05</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.05000</td>\n",
|
||
" <td>0.07964</td>\n",
|
||
" <td>0.11000</td>\n",
|
||
" <td>0.15000</td>\n",
|
||
" <td>0.29929</td>\n",
|
||
" <td>0.68929</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-06</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04900</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.08000</td>\n",
|
||
" <td>0.11000</td>\n",
|
||
" <td>0.14000</td>\n",
|
||
" <td>0.28000</td>\n",
|
||
" <td>0.66857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-07</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-08</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06964</td>\n",
|
||
" <td>0.10000</td>\n",
|
||
" <td>0.13000</td>\n",
|
||
" <td>0.26000</td>\n",
|
||
" <td>0.62857</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-09</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06036</td>\n",
|
||
" <td>0.09000</td>\n",
|
||
" <td>0.12000</td>\n",
|
||
" <td>0.24000</td>\n",
|
||
" <td>0.57000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-10</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06036</td>\n",
|
||
" <td>0.09000</td>\n",
|
||
" <td>0.12000</td>\n",
|
||
" <td>0.24000</td>\n",
|
||
" <td>0.57000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2010-01-11</th>\n",
|
||
" <td>0.03</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.04971</td>\n",
|
||
" <td>0.06036</td>\n",
|
||
" <td>0.09000</td>\n",
|
||
" <td>0.12000</td>\n",
|
||
" <td>0.24000</td>\n",
|
||
" <td>0.57000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" overnight 1 week 2 weeks 1 months 2 months 3 months \\\n",
|
||
"date \n",
|
||
"2010-01-01 0.03 0.04971 0.05000 0.07964 0.11893 0.15679 \n",
|
||
"2010-01-02 0.03 0.04971 0.05000 0.07964 0.11893 0.15679 \n",
|
||
"2010-01-03 0.03 0.04971 0.05000 0.07964 0.11893 0.15679 \n",
|
||
"2010-01-04 0.03 0.04971 0.05000 0.07964 0.11893 0.15679 \n",
|
||
"2010-01-05 0.03 0.04971 0.05000 0.07964 0.11000 0.15000 \n",
|
||
"2010-01-06 0.03 0.04900 0.04971 0.08000 0.11000 0.14000 \n",
|
||
"2010-01-07 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-08 0.03 0.04971 0.04971 0.06964 0.10000 0.13000 \n",
|
||
"2010-01-09 0.03 0.04971 0.04971 0.06036 0.09000 0.12000 \n",
|
||
"2010-01-10 0.03 0.04971 0.04971 0.06036 0.09000 0.12000 \n",
|
||
"2010-01-11 0.03 0.04971 0.04971 0.06036 0.09000 0.12000 \n",
|
||
"\n",
|
||
" 6 months 12 months \n",
|
||
"date \n",
|
||
"2010-01-01 0.31571 0.71429 \n",
|
||
"2010-01-02 0.31571 0.71429 \n",
|
||
"2010-01-03 0.31571 0.71429 \n",
|
||
"2010-01-04 0.31571 0.71429 \n",
|
||
"2010-01-05 0.29929 0.68929 \n",
|
||
"2010-01-06 0.28000 0.66857 \n",
|
||
"2010-01-07 0.26000 0.62857 \n",
|
||
"2010-01-08 0.26000 0.62857 \n",
|
||
"2010-01-09 0.24000 0.57000 \n",
|
||
"2010-01-10 0.24000 0.57000 \n",
|
||
"2010-01-11 0.24000 0.57000 "
|
||
]
|
||
},
|
||
"execution_count": 95,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"hibor.fillna(method='bfill')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "785c9ec4",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 9. Export CSV"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "fd978fcf",
|
||
"metadata": {},
|
||
"source": [
|
||
"Export dataframe to a csv. Remember don't override the original file!"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 46,
|
||
"id": "3e53c244",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"aapl_proper_index.to_csv(\"AAPL_new.csv\")"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3 (ipykernel)",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.11.0"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 5
|
||
}
|