Brief
You are required to collect open data and real-time data.
Part 1: Data analysis in a Jupyter notebook, with suggested actionable items
- Download the "Top 200 common passwords by country 2021" dataset from www.kaggle.com
- Manipulate and rearrange the data if necessary
- Visualize the data with 8 or more charts, written in Python in a Jupyter notebook.
- A sunburst chart, a heat map, and a pair plot must be included.
- At least one 3D chart is required.
- At least one map, such as a Plotly choropleth map, must be included.
- Analyze the charts (and data) to reveal facts about the data.
- Provide insights and suggest actionable items.
- (You may add other related data set(s) to enrich your insights and suggestions.)
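The "manipulate and rearrange" step above can be sketched as follows. This is only a minimal illustration, not a required approach: the column names (country, Password, User_count) are assumptions about the Kaggle CSV and should be checked against the file you actually download, the helper function names are hypothetical, and the four sample rows are made up purely to show the shape of the data. It assumes pandas is installed, and Plotly only if the sunburst helper is called.

```python
import pandas as pd

# Assumed column names; verify them against the downloaded CSV.
COUNTRY, PASSWORD, USERS = "country", "Password", "User_count"


def top_n_per_country(df: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Keep the n most-used passwords in each country."""
    ranked = df.sort_values([COUNTRY, USERS], ascending=[True, False])
    return ranked.groupby(COUNTRY, sort=False).head(n).reset_index(drop=True)


def heatmap_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Pivot to a country x password matrix of user counts (0 where absent)."""
    return df.pivot_table(
        index=COUNTRY, columns=PASSWORD, values=USERS,
        aggfunc="sum", fill_value=0,
    )


def sunburst_figure(df: pd.DataFrame):
    """Build a country -> password sunburst sized by user count."""
    # Imported here so the data helpers above work without Plotly installed.
    import plotly.express as px
    return px.sunburst(df, path=[COUNTRY, PASSWORD], values=USERS)


# Made-up rows for illustration only; replace with pd.read_csv(<your file>).
sample = pd.DataFrame({
    COUNTRY: ["A", "A", "B", "B"],
    PASSWORD: ["123456", "password", "123456", "qwerty"],
    USERS: [100, 50, 30, 70],
})
top = top_n_per_country(sample, n=1)
matrix = heatmap_matrix(sample)
```

In a notebook, the matrix can be passed to a heat map function (for example seaborn's heatmap), and sunburst_figure(top).show() renders the sunburst.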
Part 2: Real-time data processing and visualization, in Jupyter notebook.
- Collect real-time data from the HK Accident and Emergency (A&E) hospital waiting time API and store it in a NoSQL database (e.g., MongoDB).
- The data collection duration should be 3 or more days, within November and/or December.
- The collection frequency should be every 15 minutes or less.
- Create a Jupyter notebook that reads the data into a pandas DataFrame.
- (You may first export the data to a JSON file using MongoDB Compass.)
- Process and visualize the data.
- Produce 3 or more charts.
- You are also encouraged to use Python 3D visualization techniques.
- Analyze the charts (and data) to reveal some facts.
- Provide insights / comments / suggestions.
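The collection and loading steps in Part 2 could look roughly like the sketch below. Several things here are assumptions rather than part of the brief: the endpoint URL, the JSON field names (waitTime, hospName, topWait, updateTime), and the function names, all of which should be checked against the Hospital Authority's open-data documentation before use. The collection argument is expected to be a pymongo collection, and load_dataframe assumes pandas is installed.

```python
import json
import time
import urllib.request
from datetime import datetime, timezone

# Assumed endpoint and field names; confirm them in the API documentation.
API_URL = "https://www.ha.org.hk/opendata/aed/aedwtdata-en.json"
INTERVAL_SECONDS = 15 * 60  # the brief allows at most 15 minutes between samples


def fetch_payload(url: str = API_URL) -> dict:
    """Download one snapshot of the waiting-time feed as a dict."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def to_documents(payload: dict, fetched_at: datetime) -> list:
    """Flatten one API response into one document per hospital."""
    return [
        {
            "hospital": row.get("hospName"),
            "wait": row.get("topWait"),
            "api_update_time": payload.get("updateTime"),
            "fetched_at": fetched_at,
        }
        for row in payload.get("waitTime", [])
    ]


def collect_forever(collection, interval: int = INTERVAL_SECONDS) -> None:
    """Poll the API and insert each snapshot until the process is stopped."""
    while True:
        try:
            docs = to_documents(fetch_payload(), datetime.now(timezone.utc))
            if docs:
                collection.insert_many(docs)
        except Exception as exc:  # keep running through transient failures
            print(f"Fetch failed, retrying next cycle: {exc}")
        time.sleep(interval)


def load_dataframe(collection):
    """Read every stored document into a pandas DataFrame."""
    import pandas as pd  # imported lazily; only the notebook needs it
    return pd.DataFrame(list(collection.find({}, {"_id": 0})))
```

With pymongo installed and a local MongoDB server running, a collection can be obtained with MongoClient("mongodb://localhost:27017")["aed"]["wait_times"] (the database and collection names are placeholders) and passed to collect_forever in a long-running script, then to load_dataframe in the notebook. Storing the fetch timestamp alongside the API's own update time makes it easier to spot gaps or repeated snapshots later.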
Submission items should include:
- Exported collection(s) of the open data / samples of real-time data, from MongoDB
- Jupyter Notebooks that visualize and analyze the data sets, with summary, conclusions and so on in Markdown.
- A video that demonstrates the data collection process and presents all results and insights.
- Upload everything to Moodle 1 week after the last lesson.