user: kara1006kt24329, kkkkkkyyyy, tunmnlu

tlou31

Deadline: 16-Sep-2023

### Q1 HKD500

### Q2 HKD500

### Q3 HKD300

### Q4 HKD200

### Q5 HKD200

### Questions to ask

- Do they know how to use VirtualBox?
- Setup?
- Is the quotation OK?
- GT username

## Excerpts from the materials

### Q1 [40 points] Collect data from TMDb to build a co-actor network

- Requirements:
  - Python 3.7.x only ([standard library docs](https://docs.python.org/3/library/))
  - TMDb API version 3
  - Max runtime: 10 minutes
- Deliverables:
  - Q1.py: The completed Python file
  - nodes.csv: The csv file containing nodes
  - edges.csv: The csv file containing edges
- At a glance:
  - Q1.py contains the Graph class, the TMDbAPIUtils class, and one global function.
  - The Graph class will serve as a re-usable way to represent and write out your collected graph data.
  - The TMDbAPIUtils class will be used to work with the TMDb API for data retrieval.
- Tasks and point breakdown
  - [10 pts] Implementation of the Graph class according to the instructions in Q1.py.
  - [10 pts] Implementation of the TMDbAPIUtils class according to instructions in Q1.py.
  - [20 pts] Producing correct nodes.csv and edges.csv.
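
As a rough picture of what "represent and write out your collected graph data" can mean, here is a minimal sketch. Everything in it is an assumption for illustration: the class name `CoActorGraph`, its method names, and the `id,name` and `source,target` headers are hypothetical, and the real interface is whatever Q1.py specifies.

```python
import csv


class CoActorGraph:
    """Hypothetical stand-in for a graph of actors; not the Q1.py interface."""

    def __init__(self):
        self.nodes = {}     # actor id (as text) -> actor name
        self.edges = set()  # undirected edges stored as sorted (id, id) pairs

    def add_node(self, actor_id, name):
        # Keep the first name seen for an id so repeated adds are harmless.
        self.nodes.setdefault(str(actor_id), name)

    def add_edge(self, source, target):
        source, target = str(source), str(target)
        if source == target:
            return  # ignore self-loops
        # Sorting the pair makes (a, b) and (b, a) the same edge.
        self.edges.add(tuple(sorted((source, target))))

    def write_nodes(self, path):
        # newline="" lets the csv module control line endings itself.
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["id", "name"])  # assumed header
            writer.writerows(sorted(self.nodes.items()))

    def write_edges(self, path):
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["source", "target"])  # assumed header
            writer.writerows(sorted(self.edges))
```

Storing edges as sorted pairs in a set is one way to avoid writing the same co-actor link twice; whether duplicates and self-loops should be dropped is up to the instructions in Q1.py.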

### Q2 [35 points] SQLite

- Construct a TMDb database in SQLite.
- Technology:
  - SQLite release 3.22
  - Python 3.6.x only
  - Do not modify import statements.
- Max runtime: 10 minutes
- Deliverables:
  - Q2.py: Modified file containing all the SQL statements you have used to answer parts a-h in the proper sequence.
- Tasks and point breakdown
  - [9 points] Create tables and import data.
    - [2 points] Create two tables
    - [2 points] Import the provided movies.csv file into the movies table and movie_cast.csv into the movie_cast table
    - [5 points] Vertical Database Partitioning.
  - [1 point] Create indexes.
  - [3 points] Calculate a proportion.
  - [4 points] Find the most prolific actors
  - [4 points] Identify the highest scoring movies while favoring small cast size.
  - [4 points] Get high scoring actors.
  - [6 points] Creating views. Create a view (virtual table) called good_collaboration
  - [4 points] SQLite supports simple but powerful Full Text Search (FTS)
    - [1 point] Count the number of movies whose overview field
    - [2 points] Count the number of movies that contain the terms ‘space’ and ‘program’ in the overview field
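
The general shape of the first few steps (create a table, import CSV rows, add an index, define a view, compute a proportion) can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the column names, the made-up rows, the index and view names, and the threshold of 7 are all hypothetical and are not taken from the assignment.

```python
import csv
import io
import sqlite3

# Made-up rows standing in for a CSV file with columns id, title, score.
MOVIES_CSV = "1,Alpha,7.5\n2,Beta,4.0\n3,Gamma,9.1\n"


def build_demo_db():
    """Create an in-memory database with one table, one index, and one view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movies (id INTEGER, title TEXT, score REAL)")
    # csv.reader yields lists of strings; SQLite's column affinity converts
    # numeric-looking text to INTEGER/REAL when it is stored.
    rows = csv.reader(io.StringIO(MOVIES_CSV))
    conn.executemany("INSERT INTO movies VALUES (?, ?, ?)", rows)
    conn.execute("CREATE INDEX movie_index ON movies (id)")
    conn.execute(
        "CREATE VIEW high_scoring AS "
        "SELECT id, title FROM movies WHERE score >= 7"
    )
    conn.commit()
    return conn


def percent_at_or_above(conn, threshold):
    """Percentage of movies whose score is >= threshold, rounded to 2 places."""
    query = "SELECT ROUND(100.0 * SUM(score >= ?) / COUNT(*), 2) FROM movies"
    (value,) = conn.execute(query, (threshold,)).fetchone()
    return value
```

The proportion query relies on SQLite evaluating a comparison to 0 or 1, so `SUM(score >= ?)` counts the matching rows; multiplying by `100.0` forces floating-point division.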

### Q3 [15 points] D3 (V5) Warmup

- Visualize temporal trends in movie releases using D3 to showcase
- D3 Version 5 (included in the lib folder)
- Chrome 97.0 (or newer): the browser for grading your code
- Python http server (for local testing)
- D3 library is provided to you in the lib folder. You must NOT use any D3 libraries (d3*.js) other than the ones provided.
- [Gradescope] Q3.html: Modified file containing all HTML, JavaScript, and any CSS code required to produce the bar plot. Do not include the D3 libraries or q3.csv dataset.
- Tasks and point breakdown
  - [3.5 points] The bar plot must display one bar per row in the q3.csv dataset.
  - [1 point] The bars must have the same fixed width, and there must be some space between two bars, so that the bars do not overlap.
  - [3 points] The plot must have visible X and Y axes that scale according to the generated bars.
  - [2 points] Set x-axis label to ‘Year’ and y-axis label to ‘Running Total’. The x-axis label must be a `<text>` element having the id: “x_axis_label” and the y-axis label must be a `<text>` element having the id: “y_axis_label”.
  - [1 point] Use a linear scale for the Y axis to represent the running total
  - [3 points] Use a time scale for the x-axis to represent year
  - [1 point] Set the HTML title tag and display a title for the plot.
  - [0.5 points] Add your GT username (usually includes a mix of letters and numbers) to the area beneath the bottom-right of the plot
- Caution: the autograder requires a specific DOM structure.
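
For the "Python http server (for local testing)" item, a page that loads a CSV file generally needs to be opened over HTTP rather than as a local file. Below is a minimal sketch of serving a folder from Python's standard library; the helper name `serve_in_background` is hypothetical, and binding to `127.0.0.1` with an OS-chosen port is an assumption made for this example.

```python
import functools
import http.server
import threading


def serve_in_background(directory, port=0):
    """Serve `directory` over HTTP on localhost from a daemon thread.

    port=0 asks the OS for any free port; the chosen port is available
    afterwards as server.server_address[1].
    """
    # The `directory` argument of SimpleHTTPRequestHandler needs Python 3.7+.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Running `python3 -m http.server 8000` from the folder that holds Q3.html is the usual command-line equivalent; the page is then reachable at `http://localhost:8000/Q3.html`.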

### Q4 [5 points] OpenRefine

- https://openrefine.org/
- Deliverables
  - properties_clean.csv: Export the final table as a csv file.
  - changes.json: Submit a list of changes made to the file in JSON format. Go to the 'Undo/Redo' tab -> 'Extract' -> 'Export'. This downloads 'history.json'. Rename it to 'changes.json'.
  - Q4Observations.txt: A text file with answers to parts c.i, c.ii, c.iii, c.iv, c.v, c.vi. Provide each answer on a new line in the output format specified.

### Q5 [5 points] Introduction to Python Flask

- wrangling_scripts/Q5.py
- Build a web application that displays a table of TMDb data
- Python 3.7.x only
- Deliverables
  - [Gradescope] Q5.py: Completed Python file with your changes
  - username(): Update the username() method inside Q5.py by including your GTUsername.

You must solve the following 2 sub-questions: