user: kara1006kt24329, kkkkkkyyyy, tunmnlu
tlou31
Deadline: 16-Sep-2023
Q1 HKD500
Q2 HKD500
Q3 HKD300
Q4 HKD200
Q5 HKD200
Questions to ask:
- Do you know how to use VirtualBox?
- Setup?
- Is the quotation OK?
- GT username
slices from the materials
Q1 [40 points] Collect data from TMDb to build a co-actor network
- Requirements:
- Python 3.7.x only (standard library reference: https://docs.python.org/3/library/)
- TMDb API version 3
- Max runtime: 10 minutes
- Deliverables:
- Q1.py: The completed Python file
- nodes.csv: The csv file containing nodes
- edges.csv: The csv file containing edges
- glance:
- the TMDbAPIUtils class, and the one global function.
- The Graph class will serve as a re-usable way to represent and write out your collected graph data.
- The TMDbAPIUtils class will be used to work with the TMDB API for data retrieval.
- Tasks and point breakdown
- [10 pts] Implementation of the Graph class according to the instructions in Q1.py.
- [10 pts] Implementation of the TMDbAPIUtils class according to instructions in Q1.py.
- [20 pts] Producing correct nodes.csv and edges.csv.
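A minimal sketch of what a reusable Graph class that writes nodes.csv and edges.csv could look like. The method names, the CSV headers ("id,name" and "source,target"), and the choice to treat the graph as undirected without self-loops are assumptions for illustration; the actual signatures and rules are whatever Q1.py specifies.

```python
import csv


class Graph:
    """Hypothetical co-actor graph: nodes are (id, name), edges are (source, target)."""

    def __init__(self):
        self.nodes = []  # list of (id, name) tuples, in insertion order
        self.edges = []  # list of (source_id, target_id) tuples

    def add_node(self, node_id, name):
        node_id = str(node_id)
        # Keep each actor id only once.
        if node_id not in {existing_id for existing_id, _ in self.nodes}:
            self.nodes.append((node_id, name))

    def add_edge(self, source, target):
        source, target = str(source), str(target)
        if source == target:
            return  # assumption: no self-loops
        # Assumption: undirected, so (a, b) and (b, a) are the same edge.
        if (source, target) in self.edges or (target, source) in self.edges:
            return
        self.edges.append((source, target))

    @staticmethod
    def _write_csv(path, header, rows):
        # newline="" lets the csv module control line endings.
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(rows)

    def write_nodes_file(self, path):
        self._write_csv(path, ["id", "name"], self.nodes)

    def write_edges_file(self, path):
        self._write_csv(path, ["source", "target"], self.edges)
```

Typical use would be to call add_node and add_edge while walking the TMDb cast data, then call write_nodes_file("nodes.csv") and write_edges_file("edges.csv") once at the end.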
Q2 [35 points] SQLite
- Construct a TMDb database in SQLite.
- Technology:
- SQLite release 3.22
- Python 3.6.x only
- Do not modify import statements.
- Max runtime: 10 minutes
- Deliverables:
- Q2.py: Modified file containing all the SQL statements you have used to answer parts a - h in the proper sequence.
- Tasks and point breakdown
- [9 points] Create tables and import data.
- [2 points] Create two tables
- [2 points] Import the provided movies.csv file into the movies table and movie_cast.csv into the movie_cast table
- [5 points] Vertical Database Partitioning.
- [1 point] Create indexes.
- [3 points] Calculate a proportion.
- [4 points] Find the most prolific actors
- [4 points] Identify the highest scoring movies while favoring small cast size.
- [4 points] Get high scoring actors.
- [6 points] Creating views. Create a view (virtual table) called good_collaboration
- [4 points] SQLite supports simple but powerful Full Text Search (FTS)
- [1 point] Count the number of movies whose overview field
- [2 points] Count the number of movies that contain the terms ‘space’ and ‘program’ in the overview field
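The table, index, and full-text-search tasks above can be sketched with Python's built-in sqlite3 module. Everything specific here is an assumption for illustration: the column layouts, the index names, the inline sample rows standing in for movies.csv and movie_cast.csv, and the use of the fts4 module (which may not be compiled into every SQLite build, hence the fallback). The real schema and queries are defined by Q2.py.

```python
import csv
import io
import sqlite3

# Placeholder rows standing in for the provided CSV files (not real data).
MOVIES_CSV = "1,Apollo Story,7.5,A space program drama\n2,Quiet Town,6.0,A small town mystery\n"
CAST_CSV = "1,10,Actor A\n1,11,Actor B\n2,10,Actor A\n"


def build_db(conn):
    """Create the two tables, import the rows, and add indexes."""
    conn.execute("CREATE TABLE movies (id INTEGER, title TEXT, score REAL, overview TEXT)")
    conn.execute("CREATE TABLE movie_cast (movie_id INTEGER, cast_id INTEGER, cast_name TEXT)")
    # csv.reader yields lists of strings; column affinity converts them on insert.
    conn.executemany("INSERT INTO movies VALUES (?, ?, ?, ?)", csv.reader(io.StringIO(MOVIES_CSV)))
    conn.executemany("INSERT INTO movie_cast VALUES (?, ?, ?)", csv.reader(io.StringIO(CAST_CSV)))
    conn.execute("CREATE INDEX movie_index ON movies (id)")
    conn.execute("CREATE INDEX cast_index ON movie_cast (cast_id)")
    conn.commit()


def most_prolific(conn, limit=10):
    """Actors ordered by number of movie appearances, ties broken by name."""
    return conn.execute(
        "SELECT cast_name, COUNT(*) AS appearances FROM movie_cast "
        "GROUP BY cast_id, cast_name "
        "ORDER BY appearances DESC, cast_name ASC LIMIT ?",
        (limit,),
    ).fetchall()


def count_overview_matches(conn, query):
    """Count overviews matching an FTS query; returns None if fts4 is unavailable."""
    try:
        conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS movie_overview USING fts4(id, overview)")
    except sqlite3.OperationalError:
        return None
    conn.execute("DELETE FROM movie_overview")
    conn.execute("INSERT INTO movie_overview (id, overview) SELECT id, overview FROM movies")
    # Space-separated terms in a MATCH query must all be present.
    row = conn.execute(
        "SELECT COUNT(*) FROM movie_overview WHERE overview MATCH ?", (query,)
    ).fetchone()
    return row[0]
```

With a connection from sqlite3.connect(":memory:"), calling build_db(conn) and then count_overview_matches(conn, "space program") counts the overviews that contain both terms.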
Q3 [15 points] D3 (V5) Warmup
- Visualize temporal trends in movie releases using D3 to showcase
- D3 Version 5 (included in the lib folder)
- Chrome 97.0 (or newer): the browser for grading your code
- Python http server (for local testing)
- D3 library is provided to you in the lib folder. You must NOT use any D3 libraries (d3*.js) other than the ones provided.
- [Gradescope] Q3.html: Modified file containing all html, javascript, and any css code required to produce the bar plot. Do not include the D3 libraries or q3.csv dataset.
- Tasks and point breakdown
- [3.5 points] The bar plot must display one bar per row in the q3.csv dataset.
- [1 point] The bars must have the same fixed width, and there must be some space between two bars, so that the bars do not overlap.
- [3 points] The plot must have visible X and Y axes that scale according to the generated bars.
- [2 points] Set x-axis label to ‘Year’ and y-axis label to ‘Running Total’. The x-axis label must be a <text> element having the id: “x_axis_label” and the y-axis label must be a <text> element having the id: “y_axis_label”.
- [1 point] Use a linear scale for the Y axis to represent the running total
- [3 points] Use a time scale for the x-axis to represent year
- [1 point] Set the HTML title tag and display a title for the plot.
- [0.5 points] Add your GT username (usually includes a mix of letters and numbers) to the area beneath the bottom-right of the plot
- Caution: the autograder requires a specific DOM structure.
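For the local testing with a Python http server mentioned in the Q3 requirements, one possible sketch uses only the standard library. The helper name make_server, the default port 8000, and binding to 127.0.0.1 are assumptions for illustration; running `python -m http.server` from the Q3 folder achieves the same thing.

```python
import functools
import http.server


def make_server(directory=".", port=8000):
    """Return an HTTP server that serves static files from `directory`."""
    # The `directory` argument of SimpleHTTPRequestHandler needs Python 3.7+.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.HTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted with Ctrl+C):
#   server = make_server(".", 8000)
#   server.serve_forever()
```

Serving the page over HTTP rather than opening Q3.html directly as a file avoids browser restrictions on loading q3.csv from a file:// URL.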
Q4 [5 points] OpenRefine
- https://openrefine.org/
- Deliverables
- properties_clean.csv : Export the final table as a csv file.
- changes.json : Submit the list of changes made to the file in JSON format. Go to 'Undo/Redo' Tab -> 'Extract' -> 'Export'. This downloads 'history.json'. Rename it to 'changes.json'.
- Q4Observations.txt : A text file with answers to parts c.i, c.ii, c.iii, c.iv, c.v, c.vi. Provide each answer in a new line in the output format specified.
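The history.json to changes.json step can also be done with a small script that checks the export parses as JSON before copying it. This is only a sketch: the helper name save_changes is hypothetical, and it assumes the exported file is a JSON list with one entry per operation.

```python
import json
import shutil


def save_changes(history_path="history.json", changes_path="changes.json"):
    """Validate the exported history and copy it to the deliverable name."""
    with open(history_path, encoding="utf-8") as f:
        operations = json.load(f)  # raises ValueError if the file is not valid JSON
    shutil.copyfile(history_path, changes_path)
    # Assumption: the export is a list, so this is the number of recorded operations.
    return len(operations)
```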
Q5 [5 points] Introduction to Python Flask
- wrangling_scripts/Q5.py
- Build a web application that displays a table of TMDb data
- Python 3.7.x only
- Deliverables:
- [Gradescope] Q5.py: Completed Python file with your changes.
- username(): Update the username() method inside Q5.py by including your GT username.
You must solve the following 2 sub-questions: