# DVA-CSE6242-Spring2021

Assignments from the CSE6242 course at Georgia Tech

This repository contains assignments from the CSE6242 course that I took in Spring 2021 at Georgia Tech. In this class, I worked on building the following skills:

- Data Collection, Validation, Preprocessing and Analysis using Python
- Data Visualization using Scala, JavaScript D3
- Clustering Methods
- Scalable Computing using Pig, Hive, Spark, AWS, Hadoop
- Graph Analytics using PageRank Algorithms
- Real-world data collection and analysis
- Capstone Project

The course homepage and syllabus can be viewed here: [Course Website](https://poloclub.github.io/cse6242-2020spring-campus/)