FIFA Dataset ETL & Analysis Project

  • Tech Stack: Python, Hadoop, Spark, Hive, Superset.
  • GitHub URL: Project Link

This project extracts, transforms, and loads (ETL) data from CSV files into Hive. The loaded data is then analyzed with Superset.

Users can set up a big data stack that runs locally. The project provides an example PySpark project structure, shows how to build dependencies and run spark-submit locally with HDFS and Hive integration, and includes data modeling for the raw dataset.
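
The flow described above might look roughly like the sketch below. It is an illustration only, not the project's actual code: the function names, the app name "fifa-etl", and every path and table name passed in are hypothetical. PySpark is imported inside run_etl so the two helper functions can be used without a Spark installation.

```python
import re
from typing import List


def normalize_column(name: str) -> str:
    """Turn a raw CSV header such as 'Short Name' into a Hive-friendly 'short_name'."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def build_spark_submit(app: str, deps_zip: str, master: str = "local[*]") -> List[str]:
    """Assemble a spark-submit command for a local run.

    deps_zip is a hypothetical archive of the project's Python dependencies.
    """
    return ["spark-submit", "--master", master, "--py-files", deps_zip, app]


def run_etl(csv_path: str, table: str) -> None:
    """Read a CSV file and save it as a Hive table (requires pyspark and a Hive metastore)."""
    # Imported lazily so the helpers above work without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("fifa-etl")  # hypothetical application name
        .enableHiveSupport()
        .getOrCreate()
    )
    raw = spark.read.csv(csv_path, header=True, inferSchema=True)
    # Rename every column so the names are valid, consistent Hive identifiers.
    renamed = raw.toDF(*[normalize_column(c) for c in raw.columns])
    renamed.write.mode("overwrite").saveAsTable(table)
    spark.stop()
```

With this layout, a local run would call run_etl from the script passed to spark-submit, for example with a CSV path on HDFS and a target Hive table name chosen by the user.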