SDSU CS 649 Big Data: Tools and Methods
Spring Semester, 2022
Lecture Notes
San Diego State University -- This page last updated 3-May-22

This page contains links to lecture notes for the CS 649 Big Data: Tools and Methods course. This page will be updated as more notes become available.

Lecture Notes By Topic
  1. Course Introduction
  2. Big Data Intro
  3. Python
  4. SciPy
  5. Pandas Series notebook zipped slides
  6. Dataframe notebook zipped slides
  7. Data Manipulation notebook zipped slides
  8. Plotting notebook zipped slides
  9. Dashboards
  10. Spark Intro
  11. Spark-Pandas API
  12. PySpark 2
  13. Statistics, Sampling, Bloom
  14. Memoization
  15. Pandas Alternatives
  16. Regression
  17. Assignment 1
  18. Scikit Learn, Bayes
  19. Clustering
  20. Spark ML
  21. Assignment 2 Dates
  22. Spark Clustering
  23. Running Spark
  24. Running Spark, Partition
  25. NoSQL, Cassandra
  26. Kafka
  27. Kafka, Spark Streaming
  28. Exam Comments
  29. Kafka Pipelines, Microservices
  30. End Remarks

Lecture Video By Date
Tuesday                              | Thursday
Jan 18                               | Jan 20  Course Intro
Jan 25  Big Data Intro               | Jan 27  Python, SciPy, Pandas Series
Feb 1   Dataframe, Data Manipulation | Feb 3   Data Manipulation
Feb 8   Plotting                     | Feb 10  Dashboards, Spark Intro
Feb 15  Spark-Pandas API, PySpark 2  | Feb 17  PySpark 2, Statistics
Feb 22  Statistics, Sampling         | Feb 24  Sampling, Bloom, Pandas Alternatives
Mar 1   Pandas Alternatives          | Mar 3   Regression
Mar 8   Assignment 1, Regression     | Mar 10  Regression, Scikit Learn, Bayes
Mar 15  Clustering                   | Mar 17  Spark ML
Mar 22  Spark ML                     | Mar 24  Spark Clustering
Mar 29  No Class Spring Break        | Mar 31  No Class Spring Break
Apr 5                                | Apr 7
Apr 12  Running Spark                | Apr 14  Running Spark, Partition
Apr 19  NoSQL, Cassandra             | Apr 21  Cassandra
Apr 26  Kafka                        | Apr 28  Kafka, Spark Streaming
May 3                                | May 5
May 10                               | May 12  Project Due