Fall 2022 198:142
Data 101 - Data Literacy

Tuesdays-Thursdays, 5:40-7:00pm, ARC-103, Busch Campus

Instructor: Amélie Marian (amelie@cs)

Recitations:

Office hours:
Amélie Marian: Tuesdays 2-3pm CoRE 324

Mondays 3pm-4pm Ekta Dhobley, https://rutgers.webex.com/rutgers/j.php?MTID=m8199e5af516b016c65efb7dca473bdc1
Mondays 7:30-8:30pm Aditya Maheshwari, RUTCOR 111
Tuesdays 8-9pm Aditya Maheshwari, https://rutgers.zoom.us/j/97880854475?pwd=aEVyOG95bEt1NFRuWGtqcDFlK0FBQT09
Wednesdays 9:45-10:45am Jash Gagliani, RUTCOR 111
Wednesdays 10:45-11:45am Neel Doshi, RUTCOR 111
Thursdays 7:30-8:30pm Jash Gagliani, https://rutgers.webex.com/meet/jg1700
Fridays 10:30-11:30am Neel Doshi, https://rutgers.zoom.us/j/2903166697?pwd=ZlkxaXl6VmZEOXZVU1pJZDVTSmo0Zz09
Fridays 4-5pm Rohit Upadhyay, RUTCOR 111
Fridays 4-5pm Janish Parikh, https://rutgers.webex.com/rutgers/j.php?MTID=m8403f0f94038e2dcf8cc54f4c71871f0
Saturdays 9-10am Kunj Mehta, https://rutgers.webex.com/meet/kcm161


Announcements

Recitations will start on 9/12
Class Announcements will be posted via Canvas. If you are registered for the course and do not see the course on Canvas (once the semester has started), please contact the instructor.



Course Description

"Big Data," algorithms, and statistics are everywhere today. But how do you tell good data from bad? Misinformation from useful analysis? And who owns the information about our lives and decisions?

Data 101 will help you improve your data literacy and develop a healthy skepticism about empirical claims presented in the popular media. We will explore examples of erroneous, rushed and ad hoc conclusions based on so-called "big data," and you will get hands-on experience analyzing and using data to make persuasive arguments. You will also learn to make more informed decisions about what you find and share online. Along the way, you will learn fundamental concepts in statistics and probability and acquire basic programming skills that will benefit you in your future coursework and beyond.

This course is recommended for students from all schools and disciplines. (The course does require placement into Intermediate Algebra or above, or credit for 01:640:025.) Data 101 can be used to meet the SAS Core Curriculum goals in 21st Century Challenges [21C], Quantitative and Formal Reasoning [QQ or QR], and Information Technology and Research [ITR].

Prerequisite: Some math knowledge, placement into Intermediate Algebra or above, or credit for 01:640:025


Grading

Grading will be based on weekly assignments (60%), a midterm (20%), a final project (15%), and participation quizzes (5%).
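As an illustration of how these weights combine, here is a minimal sketch in Python. It assumes each component is scored as a percentage from 0 to 100 and that the final score is a plain weighted sum. The syllabus does not specify how scores within a component are aggregated or how the total maps to a letter grade, so the function name, the component names, and the sample scores are hypothetical.

```python
# Weights as stated in the grading policy (fractions of the final grade).
WEIGHTS = {
    "assignments": 0.60,
    "midterm": 0.20,
    "final_project": 0.15,
    "participation": 0.05,
}


def course_score(scores):
    """Return the weighted course score on a 0-100 scale.

    `scores` maps each component name to a percentage (0-100).
    Assumption: a simple weighted sum; the actual grading scheme may differ.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {
    "assignments": 90,
    "midterm": 80,
    "final_project": 85,
    "participation": 100,
}
print(round(course_score(example), 2))
```

With these sample scores the contributions are 54 + 16 + 12.75 + 5, so the weekly assignments account for most of the total.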


Readings

Readings will be posted on Canvas.


Schedule (tentative)

Date

Topics

Assignments

Part 1: Manipulating Data: Introduction to R

Tue September 6

Introduction.
Data Literacy - what is it?
Readings: The Fine Art of Baloney Detection, Carl Sagan

Thu September 8
Tue September 13
Thu September 15
Tue September 20

Introduction to R. Making plots.
Data Manipulation: Exploring and Transforming Data.
Resource: R Cookbook, 2nd Edition, James (JD) Long, Paul Teetor
R for Data Science, Garrett Grolemund, Hadley Wickham

Assignment 1: Find Interesting Data
Due Monday September 19

Assignment 2: Plot your Data
Due Monday September 26

Assignment 3: Explore a Dataset
Due Monday October 3


Thu September 22
Tue September 27

Data Visualization.
Best and worst practices.
(Optional) reading: Data Visualization: A Practical Introduction, Kieran Healy

Assignment 4: Lie with Data
Due Monday October 10

Part 2: Understanding Data: Statistical Analysis

Thu September 29
Tue October 4
Thu October 6
Tue October 11

Statistics: Data Distribution, z-values.
Hypothesis Testing, p-values.
Permutation Test.
Resource: Introduction to Probability and Statistics Using R, G. Jay Kerns

Assignment 5: Test Hypotheses
Due Monday October 17

Thu October 13
Tue October 18
Thu October 20

P-value hunting and overfitting.
Law of small numbers.

Assignment 6: Data-based Arguments
Due Monday October 24

Assignment 7: Creating a Data 101 Dataset
Due Monday October 24

Assignment 8: De-anonymize Data
Due Monday November 7

Tue October 25
Thu October 27

Bayesian Reasoning. Bayesian Approach.

Assignment 9: Prior and Posterior Beliefs
Due Monday November 14

Tue November 1

Midterm review

Thu November 3

Midterm Exam (tentative)

Part 4a: Trusting Data; Data and Society

Tue November 8

Personal Data.
Anonymization.

Part 3: Predicting Data: Identifying patterns

Thu November 10
Tue November 15
Thu November 17

Introduction to Prediction.
Decision Trees.
Prediction methods: inference, recommendations

Assignment 10: Decision Trees
Due Monday November 21

Assignment 11: Prediction Challenge 1
Due Monday November 28

Assignment 12: Prediction Challenge 2
Due Monday December 5

Tue November 22

Aggregate Data and Paradoxes.

Thanksgiving break - No recitation this week

Tue November 29

Prediction methods: Regression, inference, recommendations

Part 4b: Trusting Data; Data and Society

Thu December 1
Tue December 6
Thu December 8

Fairness and Bias
Transparency and Accountability
Pitfalls of Data Analysis
Statistical Traps
Power Law. Zipf Law.
Black Swan.

Project discussions

Final Project
Due Monday December 12

Tue December 13

Wrapping up: Prediction competition and Project discussion

Assignment 13 (optional): Prediction Challenge 3
Due Monday December 19


Course Policies

Attendance and Participation

Students are expected to be present in class and participate in the discussions, but should prioritize their health and safety.
If you cannot attend class, please let the instructor know. Students will not be penalized for notified absences.

Disability Accommodations

Students in need of disability accommodations should register for accommodations and consult the policies and procedures on the Office of Disability Services website: https://ods.rutgers.edu

Civility

Some topics covered in the class will relate to ethics and fairness in data management and decision systems. Students are expected to behave in a respectful manner towards everyone in the course, to ensure that all participants in the class feel welcome and supported.

Academic Integrity Policy

Rutgers University takes academic dishonesty very seriously. By enrolling in this course, you assume responsibility for familiarizing yourself with the Academic Integrity Policy and the possible penalties (including suspension and expulsion) for violating the policy. As per the policy, all suspected violations will be reported to the Office of Student Conduct. Academic dishonesty includes (but is not limited to): If you are ever in doubt, consult your instructor.
Please familiarize yourself with the University Academic Integrity Policy: http://nbacademicintegrity.rutgers.edu/

Student Support and Mental Wellness

In the last few years, we have all been going through a lot, individually and together. It is important to acknowledge that events and circumstances outside of the classroom can impact our ability to be present and engaged at any given moment. At Rutgers, we are focused on the whole student. If, at any point, you experience anything impacting your performance or ability to participate in this class, please reach out to me. Please also see the academic, health, and mental wellness resources on the syllabus as well as others searchable at https://success.rutgers.edu/ for further support.

Additional support resources: