Fall 2024 198:210
Data Management for Data Science

Tuesdays-Thursdays, 5:40-7:00pm, BE-100, Livingston Campus

Instructor: Amélie Marian
Office hours: Tuesdays 2-3pm CoRE 324 

Announcements

Class Announcements will be posted via Canvas. If you are registered for the course and do not see the course on Canvas (once the semester has started), please contact the instructor.
TA contact information and office hours are posted on Canvas.

9/2: Recitations will start on 9/9

 

Course Description

This course is designed to provide students with the knowledge and skills needed to acquire and curate real-world data, to explore the data to discover patterns and distributions, and to manage large datasets with databases.

Students will learn the minimal aspects of Python needed to acquire and curate datasets. Much of the work will be done with Python libraries that deliver maximum benefit with minimal programming effort. Students will learn to get data from various online sources, detect which aspects of the data are incomplete or unreliable and understand why, apply domain-independent and domain-dependent ways to curate the data, and get the curated data into a form that can be explored, managed, and analyzed. Students will also learn how to get datasets into database-ready form and do basic analysis of such datasets using relational databases and SQL and (time permitting) NoSQL databases.
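
As a rough illustration of the workflow described above (acquire, curate, load into a database, query with SQL), here is a minimal sketch using only the Python standard library. The data, the column names (city, temp_f), and the helper names are hypothetical and are not taken from the course materials; the course itself may use different libraries and datasets.

```python
import csv
import io
import sqlite3

# Hypothetical raw data; the second row has a missing temperature.
RAW_CSV = """city,temp_f
New Brunswick,71
Piscataway,
Newark,68
"""

def load_clean_rows(text):
    """Parse CSV text and keep only rows whose temperature is present and numeric."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        value = (record["temp_f"] or "").strip()
        try:
            rows.append((record["city"], float(value)))
        except ValueError:
            continue  # skip incomplete or unreliable entries
    return rows

def average_temp(rows):
    """Load the curated rows into an in-memory SQLite table and query it with SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE readings (city TEXT, temp_f REAL)")
        conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
        (avg,) = conn.execute("SELECT AVG(temp_f) FROM readings").fetchone()
        return avg
    finally:
        conn.close()

rows = load_clean_rows(RAW_CSV)
print(len(rows), average_temp(rows))
```

Dropping incomplete rows is only one possible curation choice; depending on the dataset, filling in or flagging missing values may be more appropriate.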

The course content is designed to be accessible to all SAS students regardless of their major. Some small amount of programming background is expected via CS 111 (Java) or CS 142 (R).

Recitations

Recitations will consist of labs, using Python Jupyter notebooks.

Section 1: Friday 10:35am-11:30am Livingston BE-250

Section 2: Friday 12:25pm-1:20pm Livingston BE-250

Section 3: Wednesday 5:55pm-6:50pm Livingston BRR-5105

Section 4: Wednesday 12:25pm-1:20pm Livingston TIL-226

Prerequisites

CS 111 or CS 142, or by permission of instructor.

 

Grading

Grading will be based on:

·       Participation Activities – 5%

·       Homework Assignments (3-4) – 30%

·       Online Quizzes (4) – 20%

·       Midterms (2) – 25%

·       Final Exam (1) – 20%

Regrade requests must be raised within one week of grades being returned. After one week, grades are considered final.
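
The weighting in the grading list above can be sketched as a short Python calculation. The weights come from the list; the function name, the assumption that each category is summarized as a single 0-100 average, and the example scores are hypothetical. The syllabus does not state how items within a category are combined or how scores map to letter grades.

```python
# Weights taken from the grading list above (they sum to 100%).
WEIGHTS = {
    "participation": 0.05,
    "homework": 0.30,
    "quizzes": 0.20,
    "midterms": 0.25,
    "final": 0.20,
}

def course_score(averages):
    """Return the weighted course score given per-category averages on a 0-100 scale."""
    missing = set(WEIGHTS) - set(averages)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[name] * averages[name] for name in WEIGHTS)

# Hypothetical averages, for illustration only.
example = {
    "participation": 100,
    "homework": 90,
    "quizzes": 80,
    "midterms": 70,
    "final": 85,
}
print(round(course_score(example), 2))  # 5 + 27 + 16 + 17.5 + 17 = 82.5
```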

 

Readings

The course will use a Zybook for reading and participation activities.

 

Schedule (To be Confirmed)

 

Date

Topics

Homework and Quizzes

Part 1: Introduction to Python

Tue September 3
Thu September 5


Introduction to Data Science.
Introduction to Python.
Basic Python Elements: Variables, Expressions, Types


Tue September 10
Thu September 12
Tue September 17

 

Python Programming Logic.
Functions, Branching, Modules
Python Lists and Strings


Thu September 19

Lecture Canceled

Quiz 1


Tue September 24
Thu September 26 

Python Strings and Dictionaries.

Loops, Exceptions

 

Tue October 1 

Midterm 1 Exam

Part 2: Python Libraries


Thu October 3
Tue October 8

Importing/Exporting data from files
Working with CSV and JSON


Thu October 10
Tue October 15
Thu October 17


Managing Numeric Data with Numpy
Storing and Analyzing Data with Pandas
Graphing Data with Matplotlib

     Quiz 2

Part 3: Data Wrangling with Python

Tue October 22
Thu October 24
Tue October 29 

Storing and Analyzing Data with Pandas
Pandas Dataframes

     Assignment 1 due

Thu October 31 

  Midterm 2 Exam

Tue November 5
Thu November 7
Tue November 12


Transforming and Querying Data
Graphing Data with Matplotlib

Visualizing Data

 



Thu November 14
Tue November 19


Data Cleaning
String Manipulations

     Assignment 2
    Quiz 3

Part 4: Introduction to Databases

Thu November 21
Tue November 26  

Relational Databases and SQL 

    Assignment 3 due (tentative)

Thanksgiving break

Tue December 3
Thu December 5 


Relational Databases and SQL (cont.)
NoSQL Data Management

    Quiz 4 (tentative)

Tue December 10 

Wrapping up. 

   
    Assignment 4 due (tentative)

Tue December 17
8-11pm 

    Final Exam

 

Course Policies

Attendance and Participation 

Students are expected to be present and participate in class but should prioritize their health and safety.

 

Disability Accommodations 

Students in need of disability accommodations should register for accommodations and consult the policies and procedures on the Office of Disability Services website: https://ods.rutgers.edu 

 

Academic Integrity Policy 

Rutgers University takes academic dishonesty very seriously. By enrolling in this course, you assume responsibility for familiarizing yourself with the Academic Integrity Policy and the possible penalties (including suspension and expulsion) for violating the policy. As per the policy, all suspected violations will be reported to the Office of Student Conduct. Academic dishonesty includes (but is not limited to):

·       Cheating 

·       Plagiarism 

·       Aiding others in committing a violation or allowing others to use your work 

·       Failure to cite sources correctly 

·       Fabrication 

·       Using another person's ideas or words without attribution, including re-using a previous assignment 

·       Unauthorized collaboration 

·       Sabotaging another student's work 

If you are ever in doubt, consult your instructor.

Please familiarize yourself with the University Academic Integrity Policy: http://nbacademicintegrity.rutgers.edu/

 

Student Support and Mental Wellness 

In the last few years, we have all been going through a lot, individually and together. It is important to acknowledge that events and circumstances outside of the classroom can impact our ability to be present and engaged at any given moment. At Rutgers, we are focused on the whole student. If, at any point, you experience anything impacting your performance or ability to participate in this class, please reach out to me. Please also see the academic, health, and mental wellness resources on the syllabus as well as others searchable at https://success.rutgers.edu/ for further support.

 

Additional support resources:

·       Student Success Essentials: https://success.rutgers.edu 

·       Student Support Services: https://www.rutgers.edu/academics/student-support 

·       The Learning Centers: https://rlc.rutgers.edu/ 

·       The Writing Centers (including Tutoring and Writing Coaching): https://writingctr.rutgers.edu 

·       Rutgers Libraries: https://www.libraries.rutgers.edu/ 

·       Office of Veteran and Military Programs and Services: https://veterans.rutgers.edu 

·       Student Health Services: http://health.rutgers.edu/ 

·       Counseling, Alcohol and Other Drug Assistance Program & Psychiatric Services (CAPS): http://health.rutgers.edu/medical-counseling-services/counseling/ 

·       Office for Violence Prevention and Victim Assistance: www.vpva.rutgers.edu/