
CS122A Spring 2020

Introduction to Data Management: The Online Edition

Course Personnel

Instructor:

Discussion TAs:

Assignment TAs:

Readers:

Meeting Times & Places

Lecture:
Time: Mon/Wed/Fri 4-4:50 PM
Place: Yuja+Piazza
Instructor: Mike

Discussion 1:
Time: Mon 5-5:50 PM
Place: Zoom+Piazza
TA: Glenn

Discussion 2:
Time: Mon 6-6:50 PM
Place: Zoom+Piazza
TA: Glenn

Discussion 3:
Time: Mon 7-7:50 PM
Place: Zoom+Piazza
TA: Glenn

Discussion 4:
Time: Wed 5-5:50 PM
Place: Zoom+Piazza
TA: Kyle

Discussion 5:
Time: Wed 6-6:50 PM
Place: Zoom+Piazza
TA: Kyle

Discussion 6:
Time: Wed 7-7:50 PM
Place: Zoom+Piazza
TA: Kyle


Course Objectives

This course provides students with an introduction to the design of databases and the use of database management systems in support of applications. It covers the entity-relationship (E-R) approach to logical database design. It then covers the relational data model, mapping of E-R designs to relations, relational database design principles, abstract query languages such as the relational algebra and relational calculus, and the industry-standard query language, SQL. It also covers indexing and physical database design. Students will gain exposure to how relational database management systems are used to manage an actual database. Time permitting, the course will also touch briefly on advanced database management topics such as semi-structured data management ("NoSQL") and transactions.

This course is aimed at database design and the use of database management systems in building database applications. It feeds into a follow-on project course, CS122B, whose focus is data-centric Web applications. The CS122A/B course sequence does NOT cover the internal workings of database systems; that material is covered in the undergraduate course CS122C (or its graduate equivalent, CS222) and the graduate-level follow-on course CS223. (The course textbook also delves further into that material for those students who are curious about what goes on under the hood.) Interested students are strongly encouraged to take one, two, or all of these courses; CS122B and CS122C/CS222 are independent, as each requires only CS122A as its database background. Also in the works as a CS122A follow-on course is CS122D, whose focus is on new post-relational data management technologies, i.e., "NoSQL" databases and Big Data management platforms.

Prerequisites

Students should ideally have some experience programming in Python, Java, C++, or C#.

Required Textbooks

Database Management Systems (3rd Edition) by Raghu Ramakrishnan and Johannes Gehrke (a.k.a. "the Cow book"). (Note: Older editions are probably okay too, as long as you do the section number remapping.)


Topic Coverage and Exam Schedule

Syllabus

Topic Reading
Databases and DB Systems Ch. 1
Entity-Relationship (E-R) Data Model Ch. 2.1-2.5, 2.8
Relational Data Model Ch. 3.1-3.2
E-R to Relational Translation Ch. 3.5
Relational Design Theory Ch. 19.1-19.6, 20.8
Midterm Exam 1 Mon, Apr 27 (during lecture time)
Relational Algebra Ch. 4.1-4.2
Relational Calculus Ch. 4.3-4.4
SQL Basics (SPJ and Nested Queries) Ch. 3.4, 5.1-5.3
SQL Analytics (Aggregation, Nulls, and Outer Joins) Ch. 5.4-5.6
Advanced SQL Goodies (Constraints, Triggers, Views, and Security) Ch. 3.3, 3.6, 5.7-5.9, 21.1-21.3, 21.7
Midterm Exam 2 Wed, May 20 (during lecture time)
Tree-Based Indexing Ch. 9.1, 8.1-8.3, 10.1-10.2
Hash-Based Indexing Ch. 10.3-10.8, 11.1
Physical DB Design Ch. 8.5, 20.1-20.7
Semistructured Data Management (a.k.a. NoSQL) AsterixDB SQL++ Primer, Couchbase SQL++ Book
Basics of Transactions Ch. 16 and Lecture Notes
Endterm Exam Fri, Jun 5 (during lecture time)

Midterm Exam 1

Time: Mon, Apr 27, 4-4:50 PM
Place: Gradescope
Midterm 1, Midterm 1 Solution

Midterm Exam 2

Time: Wed, May 20, 4-4:50 PM
Place: Gradescope
Midterm 2, Midterm 2 Solution

Endterm Exam

Time: Fri, Jun 5, 4-4:50 PM
Place: Gradescope
Endterm, Endterm Solution

Helpful Exam Resources

(Example midterm 1)
(Solution to example midterm 1)
(Example midterm 2)
(Solution to example midterm 2)
(Example endterm)
(Solution to example endterm)


Exams, Assignments, and Grading

Grading Criteria

Exams: 39% (3 x 13%)
Homework: 56% (7 x 8%)
Quizzes: 3%
Piazza: 2%

Homework and Participation

Homework assignments must be turned in by the assigned due dates/times. Details of how to turn in a given assignment will be included in each assignment's handout. The 56% of your grade attributed to homework will be based on your top 7 out of 8 homework scores. (You can spend the 8th one however you like, but if you decide to skip an assignment, do keep in mind that the material will still be on the exams and that the homework is intended to be a useful study/practice tool for those -- you will still be responsible for mastering all "skipped" material.) There will also be short weekly quizzes associated with the discussion sections to give you further practice with the course material; those will count very little towards your final grade (they will contribute 3%), and they're just there for your benefit (to give you another chance to check your understanding of things and see what questions you might want to ask the TA).
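To see how the weights and the best-7-out-of-8 rule combine, here is a minimal Python sketch. It is an illustration only, not the official grade calculation: the function name and the assumption that every score is expressed as a percentage (0-100) are ours, and it ignores any curving applied to exams.

```python
# Weights from the Grading Criteria above, as whole percentages (they sum to 100).
EXAM_WEIGHT = 39
HOMEWORK_WEIGHT = 56
QUIZ_WEIGHT = 3
PIAZZA_WEIGHT = 2


def course_percentage(exams, homeworks, quiz_pct, piazza_pct):
    """Return a raw (uncurved) course percentage in the range 0-100.

    exams      -- the three exam scores, each assumed to be out of 100
    homeworks  -- all eight homework scores out of 100 (use 0 for a skipped one)
    quiz_pct   -- overall quiz score as a percentage (assumption)
    piazza_pct -- overall Piazza score as a percentage (assumption)
    """
    if len(exams) != 3:
        raise ValueError("expected exactly three exam scores")
    if len(homeworks) != 8:
        raise ValueError("expected exactly eight homework scores")

    exam_avg = sum(exams) / 3
    # Keep only the seven highest homework scores; the lowest one is dropped.
    best_seven = sorted(homeworks, reverse=True)[:7]
    homework_avg = sum(best_seven) / 7

    weighted = (
        EXAM_WEIGHT * exam_avg
        + HOMEWORK_WEIGHT * homework_avg
        + QUIZ_WEIGHT * quiz_pct
        + PIAZZA_WEIGHT * piazza_pct
    )
    return weighted / 100


if __name__ == "__main__":
    # One skipped homework (recorded as 0) does not hurt the homework average.
    print(course_percentage([80, 90, 100], [100] * 7 + [0], 100, 100))
```

Note that skipping one homework leaves the homework average untouched, while skipping a second one would pull a zero into the best seven.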

Grade Change Policy

For all of the graded assignments as well as the two midterm exams, if you disagree with the grading, you may raise your concerns with the relevant instructor (Professor, TA, or Reader) within two weeks after they are returned. After that, all grades will be considered final. Gradescope's regrading request feature will be the place to handle grading issues. Do not wait until the end of the term to raise issues, as two weeks means two weeks, and no regrading requests will be entertained once final grades have been posted.

Collaboration Policy

Homework assignments are to be completed individually, but you are encouraged to pair up with a fellow student -- your brainstorming buddy -- for the duration of the quarter. It is okay to discuss assignments with other peers as well, e.g., to clarify details of an assignment or to compare thoughts on very rough approaches, but discussions of the details of your work are to stay within your team of two. You should pick this brainstorming partner at the start of the term and then stay with that same person for the remainder of the quarter. See http://www.ics.uci.edu/ugrad/policies/index.php#academic_honesty for a good discussion of what is/isn't considered honest collaboration. The exams are to be done solo, but will be open book, open notes, and open manual. :-) UCI's academic honesty policy applies both to person-to-person and social media interactions (more on this below).

Late Policy

Due dates will be clearly indicated on all HW assignments. Assignments will still be accepted for up to one day (24 hours) after the due date, but you will lose 10 points per day (out of the 100-point total per assignment) for lateness. We will not accept assignments in any form whatsoever after that time. "Stuff happens" sometimes, as we all know, so you should anticipate that reality and avoid working up until the very last minute. Assignments MUST be turned in per the assignment's instructions by the indicated deadline date/time in order to get full credit. (Note: The best-7-out-of-8 grading policy is another fallback plan for what would otherwise be late homework assignments; that will work once, but obviously only once.) We will release official solutions for most assignments 24 hours after their due dates/times; as a result, we have to stop accepting your solutions at that time.
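The late policy can be written out as a small Python sketch. This is our reading of the rule, not an official grading script: the function name is hypothetical, and we assume that any submission inside the 24-hour window (including one made exactly 24 hours late) loses the full 10 points, while anything later earns no credit.

```python
from datetime import datetime, timedelta

LATE_WINDOW = timedelta(hours=24)   # late submissions are accepted for one day
LATE_PENALTY_POINTS = 10            # points lost (out of 100) when late


def score_after_lateness(raw_score, due, submitted):
    """Apply the late policy to a homework score out of 100.

    Assumption: lateness is not prorated, so even a few minutes late
    costs the full 10 points.
    """
    if submitted <= due:
        return raw_score                       # on time: full credit
    if submitted - due <= LATE_WINDOW:
        return max(0, raw_score - LATE_PENALTY_POINTS)
    return 0                                   # after the window: not accepted


if __name__ == "__main__":
    due = datetime(2020, 4, 10, 23, 0)         # HW1 due time, as a naive local time
    print(score_after_lateness(85, due, datetime(2020, 4, 11, 1, 0)))
```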

Exam Timing Policy

The exam dates are provided at the start of the term. These dates are not flexible and makeup exams will not be offered, so please avoid scheduling interviews, mini-vacations, or any other activities in ways that will interfere with exam-taking. All students will be asked to take the exam on the same date in the same time window(s) in order to ensure that we can give, grade, and fairly curve the entire class on a single exam. Note that we will be avoiding Friday midterm exam dates to make this policy easier to adhere to. The exams will be online this quarter, and given using Gradescope; we may end up offering two time windows for each exam in order to accommodate students in remote timezones. (We will see what the demand is there.)

Academic Honesty Policy

Cheating is the one area where the mostly empathetic instructor for this course has zero patience or sympathy. You are here at UCI to learn, and cheating totally defeats that purpose. If you cheat in real life, on the job, you could actually bring down a company - so I do my part to avoid graduating students who would do that. All students are expected to adhere to the UCI/ICS Academic Honesty policies (see http://www.editor.uci.edu/catalogue/appx/appx.2.htm#academic and http://www.ics.uci.edu/ugrad/policies/index.php#academic_honesty to read their details). Any student found to be involved in cheating or aiding others in doing so will be academically prosecuted to the maximum extent possible: you could potentially fail this course in its entirety. (Ask around - I've done it.) Just say no to cheating!!! By the way, it is fine if you look at old CS122A exams, old assignments, old quizzes, old solutions, etc. (Such materials will help you practice and learn!) Again, your goal should be to learn - and in this course we will ask you to please be on the honor system and behave accordingly! And with the test-like way that many companies interview prospective employees these days (which I'm not a fan of), you simply won't get in the door if you don't know your stuff.


DBMS Platform

This class will use an industrial-strength relational database management system (RDBMS) for the hands-on homework assignments. In some past terms we have allowed students to choose between MySQL (an open source DBMS), DB2 (from IBM), and/or another RDBMS of their choosing. This quarter we will be suggesting that everyone use the same system, namely MySQL. (This is a good RDBMS to have on your resume.)

For info on how to set up and use MySQL, please see: [MySQL Installation Guide for Mac], [MySQL Installation Guide for Windows], [MySQL Installation Guide for Linux], [Sample SQL Script used in the guide]. You should grab the latest version (8.0) to ensure that you have all the features available that we will make use of.
For info on how to use MySQL's Command-Line Tool, please see: [MySQL Command-Line Tool Guide for Mac], [MySQL Command-Line Tool Guide for Windows], [Sample SQL Script used in the guide].
For additional info on how to use the MySQL RDBMS, you should look here: https://dev.mysql.com/doc/. You will also find that Google searches are a great source of MySQL information and issues/solutions, as there are lots and lots of MySQL users out there in the Googlesphere.

If you have a strong preference for using another database system for the course project, e.g., because you are working elsewhere on- or off-campus with a different RDBMS (e.g., PostgreSQL), you may e-mail an instructor with a request for permission to do so. If you opt to do this, you will then be totally on your own to make sure everything works right within the prescribed timeframes (i.e., DBMS difficulties will not be accepted as a lateness excuse).


CS122A Online? For Real?

Yep! The Great Online Database Adventure begins...! The good news is that the size of the Spring offering of CS122A has been spiraling out of control for several years now, so much of what we do was already largely happening online. The biggest change this time around will be that the lectures will be prerecorded (using Yuja and distributed via Canvas). They will become available for viewing at the appointed lecture time and will remain available thereafter for later viewing or viewing in other timezones. The instructor will hang out online on Piazza during the official lecture time, watching for questions that might pop up in real-time. The plan is that the three exams will be offered and taken online using Gradescope's new (beta) timed online assignment feature. The discussion sessions will be held on Zoom, but will also be recorded for later viewing in other timezones. (More details on Discussion sessions will follow once we get some sense of where around the globe folks will be joining from.) PDF lecture slides and quiz solutions will be made available rapidly after the relevant sessions for later reference as well.

For all of this to work, you will need a laptop capable of running MySQL, workable Internet access, and the ability to access (or forward) UCI e-mail, Piazza, Gradescope, and Canvas+Yuja. If this is not you, this unfortunately won't be a good quarter for you to take this class, as we won't be able to provide workarounds for those dependencies. (Sorry!) You should verify your ability to meet these requirements during the first week of class, before committing to it for sure. (Last I knew there was still a waiting list for this class, so it's important to make space for those who are indeed able to fully participate online.)

Note: This will not be like a real, well-designed, smoothly orchestrated "we meant to do this online" class. This will be us doing our best to give you the best possible CS122A educational experience during an unprecedented time when we cannot meet in person. Please be patient and set your expectations accordingly! Don't be surprised if the lecture quality is "earthy" and my self-appointed "emotional support cat" wanders through the lecture scenery at some point... (So far I am fighting a losing battle for at-home desktop real estate with this particular cat...) That being said, however, I am highly optimistic - I do think that we will indeed be able to deliver a full-quality version of CS122A this way! Time will tell...


Discussion Forums for All Things CS122A

We will be using Piazza (heavily!!!) for online class discussions. Piazza aims to get you the help you need fast and efficiently from classmates, the TAs, the Reader, and the Instructor. Rather than emailing your course or HW content questions to the teaching staff, you will be expected to post your questions on Piazza. (If you have any problems or feedback for the Piazza developers, email team@piazza.com.) We've used Piazza for this class multiple times before, and it works remarkably well. You can find our class Piazza page at http://piazza.com/uci/spring2020/cs122a/home. With over 400 students enrolled in the course, even when it is offered face-to-face, Piazza is essentially our only hope for managing the class and getting everyone's questions answered -- office hours become hopeless at this scale, unfortunately. Your Piazza activity will also contribute 2% to your overall grade -- in particular, note that misbehavior on Piazza may lead to loss of points.

When using Piazza, there are a few things to keep in mind. Firstly, despite an unfortunate national trend towards the acceptance of nastiness in social media, hurtful Piazza behavior will not be tolerated - so be kind to your classmates. Secondly, Piazza is a wonderful resource for asking and answering questions - as long as it is used thoughtfully. Please avoid re-asking questions that have already been asked and answered - you are responsible for reading others' questions and not re-asking them. With 400+ eyes on each message, lazy question-asking on Piazza is costly and inconsiderate, and such behavior may lead to loss of participation points. Finally, Piazza is not a place to discuss the details of answers to HW problems - i.e., it is not a place to post, request, or compare answers to specific assigned problems! Doing so would actually risk your violating the Academic Honesty Policy, as everyone is expected to ultimately do their own work. (Piazza is a fine place to chat about answers to old sample exams, however.)

As you are aware, there is a weekly Discussion session meeting on the books, and attendance at those sessions is highly encouraged. This encouragement will be provided via the quiz portion of your final grade (3%), as quizzes will be linked to the Discussion sessions. If for some reason you have to miss a session during one week, you can request permission (in advance!) to attend a different one - but please try to avoid that for Zoom load-balancing reasons. You will be held fully responsible for any and all material covered in your Discussion sessions, so if you do opt out or miss one, be sure to watch the Zoom recordings that we hope to make of each session. The Discussion sessions will also serve (importantly!) as "group office hours" - we will aim to leave ample time for questions most weeks, so you may want to come armed with things you'd like to have clarified or expanded on.


Homework Assignments

Due Date Topic Assignment Solution Other Notes
Fri, Apr 10 (11:00 PM PDT) E-R Modeling HW1 Assignment Template HW1 Solution
Fri, Apr 17 (11:00 PM PDT) E-R to Relational Translation HW2 Assignment Template Sample Script HW2 Solution
Fri, Apr 24 (11:00 PM PDT) Relational Design Theory HW3 Assignment Template HW3 Solution
Mon, May 4 (11:00 PM PDT) Relational Algebra HW4 Assignment RelaX Instructions Template HW4 Solution
Mon, May 11 (11:00 PM PDT) SQL (Hands-on) HW5 Assignment Load Instructions Load Script Template HW5 Solution
Mon, May 18 (11:00 PM PDT) More SQL (Hands-on) HW6 Assignment Load Script Template HW6 Solution
Wed, May 27 (11:00 PM PDT) SQL Design and Indexing HW7 Assignment Load Script Template HW7 Solution
Wed, Jun 3 (11:00 PM PDT) NoSQL (Hands-on) HW8 Assignment AsterixDB Instructions Scripts Template HW8 Solution
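
The due times above are given in PDT (UTC-7). For students joining from other timezones, a short Python sketch like the following can translate a deadline into local time. The function name is hypothetical, and the target timezone is modeled as a fixed whole-hour UTC offset, which is a simplification (it ignores local daylight-saving rules and half-hour offsets).

```python
from datetime import datetime, timedelta, timezone

# PDT is seven hours behind UTC.
PDT = timezone(timedelta(hours=-7), "PDT")


def deadline_in_offset(year, month, day, utc_offset_hours):
    """Convert an 11:00 PM PDT deadline on the given date to another UTC offset.

    utc_offset_hours is the reader's own offset from UTC (e.g. 8 for UTC+8);
    treating it as a fixed whole number of hours is an assumption.
    """
    due_pdt = datetime(year, month, day, 23, 0, tzinfo=PDT)
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return due_pdt.astimezone(local_zone)


if __name__ == "__main__":
    # HW1 is due Fri, Apr 10 at 11:00 PM PDT; show it for a reader at UTC+8.
    print(deadline_in_offset(2020, 4, 10, 8).isoformat())
```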

Discussion Section Quizzes

Week Topic Quiz Solution
Week 1 Academic Honesty Quiz 1 Quiz 1 Solution
Week 2 E-R Modeling Quiz 2 Quiz 2 Solution
Week 3 E-R to Relational Translation Quiz 3 Quiz 3 Solution
Week 4 Relational DB Design Theory Quiz 4 Quiz 4 Solution
Week 5 Relational Algebra Quiz 5 Quiz 5 Solution
Week 6 SQL & TRC Quiz 6 Quiz 6 Solution
Week 7 More SQL Quiz 7 Quiz 7 Solution
Week 8 ISAM & Indexing Quiz 8 Quiz 8 Solution
Week 9 Indexing & Physical DB design Quiz 9 Quiz 9 Solution
Week 10 Physical DB design, NoSQL, Transactions Quiz 10 Quiz 10 Solution
Last modified on Dec 7, 2020, 11:58:38 AM
