Stat 133: Concepts in Computing with Data is an introductory course to computational data analysis with an emphasis on five major cornerstones:
- 🔧 Data Manipulation (wrangling, reshaping, tidying)
- 📊 Data Visualization (focus on statistical charts)
- 💻 Programming Concepts (with emphasis on data analysis)
- 🚀 Data Technologies (various sources/formats of data)
- 📑 Reporting Tools (via dynamic documents)
Because Stat 133 is one of the core courses of the Statistics Department, the underlying goal is to provide foundations for "computing with data" so that statistics-major students have the basic computational skills for subsequent upper division courses (e.g. Stat 150, 151A, 152, 153, 154, 155, 157, 158, 159). This involves teaching students how to:
- understand common data formats,
- use the computer extensively to conduct statistical analysis of data,
- understand how to visualize data and display statistical information,
- learn the basic principles for writing code,
- organize your workflow,
- focus on aspects of computing to conduct data analysis, NOT the computational aspects of statistical methods,
- use computational tools to carry out the data analysis cycle.
This course does not have any prerequisites, although it would be nice if you have taken an introductory course in statistics (e.g. Stat 2, 20, 21, 131A).
The curriculum and format are designed specifically for students (ideally majoring in Statistics) who have NOT taken computer science courses. You don't need previous programming experience, and you also don't need previous data analysis experience. However, students with some exposure to programming concepts and data analysis tend to understand certain concepts better.
Students with some prior experience in either computational statistics or computing are welcome to enroll, though some parts of the course will feel extremely slow. Students who have taken computer science courses (e.g. CS C8, CS C100, CS 9H, CS 61a, CS 61b) should instead take a more advanced course.
We expect that, at the end of the course, you will understand the general principles of data analysis projects, and the three stages of the Data Analysis Cycle (DAC). This means that you can take a data set (in some common format), clean it, tidy it, get visualizations, write code, and report the results in a variety of formats.
We don't expect you to become a jedi data scientist, an R ninja, or a super coder. That takes YEARS of practice, training, learning, and collaborating. Instead, we want you to become a skilled padawan analyst (who, if interested, will be prepared to take the next steps of a data science marathon).
We will be using a combination of materials such as slides, tutorials, computer labs, reading assignments, open books, and chalk-and-talk.
The main computational tool will be the computing and programming environment R. The main workbench will be the IDE RStudio. You will also use a command line interface to interact with your operating system. Likewise, you will use Git to version control some assignments, and submit them to GitHub.
The rest of this document details the policies that will be enforced in the Spring 2019 offering of this course. These policies are subject to change until the beginning of the semester and throughout the remainder of the course, at the judgement of the course staff.
- Lecture is meant to discuss concepts and fundamentals of computing with data.
- Attending lecture is part of your grade.
- We will track attendance with short questions asked during lecture (using Google Forms).
- Due to the ever present logistical issues at the beginning of every semester, we will start tracking attendance on Feb-4 (i.e. attendance will be tracked for about 35 days of lecture).
- To receive full credit, you need to be present and answer at least 75% of asked questions throughout the semester. Otherwise, you won't receive credit for participation (i.e. all or nothing).
- Weekly labs are a required part of the course and they are meant to supplement lecture. We strongly encourage you to attend lab.
- You must attend the discussion group you are officially registered in.
- Do not take the class if you cannot attend the discussion you are registered in.
- Lab assignments will be released every week, and their solutions will be posted on bCourses after submission deadlines.
- Due to the ever present logistical issues at the beginning of every semester, you will have till Feb-17 to submit the first three labs to bCourses.
- You will be encouraged to problem solve individually or in groups.
- Each person must submit each lab independently to bCourses, but you are welcome to collaborate with other students in your lab room.
- Late submissions of labs will not be accepted under any circumstances. Please see the policy about Special Accommodations (below) if you have relevant DSP accommodations.
- In lieu of offering exceptions or extensions, your lowest lab score will be dropped in the calculation of your overall grade.
- If you finish the lab early, we encourage you to help others with their lab.
- You can get credit for each lab in one of two ways described below:
- Attend your lab section, make progress substantial enough for your work to be checked off by course staff, and submit your lab (even if it is incomplete) by the end of the lab period.
- Complete the lab on your own and submit the completed lab by Thursday morning at 8:59am. If you choose this route, you must finish the entire lab. This option is not encouraged by the course staff, and is only recommended if you are sure that you will not be able to make lab a certain week.
- There are two kinds of assignments:
- warmup assignments
- workout assignments
- Homework assignments are NOT eligible for regrades.
- The homework assignments will get substantially more difficult as we progress with the course.
- Please plan ahead and pace yourself. Don't wait until the last day to do an assignment. Don't wait until the last minute to submit your assignments.
- If you collaborate with other students when working on a HW assignment, please include the names of those students in your submission.
- You must write your own answers (using your own words). Copying and plagiarism will not be tolerated (see Academic Honesty policy).
- One type of assignment consists of so-called "warmup" assignments.
- Roughly speaking, a warmup assignment is a relatively simple piece of work.
- They should allow you to acquire the basic skills that you will later apply on the so-called workout assignments.
- Due to the ever present logistical issues at the beginning of every semester, you will have till Feb-17 to submit the first three warmups to bCourses.
- After Feb-17, no late warmup assignments will be accepted under any circumstances. Please see the policy about Special Accommodations (below) if you have relevant DSP accommodations.
- Warmup assignments will be graded on correctness typically by selecting some problems (e.g. most challenging parts of the assignment, specific details or instructions).
- The applicable grading scheme for each assignment will be announced on bCourses.
- The second type of assignment is what we call "workout" assignments.
- You will submit your workout assignments to your private GitHub Classroom repository.
- You should also submit the GitHub link to your assignment to bCourses.
- Workout assignments will be accepted up to 2 days (48 hours) late:
- a workout project submitted to GitHub less than 24 hours after the deadline will receive a 1/3 deduction,
- a workout project submitted to GitHub between 24 and 48 hours after the deadline will receive a 2/3 deduction,
- a workout project submitted to GitHub 48 hours or more after the deadline will receive no credit.
- Please see the policy about Special Accommodations (below) if you have relevant DSP accommodations.
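The late-submission rule for workout assignments can be written out as a small calculation. The sketch below is only an illustration: the function names `late_deduction` and `workout_credit` are hypothetical, and it assumes that the deduction is taken as a fraction of the score you earned and that a submission exactly 24 hours late falls in the 2/3 tier. The course staff's actual bookkeeping may differ.

```python
from fractions import Fraction


def late_deduction(hours_late: float) -> Fraction:
    """Fraction of the score deducted for a workout submitted late.

    Assumed reading of the policy: on time -> no deduction,
    under 24 hours -> 1/3, from 24 up to (not including) 48 hours -> 2/3,
    48 hours or more -> no credit.
    """
    if hours_late <= 0:
        return Fraction(0)
    if hours_late < 24:
        return Fraction(1, 3)
    if hours_late < 48:
        return Fraction(2, 3)
    return Fraction(1)


def workout_credit(raw_score: float, hours_late: float) -> float:
    """Score kept after the late deduction (assumption: deduction scales the earned score)."""
    kept = Fraction(raw_score) * (1 - late_deduction(hours_late))
    return float(kept)
```

For example, under these assumptions `workout_credit(90, 30)` keeps one third of a 90-point score, because 30 hours late falls in the 2/3 deduction tier.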
- There will be one 50-minute in-class midterm, and one 3-hour final exam.
- The midterm exam will be held on Friday, March 8th, during class.
- The final exam will be held on Wednesday, May 15th, at 7-10pm (as scheduled by the university). Rooms will be announced closer to the date.
- Unless you have accommodations as determined by the university and approved by the instructor, you must take the midterm and the final at the dates and times provided here. Please check your course schedule and make sure that you have no conflicts with these exams. If you have a conflict with either exam, please contact the Instructor before the end of the second week of classes (before Feb-01). Otherwise, do not take the class if you are not available at these dates and times.
- We will use Gradescope to grade the tests.
- You will have three days after grades are published on Gradescope to request a regrade for the midterm. The final test is NOT eligible for regrades.
- Please only ask for regrades if you see an error in the grading rather than a dispute with the rubric.
- During the regrading process you can lose points, even for questions you did not ask to be regraded.
- Regrades are based on what you actually wrote, not on what you were thinking, or what you meant to say, or what you assumed or failed to assume.
- After the regrade deadline, no requests will be considered, even if there was an error in the grading.
- I reserve the right to have a meeting with you and evaluate your understanding of the material in person (oral examination, whiteboard coding, and live coding).
- Testing in person can happen at random (most of the time), but it can also occur based on special circumstances (e.g. grad students, strong programming background, alleged cheating/plagiarism, unusually low or high performance).
- Oral evaluations may be comprehensive.
- The purpose of this evaluation is for me to have a more direct way of assessing your entire work (e.g. knowledge, skills, thinking process, problem solving).
Grades will be assigned using the following weighted components:
Concept | Weight |
---|---|
Class participation | 5% |
Lab work | 10% |
Warmup assignments | 20% |
Workout assignments | 30% |
Midterm | 10% |
Final | 25% |
- No individual letter grades will be given for the midterm or the final.
- You will get a letter grade for the course that is based on your overall score.
- The course will not be curved, but details of grading criteria will not be announced in advance.
- Letter grades are final; I don't enter into negotiations with students about grades.
- Please do not embarrass yourself and me by begging for extra credit or late submissions after final grades have been awarded.
- Also, please remember that I grade your performance, not your personal worth.
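To make the weighting concrete, here is a minimal sketch of how an overall score could be computed from the table above, together with the all-or-nothing participation rule and the dropped lowest lab. The names `WEIGHTS`, `participation_score`, `lab_score`, and `overall_score` are hypothetical, and the sketch assumes that every component is expressed on a 0-100 scale and that labs count equally. The mapping from overall score to letter grade is not announced, so it is not modeled here.

```python
# Weights in percent, copied from the grading table; they sum to 100.
WEIGHTS = {
    "participation": 5,
    "labs": 10,
    "warmups": 20,
    "workouts": 30,
    "midterm": 10,
    "final": 25,
}


def participation_score(answered: int, asked: int) -> float:
    """All-or-nothing credit: 100 if at least 75% of questions were answered, else 0."""
    if asked == 0:
        return 0.0  # assumption: no questions asked means nothing to credit
    return 100.0 if answered * 4 >= asked * 3 else 0.0


def lab_score(scores: list[float]) -> float:
    """Average of lab scores after dropping the single lowest one."""
    if len(scores) < 2:
        raise ValueError("need at least two lab scores to drop the lowest")
    kept = sorted(scores)[1:]
    return sum(kept) / len(kept)


def overall_score(components: dict[str, float]) -> float:
    """Weighted overall score on a 0-100 scale."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS) / 100
```

Under these assumptions, a student with perfect scores everywhere except participation (fewer than 75% of questions answered) would end with an overall score of 95, since participation carries 5% of the grade.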
With the obvious exception of exams, we encourage you to discuss all of the course activities with your friends and classmates as you are working on them. You will definitely learn more in this class if you work with others than if you do not. Ask questions, answer questions, and share ideas liberally.
Cooperation has a limit, however. You should not share your code or answers directly with other students. Doing so doesn't help them; it just sets them up for trouble on exams. Feel free to discuss the problems with others beforehand, but not the solutions. Please complete your own work and keep it to yourself. If you suspect other people may be plagiarizing you, let us know ASAP. For more information please read the Honor Code Guide for Syllabi.
We expect you to do your own work and to uphold the standards of intellectual integrity. Collaborating on homework is fine and I encourage you to work together---but copying is not, nor is having somebody else submit assignments for you. Cheating will not be tolerated. Anyone found cheating will receive an F and will be reported to the Center for Student Conduct. If you are having trouble with an assignment or studying for an exam, or if you are uncertain about permissible and impermissible conduct or collaboration, please come see me with your questions.
Rather than copying someone else's work, ask for help. You are not alone in this course! The course staff is here to help you succeed. If you invest the time to learn the material and complete the projects, you won't need to copy any answers.
- You should try to use email as a tool to set up a one-on-one meeting with me if office hours conflict with your schedule.
- Use the subject line Stat 133 Meeting Request.
- Your message should include at least two times when you would like to meet and a brief (one- or two-sentence) description of the reason for the meeting.
- Do NOT expect me to reply right away (I may not reply on time).
- If you have an emergency, talk to me later during class or office hours.
- I strongly encourage you to ask questions about the syllabus, covered material, and assignments during class time or lab discussions.
- I prefer to have conversations in person rather than via email, thus allowing us to get to know each other better and fostering a more collegial learning atmosphere.
- In case of cheating/plagiarism suspicion, I do not discuss things by email or bCourses.
Students needing accommodations for any physical, psychological, or learning disability should speak with me during the first two weeks of the semester, either after class or during office hours, and see http://dsp.berkeley.edu to learn about Berkeley’s policy. If you are a DSP student, please contact me at least three weeks prior to a midterm or final so that we can work out acceptable accommodations via the DSP Office.
For relevant DSP accommodations that provide occasional extensions on assignments, we may provide a two-day extension as long as you contact us before the assignment is due. More details about these considerations may be discussed with the DSP staff.
If you are an athlete or Cal band member, please check your calendar. Do not take the class if you are not available to take the midterm, final, and/or attend lab discussions. I won't be able to provide accommodations for an early or late exam. Since this is a hands-on programming-based class, I don't allow coaching staff proctoring.
Under emergency/special circumstances, students may petition me to receive an Incomplete grade. By University policy, an Incomplete requires (i) that the student was performing passing-level work until the time that (ii) something happened that---through no fault of the student---prevented the student from completing the coursework. If you take the final, you completed the course, even if you took it while ill, exhausted, mourning, etc. The time to talk to me about incomplete grades is BEFORE you take the final (several weeks before), when the situation that prevents you from finishing the course presents itself. Please clearly state your reasoning in your comments to me.
It is your responsibility to develop good time management skills and good studying habits, to know your limits, and to learn to ask for professional help. Life happens. Social, family, cultural, academic, and individual circumstances can affect your performance (both positively and negatively). If you find yourself in a situation that raises concerns about passing the course, please come see me as soon as possible.
Above all, please do not wait till the end of the semester to share your concerns about passing the course because it will be too late by then.
Whenever a faculty member, staff member, post-doc, or GSI is responsible for the supervision of a student, a personal relationship between them of a romantic or sexual nature, even if consensual, is against university policy. Any such relationship jeopardizes the integrity of the educational process.
Although faculty and staff can act as excellent resources for students, you should be aware that they are required to report any violations of this campus policy. If you wish to have a confidential discussion on matters related to this policy, you may contact the Confidential Care Advocates on campus for support related to counseling or sensitive issues. Appointments can be made by calling (510) 642-1988.
The classroom, lab, and workplace should be safe and inclusive environments for everyone. The Office for the Prevention of Harassment and Discrimination (OPHD) is responsible for ensuring the University provides an environment for faculty, staff and students that is free from discrimination and harassment on the basis of categories including race, color, national origin, age, sex, gender, gender identity, and sexual orientation. Questions or concerns? Call (510) 643-7985, email [email protected], or go to http://survivorsupport.berkeley.edu/.
The main goal of Stat 133 is that you should learn, and have a fantastic experience doing so. Please keep that goal in mind throughout the semester.