# Syllabus

This web page will serve as the syllabus for the course. Please read it carefully. You should become familiar with these policies. To do so, you will likely need to return to the syllabus several times throughout the semester. After the start of the semester, August 26th, this document may continue to be updated. Any such changes will be announced in class.

# Course Name and Number

• Main: STAT 432 - Basics of Statistical Learning
• Cross-list: ASRM 451 - Basics of Statistical Learning

For simplicity, the course staff will exclusively refer to the course as STAT 432.

# Course Objectives

After this course, students are expected to be able to …

• identify supervised (regression and classification) and unsupervised (clustering) learning problems.
• understand the fundamental theory behind statistical learning methods.
• implement learning methods using a statistical computing environment.
• formulate practical, real-world problems as statistical learning problems.
• evaluate effectiveness of learning methods when used as a tool for data analysis.

# Course Content

Tentative subjects include:

• Basics: Supervised and Unsupervised Learning, Parametric vs Non-Parametric Methods, Bias-Variance Trade-Off, Cross-Validation, Model Selection
• Regression: Linear Regression, Trees, KNN, Penalized Regression
• Classification: Logistic Regression, Trees, KNN, LDA, QDA, Naive Bayes, Bagging, Boosting, Random Forests
• Modern Methods: Regularization, Ensemble Learning
• Unsupervised: PCA, K-Means Clustering, Hierarchical Clustering, Mixture Models, EM Algorithm

# Prerequisites

A course which covers linear regression that uses R, such as STAT 420 or STAT 425. Basic knowledge of probability and linear algebra is also assumed.

# Course Communications

We will use several forms of communication for this course. The website will be the one-stop-shop for all course information. Compass will be used to send announcements via email. These will be archived on Compass.

If you would like to communicate with the course staff, our preferred methods of communication, in order, are:

1. Class
2. Office Hours
3. Piazza
4. Email

## Piazza

This course will use Piazza for course communications.

The course staff would strongly prefer the use of Piazza to GroupMe. The course staff feels that a GroupMe may exclude members of the course, whereas all are welcome on Piazza.

## Email Policy

Due to the large size of this course, we follow a strict email policy. Instead of email, consider Piazza! Quick, non-private communication should take place there. Please do not over-share code.

If you’d like to email the instructor or course staff, consider the following:

• Is your question about a “homework” exercise? First and foremost: You should ask it in class or office hours. After that, consider Piazza. As a last resort, use email.

If you choose to send an email, you must follow these three rules. If you do not, your email will be considered less important than emails that follow the rules, and the response time will be slower.

• All email must originate from an @illinois.edu email address or appear as sent on behalf of an @illinois.edu address.
    • Depending on the situation, failure to follow this rule may make a response impossible.
• Your subject line must begin with exactly the following: [STAT 432]
    • While ASRM 451 is a valid cross-listed course, please use STAT 432 for all communication purposes.
• After the above, put a single space, followed by a useful but short description of your message.

    # good

    # bad - improper format, non-descriptive subject
    [stat432] hi

    # bad - improper format, subject too long, information found in syllabus or website
    [STAT 432]when is the first CBTF exam and what is covered on the exam?
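
The subject-line rules above can be checked mechanically. The following is a minimal sketch, not a tool used by the course staff. The function name `subject_follows_rules`, the 50-character limit, and the sample "good" subject in the usage note are all assumptions made for illustration; the syllabus only asks for a "useful but short" description.

```python
import re

# Assumed encoding of the syllabus rules: the subject starts with exactly
# "[STAT 432]", then a single space, then a non-empty description.
SUBJECT_PATTERN = re.compile(r"^\[STAT 432\] (\S.*)$")

# Hypothetical limit; the syllabus says "short" without giving a number.
MAX_DESCRIPTION_LENGTH = 50


def subject_follows_rules(subject: str) -> bool:
    """Return True if the subject line matches the assumed format."""
    match = SUBJECT_PATTERN.match(subject)
    if match is None:
        # Wrong prefix, missing space, or more than one space after the prefix.
        return False
    return len(match.group(1)) <= MAX_DESCRIPTION_LENGTH
```

Under these assumptions, both "bad" subjects shown above are rejected, while a hypothetical subject such as `[STAT 432] Question about office hours` is accepted.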

If your email is sent between 9:00 AM Monday and 11:59 PM Thursday, and you follow the above directions, we will try our best to respond within 24 hours. Questions about an assessment sent the same day the assessment is due will likely not receive a response before the assessment is due. Plan accordingly.

## Code Discussion

If your question is technical in nature, there are several steps you can take to ensure a speedy response on Piazza or in email.

First and foremost, you should ask Google before you ask the course staff. Take the error message you obtained and search it with Google. The ability to solve problems this way is an extremely valuable skill, possibly one of the most important you should learn (but are not taught) during your academic career. Make a legitimate effort to solve the problem on your own. You won’t always be able to, and if you can’t, send an email. (Or better yet, stop by office hours.)

If you need to ask the course staff, include the following in your Piazza post or email:

• The offending line of code, as well as a few previous lines for context.
• The exact error message received.
• Attach the file containing the code, zipped with any external files needed to run the code. The easier you make it to recreate the error, the more likely the course staff is to help and find a solution. If we can’t recreate the error, we can’t fix it! (Only do this in emails to the instructor. Do not do this on Piazza.)

If you use RStudio Cloud as recommended, you may allow the course staff to directly view your code.

## Course Staff Emails

| Role | Name | Email |
|---|---|---|
| Instructor | David Dalpiaz | |
| Teaching Assistant | Mengchen Wang | |
| Teaching Assistant | Zihe Liu | |

# Office Hours

This course will use the Queue system for office hours.

If you would like to schedule a private meeting outside of regular office hours, please send an email suggesting two possible times, on two different days. (A total of four suggested times.) I have a preference for time-slots directly adjacent to current office hours or directly after a class. Please also indicate a brief agenda for the meeting.

# Assessments

## PrairieLearn Quizzes

Throughout the semester, there will be numerous quizzes administered through the PrairieLearn system. These will be low-stakes, nearly unlimited attempt quizzes. These quizzes will serve as practice for exams. By completing them early, you will earn buffer points. (Buffer points will allow you to obtain over 100% for a particular assignment, but your percentage on quizzes overall cannot exceed 100%.)

There will be at least 15 quizzes and at most one quiz every Monday and Wednesday. No quizzes will be dropped. Instead, there will be opportunity for buffer points with each quiz. Generally, buffer points will be obtained if you achieve a score of 100% within the first week the quiz is released.

### PrairieLearn

Quizzes and exams will both use the PrairieLearn system. Use the link below to sign up and add STAT 432.

## CBTF Exams

Exams in the course will be scheduled and administered through the Computer-Based Testing Facility. Dates of these exams can be found at the CBTF website. Exam material will be based on previous quiz material.

### Computer-Based Testing Facility

This course uses the College of Engineering Computer-Based Testing Facility (CBTF) for its quizzes and exams: https://cbtf.engr.illinois.edu.

The policies of the CBTF are the policies of this course, and academic integrity infractions related to the CBTF are infractions in this course.

If you have accommodations identified by the Division of Rehabilitation-Education Services (DRES) for exams, please take your Letter of Accommodation (LOA) to the CBTF proctors in person before you make your first exam reservation. The proctors will advise you as to whether the CBTF provides your accommodations or whether you will need to make other arrangements with your instructor.

Any problem with testing in the CBTF must be reported to CBTF staff at the time the problem occurs. If you do not inform a proctor of a problem during the test, then you forfeit all rights to redress.

## Practice Data Analyses

There will be four practice data analyses throughout the semester. These assignments will be self-graded and students will write a self-reflection. Specific policies and directions will be released with each analysis.

## Data Analyses

There will be four data analyses throughout the semester. These assignments will be staff graded. Specific policies and directions will be released with each analysis.

## Group Final Project

The final project will consist of a group data analysis. The project will consist of several assignments including but not limited to: a project proposal, a draft report, a final report, and peer review. Due dates, assignment details, and group assignments will be announced after the midpoint of the semester.

Graduate students will be required to complete a small additional project, which will likely take the form of a simulation study. Undergraduate students will receive a 100% without completing this project, but are still encouraged to give it a try!

# Course Technology

R and RStudio are required software for this course.

• R is a freely available language and environment for statistical computing and graphics.
• RStudio is a free and open-source integrated development environment for R.

To use both, we will require the use of RStudio Cloud. Register here using your @illinois.edu email address. (Failure to use your @illinois.edu account will result in your RStudio Cloud account being deleted. You will then need to re-register.)

| Assessment | Percentage |
|---|---|
| PrairieLearn Quizzes | 20 |
| CBTF Exam I | 10 |
| CBTF Exam II | 10 |
| CBTF Exam III | 20 |
| Practice Data Analyses (4) | 10 |
| Data Analyses (4) | 10 |
| Group Final Project | 15 |

Exam III may replace Exam I or Exam II if the score for Exam III exceeds the score for that exam. (Both may be replaced.) The average of Exam I and Exam II can replace half of Exam III if that average exceeds a student’s score on Exam III. That is, the Exam III score becomes

$50\% \cdot \frac{(\text{Exam I } + \text{Exam II})}{2} + 50\% \cdot (\text{Exam III})$

Two examples:

| Assessment | Before Replacement | After Replacement |
|---|---|---|
| CBTF Exam I | 100 | 100 |
| CBTF Exam II | 90 | 90 |
| CBTF Exam III | 70 | 82.5 |

| Assessment | Before Replacement | After Replacement |
|---|---|---|
| CBTF Exam I | 60 | 90 |
| CBTF Exam II | 99 | 99 |
| CBTF Exam III | 90 | 90 |
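
The replacement policy can also be written as a short function. This is a sketch of one reading of the policy, not the official grade calculation. The function name `apply_exam_replacement` is hypothetical, and the sketch assumes that every comparison uses the original ("before replacement") scores, which is consistent with the two examples above.

```python
def apply_exam_replacement(exam_1, exam_2, exam_3):
    """Return (Exam I, Exam II, Exam III) scores after replacement.

    Assumption: every comparison uses the original scores.
    """
    # Exam III replaces Exam I and/or Exam II when it is higher.
    new_1 = max(exam_1, exam_3)
    new_2 = max(exam_2, exam_3)

    # The average of Exam I and Exam II replaces half of Exam III
    # when that average is higher than the Exam III score.
    average_12 = (exam_1 + exam_2) / 2
    if average_12 > exam_3:
        new_3 = 0.5 * average_12 + 0.5 * exam_3
    else:
        new_3 = exam_3

    return new_1, new_2, new_3


print(apply_exam_replacement(100, 90, 70))  # first example
print(apply_exam_replacement(60, 99, 90))   # second example
```

The two calls reproduce the "After Replacement" columns of the two examples: 100, 90, 82.5 and 90, 99, 90.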

In addition to the required coursework, there will be two assessments early in the semester that will count as buffer points.

| Assessment | Percentage | Buffered |
|---|---|---|
| CBTF Syllabus Exam | 2 | CBTF Exam I |
| Intro Data Analysis | 5 | Data Analysis I |

These percentage points are not extra credit. They are buffer points for the percentage of the individual assignment, not the weight towards the course grade.

For example, suppose a student obtains a 90% on the CBTF Syllabus Exam, then obtains a 99% on CBTF Exam I. Before capping, the buffered score would be

$0.90 \cdot 2 + 99 = 100.8$

However, this student’s resulting grade on CBTF Exam I would be 100, because buffer points cannot be used to obtain a score greater than 100.
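
The same arithmetic can be sketched in code. The function name `apply_buffer` and its parameter names are hypothetical. The sketch assumes that the buffer is added to the target assessment's percentage and that the result is capped at 100, as in the example above.

```python
def apply_buffer(score, buffer_score, buffer_points, cap=100.0):
    """Add buffer points to an assessment score, capped at `cap`.

    score         -- percentage on the buffered assessment (e.g. CBTF Exam I)
    buffer_score  -- percentage on the buffer assessment (e.g. Syllabus Exam)
    buffer_points -- points the buffer assessment is worth (e.g. 2)
    """
    raw = score + (buffer_score / 100) * buffer_points
    # Buffer points cannot push a score above the cap.
    return min(raw, cap)


print(apply_buffer(99, 90, 2))  # raw value of about 100.8 is capped at 100
print(apply_buffer(80, 90, 2))  # about 81.8, below the cap
```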

## Learning Management

Compass2g will be used to distribute grades and for assignment submission.

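| Grade | A+ | A | A- | B+ | B | B- | C+ | C | C- | D+ | D | D- |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cutoff | TBD | 93% | 90% | 87% | 83% | 80% | 77% | 73% | 70% | 67% | 63% | 60% |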
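
As an illustration only, the cutoffs above can be turned into a lookup. This sketch makes several assumptions that the syllabus does not state: cutoffs are treated as inclusive lower bounds, any score below 60% maps to F, and the A+ grade (whose cutoff is TBD) is not modeled. The names `CUTOFFS` and `letter_grade` are hypothetical.

```python
# Cutoffs from the table above, highest first. A+ is omitted (cutoff TBD).
CUTOFFS = [
    ("A", 93), ("A-", 90),
    ("B+", 87), ("B", 83), ("B-", 80),
    ("C+", 77), ("C", 73), ("C-", 70),
    ("D+", 67), ("D", 63), ("D-", 60),
]


def letter_grade(percentage):
    """Map a course percentage to a letter grade.

    Assumptions: cutoffs are inclusive, and anything below 60 is an F.
    """
    for letter, cutoff in CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "F"


print(letter_grade(91.2))
```

Because the instructor may lower the cutoffs, the actual letter grade for a given percentage can be higher than what this sketch returns.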

The instructor reserves the right to lower, but not raise, grade cutoffs. (However, this policy should not create an expectation that this will happen. Asking for a change in cutoffs will make any change in cutoffs less likely.) The grade of A+ will be reserved for the top three students in each section.

Grading in the course is not competitive. There is nothing (other than some statistical realities) that would prevent the entire class from receiving a grade of A.

If you feel an assessment was graded incorrectly, you have one week from the date you received a grade to discuss it with the instructor. After one week, grading is final except for exceptional circumstances. You may not simply ask for a re-grade, but instead must justify to the instructor why the grading was done incorrectly. By disputing any grading, you agree to allow the instructor to review the entire assessment in question for other errors missed during grading. Requests should be sent via email.

# Attendance

You are expected to attend all lectures. Failure to do so may not have a direct effect on your course grade, but will likely have a significant indirect effect. Any known or potential extracurricular conflicts should be discussed in person with the instructor during the first week of classes, or as soon as they arise.

The official University of Illinois policy related to academic integrity can be found in Article 1, Part 4 of the Student Code. Section 1-402 in particular outlines behavior which is considered an infraction of academic integrity. These sections of the Student Code will be upheld in this course. Any violations will be dealt with in a swift, fair, and strict manner. In short, do not cheat; it is not worth the risk. You are more likely to get caught than you believe. If you think you may be operating in a grey area, you most likely are.

Policies about specific assessment types will be released with directions for those assessments. Two heuristics to keep in mind:

• Do not share files with other students. Do not copy-paste from any source other than the course textbook and website.
• Use spoken language to exchange ideas, not code.

Under no circumstances should course materials be provided to Course Hero or any similar for-profit website. The course staff will seek the harshest possible academic integrity penalty for any students who do so.

# Disability Accommodations

To obtain disability-related academic adjustments or auxiliary aids, students with disabilities must contact the course instructor and the Disability Resources and Educational Services (DRES) as soon as possible. To contact DRES, you may visit 1207 S. Oak St., Champaign, call 217-333-4603, e-mail disability@illinois.edu or go to the DRES website.

To ensure appropriate accommodation is provided in a timely manner, please provide your Letter of Accommodation during the first week of class. Letters received after a relevant assessment has been administered will likely cause logistical issues that could result in an inability to accommodate.

# Changes

The instructor reserves the right to make any changes he considers academically advisable. Such changes, if any, will be announced in class. Please note that it is your responsibility to attend the class and keep track of the proceedings.

# Pardon Our Dust

This course is under active development. Most of the course materials are being written or re-written this semester. If you encounter an error or typo, please email the instructor as soon as possible. No error is too small to report!

Additionally, you may find “TODO” scattered throughout the notes, book, and website. If you encounter a “TODO” that you would like expanded, please let the instructor know as soon as possible.

# Diversity Statement

The University of Illinois is committed to equal opportunity for all persons, regardless of race, ethnicity, religion, sex, gender identity or expression, creed, age, ancestry, national origin, handicap, sexual orientation, political affiliation, marital status, developmental disability, or arrest or conviction record. We value diversity in all of its definitions, including who we are, how we think, and what we do. We cultivate an accessible, inclusive, and equitable culture where everyone can pursue their passions and reach their potential in an intellectually stimulating and respectful environment. We will continue to create an inclusive campus culture where different perspectives are respected and individuals feel valued.

# The Extended Syllabus

For some thoughts on teaching philosophy, some explanation of policies, and some general tips for success, please see The Extended Syllabus.
