Instructor: Todd Kuffner
Lecture: MWF 9:00-9:50am
Course Description: Mathematical and statistical foundations
of data science. Core topics include: High-dimensional
probability; concentration of measure; matrix concentration
inequalities; essentials of random matrix theory; linear
dimension reduction. Other topics will be chosen by the
instructor.
List of Topics (tentative):
- Review of essential distribution theory, classical
inequalities and limit theorems for random variables
- Concentration inequalities for sums of random vectors
- Sub-gaussian, sub-exponential, and sub-Weibull
distributions
- High-dimensional random vectors: covariance matrices,
isotropic random vectors
- Random matrices: bounds and applications
- Concentration of measure on spheres
- Isometric embeddings, random projections, and the
Johnson-Lindenstrauss lemma
- Statistical estimation in high dimensions: oracle
inequalities
- Sparse recovery and harmonic analysis
Prerequisite:
Multivariable calculus (Math 233), linear or matrix algebra
(Math 429 or 309), and multivariable-calculus-based probability
and mathematical statistics (Math 493-494). Prior familiarity
with analysis, topology, and geometry is strongly recommended. A
willingness to learn new mathematics as needed is essential. In
particular, we will make heavy use of concepts related to vector
spaces and function spaces.
Textbook:
There is no required textbook for the course, though for some
parts of the course we will closely follow Roman Vershynin's
wonderful 2019 book, High-Dimensional Probability, published by
Cambridge University
Press. The lectures and that book are the primary references for
the course, but some reference books as well as freely-available
references may also be suggested. Details will be posted on
Canvas.
Important Dates and Course Schedule: Details will be posted on
Canvas. I may update the table below later in the semester to
record what was covered, for future reference.
Jan. 13 | First day of classes
Jan. 20 | No class (Martin Luther King Holiday)
Jan. 23 | Last day to drop/add
March 9-13 | No classes (Spring Break)
April 24 | Last day of classes
Course Policies and Grades
Canvas:
During the semester, all course-related materials and
announcements will be posted to Canvas and/or sent by email to
registered students.
Grades:
Homework 25%, Midterm 20%, Final Exam 25%, Participation 10%,
Group Project & Presentation 20%
Homework: Roughly 1 homework for every 4-5 lectures.
You may discuss problems with other students, but the solutions
you submit must be entirely your own work. Explanations
detailing the steps of proofs or other mathematical arguments
are required for full credit. You are encouraged, but not
required, to write your solutions in TeX/LaTeX, and submit the
printed version. I will drop the lowest homework grade only if
you have submitted every homework assignment and genuinely
attempted every problem; otherwise, no homework grade will be
dropped.
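As an illustration of the drop-lowest rule above, here is a minimal Python sketch. It rests on assumptions that the syllabus does not state: each homework is scored on a 0-100 scale, all assignments carry equal weight, and the function name `homework_average` and the flag `all_attempted` are hypothetical.

```python
def homework_average(scores, all_attempted):
    """Return the homework average on a 0-100 scale.

    scores: one score per assignment (assumed equally weighted, 0-100).
    all_attempted: True only if every assignment was submitted and every
        problem was genuinely attempted (the condition for the drop).
    """
    if not scores:
        raise ValueError("need at least one homework score")
    kept = sorted(scores)
    # Drop the single lowest score only when the condition is met
    # and there is more than one score to average.
    if all_attempted and len(kept) > 1:
        kept = kept[1:]
    return sum(kept) / len(kept)


# Same scores, with and without the drop.
with_drop = homework_average([80, 90, 100], all_attempted=True)
without_drop = homework_average([80, 90, 100], all_attempted=False)
```

With these scores the drop raises the average from 90 to 95; how homework is actually aggregated is up to the instructor.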
Exams: There will be a take-home midterm exam and
a take-home final exam.
Participation:
Attendance and participation are required for all lectures.
Attendance is not enough. Participation includes: (i) answering
questions that I ask the class; (ii) providing a summary,
definition, or result from the previous lecture when I ask you
to.
Group Project & Presentation: Groups will be
assigned. Each group will be given a project, which may include
reading a paper on a topic not covered during lectures, or doing
a literature search on an open problem in the mathematical
foundations of data science. The group must submit a 5-10 page
report, written in LaTeX, and prepare a 25-minute presentation
for the rest of the class using slides (made with the Beamer
document class in LaTeX). The speaking roles in the presentation
must be shared equally among all members of the group. The final
report and presentation will be due during the final two weeks
of classes.
Final Course Grade: The letter grades for the course will be
determined according to the following numerical grades on a
0-100 scale.
A+ | impress me | B+ | [87, 90) | C+ | [77, 80) | D+ | [67, 70) | F | [0, 60)
A | 93+ | B | [83, 87) | C | [73, 77) | D | [63, 67) | |
A- | [90, 93) | B- | [80, 83) | C- | [70, 73) | D- | [60, 63) | |
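To make the grading arithmetic concrete, the following Python sketch combines the component weights listed under Grades with the letter-grade cutoffs in the table above. It is only an illustration: it assumes each component is scored on a 0-100 scale, the names `WEIGHTS_PCT`, `CUTOFFS`, `course_score`, and `letter_grade` are hypothetical, and the A+ grade is left out because it is awarded at the instructor's discretion rather than by a numerical cutoff.

```python
# Component weights in percent, as listed under "Grades" (they sum to 100).
WEIGHTS_PCT = {
    "homework": 25,
    "midterm": 20,
    "final": 25,
    "participation": 10,
    "project": 20,
}

# Lower endpoint of each interval in the grade table, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def course_score(components):
    """Weighted course score on a 0-100 scale (components assumed 0-100)."""
    return sum(WEIGHTS_PCT[name] * components[name] for name in WEIGHTS_PCT) / 100


def letter_grade(score):
    """Map a numerical score to a letter; intervals are closed on the left."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


example = {"homework": 92, "midterm": 85, "final": 88,
           "participation": 100, "project": 90}
score = course_score(example)   # (2300 + 1700 + 2200 + 1000 + 1800) / 100 = 90.0
grade = letter_grade(score)     # 90.0 falls in [90, 93), so "A-"
```

Because the intervals are half-open, a score of exactly 90 earns an A- while 89.9 earns a B+; the sketch applies no rounding, which is itself an assumption.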
Other Course Policies: Students are encouraged to look at the
Faculty of Arts & Sciences policies.
- Academic integrity:
Students are expected to adhere to the University's policy
on academic integrity.
- Auditing:
There is an option to audit, but this still involves enrolling
in the course. See the Faculty of Arts & Sciences policy on
auditing. Auditing students will still be expected to
attend all lectures and complete all required coursework and
exams. A course grade of 75 is required for a successful
audit.
- Collaboration:
Students are encouraged to discuss homework with one
another, but each student must submit separate solutions,
and these must be the original work of the student.
- Exam conflicts:
Read the University policy.
The exam dates for this course are posted before the
semester begins, and thus you are expected to be present at
all exams.
- Late homework:
Only by prior arrangement. If a valid reason for an
exception is not presented at least 36 hours before a
homework due date, then it will not be accepted late (a zero
will be given for that assignment).
- Missed exams:
There are no make-up exams. For valid excused absences from
midterm exams - such as medical, family, transportation, and
weather-related emergencies - the contribution of that
midterm to the final course grade will be redistributed
equally to the other midterm exam and final exam. Students
missing both midterm exams and/or the final exam cannot earn
a passing grade for the course.