Math 475: Statistical Computation
Fall 2016

Instructor: Todd Kuffner (kuffner@math.wustl.edu)

Lecture: 4:00-5:30pm, Monday/Wednesday, Simon Hall, Room 018

Office Hours: Monday 8:00-9:00am, Tuesday 4:00-6:00pm; Cupples I Room 18 (basement).

Course Overview: This course provides students with an introduction to the foundations of modern computational statistics.  Students will learn the basics of numerical analysis, random number generation, and computational tools for statistical inference, specifically Monte Carlo methods and the bootstrap. Students will be introduced to SAS during the first part of the course. Thereafter, students are welcome to use R or SAS (or both) for the relevant parts of homework assignments.

Prerequisite: It is assumed that students have taken a first course in multivariate-calculus-based probability (including central limit theorems, laws of large numbers, and transformations of variables), a first course in linear or matrix algebra, and a course in statistics (including the principles of statistical inference and common estimation methods such as maximum likelihood), and that they have some familiarity with programming in either R or SAS.

Textbooks: Two books are required, The Little SAS Book (LSB) and Computational Statistics (CS); see the reading abbreviations in the course schedule below. You may have electronic access through Washington University; log in to My Catalog on the library website and search for these books. Recommended readings for each lecture will consist of sections from these books.
Software: Washington University Arts & Sciences provides SAS through Remote Desktop (instructions here). SAS is also available in some computer labs on campus (ask Arts & Sciences Computing). Alternatively, students may obtain SAS University Edition. Later in the course, students may choose to use R, which is free and open source.

Homework: There will be regular homework assignments. For the first part of the course, you may find example code and data from The Little SAS Book here: http://support.sas.com/publishing/authors/delwiche.html . Homework will be graded, but solutions will not be provided to students.

Homework grader: Yiqian Fang (yiqianfang@wustl.edu)

Blackboard: During the semester, homework assignments, homework and midterm exam grades, and other course-related announcements will be posted to Blackboard or sent by email through Blackboard.

Attendance: Attendance is required at all lectures. Students who miss a lecture are responsible for any assignments and announcements made during that lecture.

Grades: 15% Homework, 20% Midterm 1, 20% Midterm 2, 45% Final

Exams: 2 in-class midterms and 1 final. The dates of the exams should not be considered fixed until the first day of class. What appears on Course Listings may be incorrect.

Homework: There will be weekly homework assignments. The lowest homework grade will be dropped. If you added the class late and missed the first homework, then that is the homework that will be dropped.

Final Course Grade: The letter grades for the course will be determined according to the following numerical grades on a 0-100 scale.
A+   impress me (very rare)
A    93+
A-   [90, 93)
B+   [87, 90)
B    [83, 87)
B-   [80, 83)
C+   [77, 80)
C    [73, 77)
C-   [70, 73)
D+   [67, 70)
D    [63, 67)
D-   [60, 63)
F    [0, 60)

Other Course Policies: Students are encouraged to look at the Faculty of Arts & Sciences policies.
Course Schedule: tentative; will be updated after each lecture to reflect what was actually covered. Abbreviations: LSB = The Little SAS Book, CS = Computational Statistics.

Lecture 1: Overview
Roles of estimation, simulation, and optimization in statistical inference
Reading : CS 1.1-1.8 (review of prerequisite material); LSB 1.1-1.13
HW1 assigned
Lecture 2: Computer Storage and Arithmetic
Fixed-point and floating-point number systems; errors
Reading: CS 2.1-2.3; LSB 2.1-2.21
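A quick R illustration of floating-point representation error, as a sketch of the Lecture 2 material (the example values are illustrative):
  # 0.1 and 0.2 have no exact binary representation, so the sum is slightly off
  0.1 + 0.2 == 0.3                     # FALSE
  abs((0.1 + 0.2) - 0.3)               # about 5.6e-17
  .Machine$double.eps                  # machine epsilon, about 2.2e-16
  isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE: compare with a tolerance instead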
Lecture 3: Algorithms and Programming I
Numerical errors; algorithms and data
Reading: CS 3.1-3.2; LSB 3.1-3.12
Lecture 4: Algorithms and Programming II
Efficiency
Reading: CS 3.3; LSB 4.1-4.24
HW3 assigned
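A small R sketch of the efficiency theme: the same sum computed with an explicit loop and with a vectorized call (timings are illustrative and machine-dependent):
  x <- runif(1e6)
  system.time({ s <- 0; for (xi in x) s <- s + xi })   # explicit loop
  system.time(sum(x))                                  # vectorized, typically much faster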
Lecture 5: Algorithms and Programming III
Iterations and convergence; programming; computational feasibility
Reading: CS 3.4-3.6; LSB 4.1-4.24
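A minimal sketch of an iterative algorithm with a convergence tolerance (Newton's method for a square root; the function name and tolerances are illustrative):
  newton_sqrt <- function(a, x0 = a, tol = 1e-12, maxit = 100) {
    x <- x0
    for (i in 1:maxit) {
      x_new <- (x + a / x) / 2               # Newton update for f(x) = x^2 - a
      if (abs(x_new - x) < tol) return(x_new)
      x <- x_new
    }
    x
  }
  newton_sqrt(2)                             # 1.414214..., agrees with sqrt(2)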
Lecture 6: Function Approximation
Function approximation and smoothing; basis sets in function spaces
Reading: CS 4.1-4.2 (see Ch. 10 for perspective)
Lecture 7: Vector Spaces Review
Lecture 8: Function Approximation II
Review of Taylor series expansions for multivariable functions; inner products on function spaces; orthogonal polynomials; applications of orthogonal polynomials (refinements of the classical univariate central limit theorem via Edgeworth expansions in orthogonal Hermite polynomials)
Reading: CS 4.3-4.4 (see Ch. 10 for perspective)
HW3 due
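A short numerical check in R of the orthogonality of the (probabilists') Hermite polynomials under the standard normal weight, as a sketch of the Lecture 8 material:
  # He_1(x) = x and He_2(x) = x^2 - 1 are orthogonal with respect to dnorm
  integrate(function(x) x * (x^2 - 1) * dnorm(x), -Inf, Inf)$value      # ~ 0
  # the squared norm of He_2 is 2! = 2
  integrate(function(x) (x^2 - 1)^2 * dnorm(x), -Inf, Inf)$value        # ~ 2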
Lecture 9: Function Approximation III
Splines
Reading: CS 4.4
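A minimal R cubic-spline interpolation sketch for Lecture 9 (the target function sin(x) is illustrative):
  x0 <- seq(0, 2 * pi, length.out = 10)
  f  <- splinefun(x0, sin(x0))               # cubic spline interpolant through 10 points
  xg <- seq(0, 2 * pi, by = 0.01)
  max(abs(f(xg) - sin(xg)))                  # small maximum interpolation error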
Lecture 10: Review for first Midterm
Unconstrained descent methods in dense domains; unconstrained combinatorial and stochastic optimization
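A minimal gradient-descent sketch in R for an unconstrained problem (the objective and step size are illustrative):
  f  <- function(p) sum((p - c(1, 2))^2)     # minimized at (1, 2)
  gr <- function(p) 2 * (p - c(1, 2))
  p  <- c(0, 0)
  for (i in 1:200) p <- p - 0.1 * gr(p)      # fixed step size
  p                                          # close to (1, 2)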
Lecture 11: Kernel methods
Reading: CS 4.5 (see Ch. 10 for perspective); LSB 5.1-5.13
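A short R sketch of kernel density estimation for Lecture 11 (the data and bandwidth choice are illustrative):
  set.seed(6)
  d <- density(rnorm(500), kernel = "gaussian")   # Gaussian kernel, default bandwidth rule
  plot(d)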
Midterm 1 during class on Monday 10th October
Material: Lectures 1-10
Lecture 12: Introduction to integral approximation
Why must we approximate integrals? Liouville's theorem; Risch algorithm; statistical examples; overview of approximation methods
Reference: see slides
Lecture 13: Gaussian Quadrature
Reference: CS Ch. 4
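A hand-coded three-point Gauss-Legendre rule on [-1, 1], as a sketch of the quadrature idea (nodes and weights are the standard n = 3 values; the integrand is illustrative):
  nodes   <- c(-sqrt(3/5), 0, sqrt(3/5))
  weights <- c(5/9, 8/9, 5/9)
  sum(weights * exp(nodes))        # quadrature approximation of the integral of exp(x) on [-1, 1]
  exp(1) - exp(-1)                 # exact value, about 2.3504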
Lecture 14: Basics of Bayesian Computational Statistics
Motivating uses of integral approximation in statistical inference; common setting of MCMC
Reference: slides
Lecture 15: Saddlepoint and Laplace Approximation
Deterministic integral approximation methods; Bayesian logistic regression example
Reference: slides
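A sketch of the Laplace approximation applied to the Gamma integral n! = integral of x^n e^(-x) dx, which recovers Stirling's formula (illustrative, not necessarily the lecture's example):
  n <- 10
  # expand n*log(x) - x around its maximizer x = n
  laplace <- sqrt(2 * pi * n) * (n / exp(1))^n
  exact   <- factorial(n)
  c(laplace = laplace, exact = exact, rel_error = (exact - laplace) / exact)   # error under 1%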
Lecture 16: Random Variable Generation; Monte Carlo Integration
Quantile transform method; rejection sampling; importance sampling
Reference: CS Ch. 7, 11 and Appendix A
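Minimal R sketches of the quantile-transform and rejection-sampling methods listed for Lecture 16 (the target distributions are chosen for illustration):
  set.seed(1)
  # quantile (inverse CDF) transform: Exponential(rate = 2) from Uniform(0, 1)
  u <- runif(1e5)
  x <- -log(u) / 2
  mean(x)                                  # close to the true mean 1/2

  # rejection sampling for Beta(2, 2): density 6x(1-x) is bounded by M = 1.5 on [0, 1]
  M <- 1.5
  y <- runif(1e5)
  z <- y[runif(1e5) < 6 * y * (1 - y) / M]
  mean(z)                                  # close to the true mean 1/2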
Lecture 17: More on RNG; Intro to MCMC
Types of pseudo-random number generators; basics of Markov chain theory
Reference: CS Ch. 7, 11 and Appendix A
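A toy linear congruential generator in R, illustrating one type of pseudo-random number generator (the multiplier and increment are a common textbook choice; this is not the generator R itself uses):
  lcg <- function(n, seed = 42, a = 1664525, c = 1013904223, m = 2^32) {
    out <- numeric(n)
    x <- seed
    for (i in 1:n) {
      x <- (a * x + c) %% m          # exact in double precision since a*x < 2^53
      out[i] <- x / m                # scale to (0, 1)
    }
    out
  }
  lcg(5)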
Lecture 18: More Markov chain theory; Metropolis-Hastings and Gibbs
Independent and random walk Metropolis-Hastings; optimal scaling and convergence diagnostics
Reference: CS Ch. 11 and slides
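A minimal random-walk Metropolis-Hastings sketch in R with a standard normal target (the proposal scale and chain length are illustrative):
  set.seed(2)
  n_iter <- 1e4
  chain  <- numeric(n_iter)
  for (t in 2:n_iter) {
    prop      <- chain[t - 1] + rnorm(1)                                    # random-walk proposal
    log_alpha <- dnorm(prop, log = TRUE) - dnorm(chain[t - 1], log = TRUE)  # log acceptance ratio
    chain[t]  <- if (log(runif(1)) < log_alpha) prop else chain[t - 1]
  }
  c(mean = mean(chain), sd = sd(chain))    # close to 0 and 1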
Midterm 2 during class on Wednesday 9th November
Material: Lectures 11-17
Lecture 19: MCMC Examples
Metropolis-Hastings and Gibbs examples; convergence diagnostics (Gelman-Rubin); writing R functions for MCMC
Reference: see Blackboard articles
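A sketch of the Gelman-Rubin diagnostic using the coda package (the two "chains" here are i.i.d. draws purely for illustration; in practice they would come from MCMC runs like the sketch above):
  library(coda)
  set.seed(7)
  chains <- mcmc.list(mcmc(rnorm(1000)), mcmc(rnorm(1000)))
  gelman.diag(chains)              # potential scale reduction factor, near 1 for well-mixed chains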
Lecture 20: Introduction to Bootstrap
Nonparametric bootstrap; bias estimation and standard error estimation; jackknife
Reference: CS Ch. 12 and 13
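A minimal nonparametric bootstrap sketch in R: the bootstrap standard error of the median for a simulated sample (base R only; the data are illustrative):
  set.seed(3)
  x <- rexp(50)
  B <- 2000
  boot_med <- replicate(B, median(sample(x, replace = TRUE)))
  sd(boot_med)                     # bootstrap estimate of the standard error of the median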
Lecture 21: Bootstrap Confidence Intervals
Normal, percentile and bootstrap t intervals; BCa intervals; implementation in R
Reference: CS Ch. 12 and 13
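A sketch of a percentile bootstrap confidence interval in R (simulated data; the normal and bootstrap-t intervals follow the same resampling pattern):
  set.seed(4)
  x <- rexp(50)
  boot_mean <- replicate(2000, mean(sample(x, replace = TRUE)))
  quantile(boot_mean, c(0.025, 0.975))     # 95% percentile interval for the mean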
Lecture 22: Cross-Validation and Permutation Tests
Reference: CS Ch. 12 and 13
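A minimal two-sample permutation test in R (simulated groups; illustrative only):
  set.seed(5)
  g1 <- rnorm(20); g2 <- rnorm(20, mean = 0.5)
  obs    <- mean(g2) - mean(g1)
  pooled <- c(g1, g2)
  perm <- replicate(5000, {
    idx <- sample(length(pooled), length(g1))
    mean(pooled[-idx]) - mean(pooled[idx])
  })
  mean(abs(perm) >= abs(obs))              # two-sided permutation p-value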
Lecture 23: Current Research in Computational Statistics
Reference: slides
Last day of fall semester classes is 12/09
Final Exam is Friday 12/16, 6:00-8:00pm (see your exam schedule for the room)