In addition to the course descriptions on this page, you can also view and download syllabi for previously offered courses.

## JAS1101H – Topics in Astrostatistics

This graduate-level course provides an introduction to the cross-disciplinary field of astrostatistics, and is intended for both astronomy and statistics students. We will cover topics in statistics (e.g., hierarchical Bayesian analysis, time series analysis, and cluster analysis) in the context of their applications to astronomical research (e.g., studies of galaxies, the Milky Way, exoplanets, and stellar populations).

These topics will be covered through two main aspects of the course: 1) peer-instruction and collaboration on a term project, and 2) readings, in-class discussion, and exercises related to current astrostats literature. For the term project, the students will develop practical skills by collaborating in cross-disciplinary teams on a research project in astrostatistics using real astronomical data.

## STA1001H – Methods of Data Analysis I

*(also offered as undergraduate course STA302H1)*

Introduction to data analysis with a focus on regression. Initial examination of data. Correlation. Simple and multiple regression models using least squares. Inference for regression parameters, confidence and prediction intervals. Diagnostics and remedial measures. Interactions and dummy variables. Variable selection. Least squares estimation and inference for non-linear regression.

*Prerequisite:*

- STA238H1/STA248H1/STA255H1/STA261H1/ECO227Y1
- CSC108H1/CSC120H1/CSC121H1/CSC148H1
- MAT221H1(70%)/MAT223H1/MAT240H1

## STA1002H – Methods of Data Analysis II

*(also offered as undergraduate course STA303H1)*

Analysis of variance for one- and two-way layouts, logistic regression, loglinear models, longitudinal data, introduction to time series.

*Prerequisite:* STA1001H or equivalent

## STA1003H – Sample Surveys Theory

*(also offered as undergraduate course STA304H1)*

Design of surveys, sources of bias, randomized response surveys. Techniques of sampling: stratification, clustering, unequal probability selection. Sampling inference, estimates of population mean and variances, ratio estimation. Observational data: correlation vs. causation, missing data, sources of bias.

*Exclusion:* STA322H1

*Prerequisite:* ECO220Y1/ECO227Y1/GGR270Y1/PSY202H1/SOC300Y1/STA221H1/STA255H1/261H1/248H1

## STA1004H – Introductory Experimental Design

*(also offered as undergraduate course STA305H1)*

This cross-listed course covers a number of topics used in the design and analysis of experiments. The course is intended for students of statistics as well as students of other disciplines (e.g., engineering and experimental science) who will use experimental design and analysis in their work.

The course will cover the following topics: randomization, blocking, Latin squares, balanced incomplete block designs, factorial experiments, confounding and fractional replication, components of variance, orthogonal polynomials, response surface methods. Additional topics will be covered based on students’ interest as time permits.

*Prerequisite:* STA302H/352Y/ECO327Y/ECO357Y or permission of instructor

## STA1007H – Statistics for Life and Social Scientists

*(also offered as undergraduate course STA429H1)*

Consult the instructor for further details.

*Prerequisite:* Consult the instructor concerning necessary background for this course.

## STA1008H – Applied Statistics

Vocabulary of data analysis, tests of statistical significance, principles of research design, introduction to Unix, introduction to SAS, elementary significance tests, multiple regression, factorial ANOVA, permutation tests, power and sample size, random effects models, multivariate analysis of variance, analysis of within-cases designs (repeated measures). If time permits, categorical data analysis.

*Prerequisite:* Any introductory statistics class, taught by any department.

## STA2004H – Design of Experiments

A second course in design of experiments. Topics include: experiments vs observational studies, randomization and model-based inference, randomized blocks, Latin squares, incomplete block designs, factorial and fractional factorial designs, cross-over designs, confounding and aliasing, response surface designs and Taguchi methods, optimal design, Bayesian design.

*Prerequisite:* STA332H or equivalent

## STA2005H – Applied Multivariate Analysis

*(also offered as undergraduate course STA437H1)*

Practical techniques for the analysis of multivariate data; fundamental methods of data reduction with an introduction to underlying distribution theory; basic estimation and hypothesis testing for multivariate means and variances; regression coefficients; principal components and the partial, multiple and canonical correlations; multivariate analysis of variance; classification and the linear discriminant function. The use of R software should be expected.

*Prerequisite:* STA302H/352Y

*Recommended Preparation:* MAT223H/240H

## STA2006H – Applied Stochastic Processes

*(also offered as undergraduate course STA447H1)*

Discrete and continuous time processes with an emphasis on Markov, Gaussian and renewal processes. Martingales and further limit theorems. A variety of applications taken from some of the following areas are discussed in the context of stochastic modeling: Information Theory, Quantum Mechanics, Statistical Analyses of Stochastic Processes, Population Growth Models, Reliability, Queuing Models, Stochastic Calculus, Simulation (Monte Carlo Methods).

*Prerequisite:* STA347H or equivalent knowledge of probability theory; and MAT235Y/237Y or equivalent knowledge of multivariate calculus and basic real analysis.

## STA2016H – Theory & Methods for Complex Spatial Data

*(also offered as undergraduate course STA465H1)*

Data acquisition trends in the environmental, physical and health sciences are increasingly spatial in character and novel in the sense that modern sophisticated methods are required for analysis. This course will cover different types of random spatial processes and how to incorporate them into mixed effects models for Normal and non-Normal data. Students will be trained in a variety of advanced techniques for analyzing complex spatial data and, upon completion, will be able to undertake a variety of analyses on spatially dependent data, understand which methods are appropriate for various research questions, and interpret and convey results in the light of the original questions posed.

## STA2047H – Stochastic Calculus

Brownian motion, stochastic integrals, stochastic differential equations, diffusions, Cameron-Martin-Girsanov formula, diffusion approximations, applications. The course will be mathematically rigorous and self-contained.

*Prerequisite:* No explicit prerequisites, but to understand the material, it is necessary to have a good understanding at the advanced undergraduate level of at least **one** of the following:

- Probability,
- Real Analysis,
- Differential Equations,
- Mathematical Finance.

## STA2080H – Foundations of Statistical Genetics

*(also offered as undergraduate course STA480H1)*

*Course credit: 0.5 FCE*

## STA2101H – Methods of Applied Statistics I

*(this course is no longer cross-listed as an undergraduate course)*

Advanced topics in statistics and data analysis with emphasis on applications. Diagnostics and residuals in linear models, introductions to generalized linear models, graphical methods, additional topics such as random effects models, split plot designs, analysis of censored data, introduced as needed in the context of case studies.

*Prerequisite:*

- ECO374H1/ECO375H1/STA302H1
- STA305H1

## STA2102H – Computational Techniques in Statistics

*(also offered as undergraduate course STA410H1)*

The goal of this course is to give an overview of some of the computational methods that are useful in statistics. The first part of the course will focus on basic algorithms, such as the Fast Fourier Transform (and related methods) and methods for generating random variables. The second part of the course will focus on numerical methods for linear algebra and optimization (for example, computing least squares estimates and maximum likelihood estimates). Along the way, you will learn some basic theory of numerical analysis (computational complexity, convergence rates of algorithms) and you will encounter some statistical methodology that you may not have seen in other courses.

*Prerequisites:* The nominal prerequisites for this course are MAT223H/240H, STA302H and CSC108H/120H/121H/148H; these should give you sufficient background in both statistics and computer programming to handle the course material. A solid foundation in linear algebra is very useful for this course.

## STA2104H – Statistical Methods for Machine Learning and Data Mining

*(also offered as undergraduate course STA414H1)*

This course will consider topics in statistics that have played a role in the development of techniques for data mining and machine learning. We will cover linear methods for regression and classification, nonparametric regression and classification methods, generalized additive models, aspects of model inference and model selection, model averaging, and tree-based methods.

*Prerequisite:* either STA302H or CSC411H

## STA2105H – Nonparametric Methods of Inference

*(also offered as undergraduate course STA412H1)*

Modern methods of nonparametric inference, with special emphasis on bootstrap methods, and including density estimation, kernel regression, smoothing methods and functional data analysis.

*Prerequisite:*

- STA302H1
- STA352Y1

## STA2111H – Graduate Probability I

STA 2111H is a course designed for Master’s and Ph.D. level students in statistics, mathematics, and other departments, who are interested in a rigorous, mathematical treatment of probability theory using measure theory. Specific topics to be covered include: probability measures, the extension theorem, random variables, distributions, expectations, laws of large numbers, Markov chains.

Students should have a strong undergraduate background in Real Analysis, including calculus, sequences and series, elementary set theory, and epsilon-delta proofs. Some previous exposure to undergraduate-level probability theory is also recommended.

## STA2112H – Mathematical Statistics I

This course is designed for graduate students in Statistics and Biostatistics.

Review of probability theory, distribution theory for normal samples, convergence of random variables, statistical models, sufficiency and ancillarity, statistical functionals, influence curves, maximum likelihood estimation, computational methods.

*Prerequisite:*

- Advanced calculus (e.g., MAT237)
- Linear algebra (e.g., MAT223, MAT224)
- A previous course in probability and/or statistics is highly recommended.

## STA2162H – Statistical Inference I

*(also offered as undergraduate course STA422H1)*

Statistical inference is concerned with using the evidence, available from observed data, to draw inferences about an unknown probability measure. A variety of theoretical approaches have been developed to address this problem and these can lead to quite different inferences. A natural question is then concerned with how one determines and validates appropriate statistical methodology in a given problem. The course considers this larger statistical question. This involves a discussion of topics such as model specification and checking, the likelihood function and likelihood inferences, repeated sampling criteria, loss (utility) functions and optimality, prior specification and checking, Bayesian inferences, principles and axioms, etc. The overall goal of the course is to leave students with an understanding of the different approaches to the theory of statistical inference while developing a critical point-of-view.

*Necessary background:* Mathematics-based course on the theory of statistics (e.g., at the level of STA352Y).

## STA2201H – Methods of Applied Statistics II

The course will focus on generalized linear models (GLM) and related methods, such as generalized additive models involving nonparametric regression, generalized estimating equations (GEE) and generalized linear mixed models (GLMM) for longitudinal data. This course is designed for Master's and PhD students in Statistics, and is required for the Applied paper of the PhD Comprehensive Exams in Statistics. We deal with a class of statistical models that generalizes classical linear models to include many other models that have been found useful in statistical analysis, especially in biomedical applications. The course is a mixture of theory and applications and includes computer projects featuring R (S+) and/or SAS programming.

Topics: Brief review of likelihood theory, fundamental theory of generalized linear models, iterated weighted least squares, binary data and logistic regression, epidemiological study designs, counts data and log-linear models, models with constant coefficient of variation, quasi-likelihood, generalized additive models involving nonparametric smoothing, generalized estimating equations (GEE) and generalized linear mixed models (GLMM) for longitudinal data.

*Prerequisite:* Advanced Calculus, Linear Algebra, STA 347 and STA 422 (upper-division courses on probability and statistical inference) or equivalent, STA 302 (linear regression), and statistical computing using R (S+) and/or SAS. Alternative software is allowed, but the instructor may not be familiar with the software of your choice, which may limit the assistance available.

## STA2202H – Time Series Analysis

*(also offered as undergraduate course STA457H1)*

An overview of methods and problems in the analysis of time series data. Topics include: descriptive methods, filtering and adjustment, spectral estimation, bivariate time series models.

The course will cover the following topics:

- Theory of stationary processes, linear processes
- Elements of inference in time domain with applications
- Spectral representation of stationary processes
- Elements of inference in frequency domain with applications
- Theory of prediction (forecasting) with applications
- ARMA processes, inference and forecasting
- Non-stationarity and seasonality, ARIMA and SARIMA processes

Further topics, time permitting: multivariate models; GARCH models; state-space models

## STA2209H – Lifetime Data Modeling and Analysis

Students interested in this topic may wish to take the course Survival Analysis, offered by the Department of Public Sciences, Biostatistics program.

## STA2211H – Graduate Probability II

STA 2211H is a follow-up course to STA 2111F, designed for Master’s and Ph.D. level students in statistics, mathematics, and other departments, who are interested in a rigorous, mathematical treatment of probability theory using measure theory. Specific topics to be covered include: weak convergence, characteristic functions, central limit theorems, the Radon-Nikodym Theorem, Lebesgue Decomposition, conditional probability and expectation, martingales, and Kolmogorov’s Existence Theorem.

## STA2212H – Mathematical Statistics II

This course is a continuation of STA2112 and designed for graduate students in statistics and biostatistics.

*Topics include:*

- Bayesian methods
- minimum variance estimation
- asymptotic efficiency of maximum likelihood estimation
- interval estimation and hypothesis testing
- linear and generalized linear models
- goodness-of-fit for discrete and continuous data

*Prerequisite:* STA2112

## STA2453HY – Data Science Methods, Collaboration and Communication

This course is designed to provide graduate students with experience in statistical consulting. Students are active participants in research projects brought to the Statistical Consulting Service (SCS) of the Department of Statistics. The course is offered over the two sessions, fall (September-December) and winter (January-April). The overall workload is approximately equivalent to a half graduate course and students receive a half credit.

Students are not expected to have had any experience as consultants. The purpose of the course is to provide this experience so that graduates will be better able to function in such an environment when they have completed the course. The course also provides students with the opportunity to become familiar with statistical software packages such as The SAS System. There is supervision and assistance to novice consultants.

*Content:* There is some classroom instruction at the start of the term, and meetings are occasionally called to discuss special topics and for students to compare experiences. Students serve as apprentice statisticians and work under the guidance of the instructor and the SCS Coordinator on individual projects. Projects are assigned to students as they come in to the SCS. There are periods of inactivity when there are no projects, and other times are very busy. The pattern of work is more like that associated with a business or working environment than a traditional course. While some consideration is taken of other academic demands on students, those enrolling must be aware that work on projects may require precedence at times.

*Evaluation:* Students will be graded on the quality of their work as statistical consultants. This involves the ability to do work in a timely fashion, the quality of advice provided and the quality of the presentation of advice and written work to clients.

*Prerequisite:* Students should have taken some applied statistics courses such as an undergraduate regression course. Undergraduate courses in applied statistics, sample surveys, design of experiments and time series analysis are recommended but not required. Taking some of the other 2000-level applied statistics courses is also recommended, as this course will serve as an excellent opportunity to put the content of these courses to work.

## STA2500H – Loss Models

*(also offered as undergraduate course ACT451H1)*

Parametric distributions and transformations, insurance coverage modifications, limits and deductibles, models for claim frequency and severity, models for aggregate claims, stop-loss insurance, risk measures.

*Prerequisite:* Consult the instructor concerning necessary background for this course

## STA2501H – Mathematical Risk Theory

Consult the instructor for further details.

*Prerequisite:* Consult the instructor concerning necessary background for this course

## STA2502H – Stochastic Methods for Actuarial Science and Finance

*(also offered as undergraduate course ACT460H1)*

This course is an introduction to the stochastic models used in Finance and Actuarial Science. Students will be exposed to the basics of stochastic calculus, particularly focusing on Brownian motions and simple stochastic differential equations. The role that martingales play in the pricing of derivative instruments will be investigated. Some exotic equity derivative products will be explored together with stochastic models for interest rates.

*Prerequisite:*

- Knowledge of undergraduate probability theory is necessary.
- Knowledge of basic financial modeling (e.g., binomial trees and log-normal distributions) is useful, but not completely necessary.

## STA2503H – Applied Probability for Mathematical Finance

This course features studies in derivative pricing theory and focuses on financial mathematics and its applications to various derivative products. A working knowledge of probability theory, stochastic calculus (see e.g., STA 2502), knowledge of ordinary and partial differential equations and familiarity with the basic financial instruments is assumed.

The tentative topics covered in this course include, but are not limited to:

- no-arbitrage and the fundamental theorem of asset pricing;
- binomial pricing models;
- continuous time limits;
- the Black-Scholes model;
- the Greeks and hedging;
- European, American, Asian, barrier and other path-dependent options;
- short rate models and interest rate derivatives;
- convertible bonds;
- stochastic volatility and jumps;
- volatility derivatives;
- foreign exchange and commodity derivatives.

*More information:* Course Website STA 2503

*Prerequisite:* Knowledge of undergraduate probability theory is necessary. Knowledge of basic financial modeling (e.g., binomial trees and log-normal distributions), introductory stochastic calculus and financial products is useful, but not necessary. This course moves at a faster pace, is more advanced and carries a higher workload than STA2502; only students who are well prepared will be allowed to take this course. It is also distinct from STA 2047, which instead focuses on the mathematics of stochastic analysis.

## STA2505H – Credibility Theory & Simulation Methods

*(also offered as undergraduate course ACT466H1)*

Limited fluctuation or American credibility, on a full and partial basis. Greatest accuracy or European credibility, predictive distributions and the Bayesian premium, credibility premiums including the Buhlmann and Buhlmann-Straub models, empirical Bayes nonparametric and semi-parametric parameter estimation. Simulation, random numbers, discrete and continuous random variable generation, discrete event simulation, statistical analysis of simulated data and validation techniques.

*Prerequisite:* Consult the instructor concerning necessary background for this course

## STA2542H – Linear Models

This is an advanced graduate course. The emphasis is on linear mixed models and generalized mixed models. Inference requires numerical optimization methods (Newton-Raphson and EM algorithms) as well as Monte Carlo sampling methods (importance sampling, accept-reject, Metropolis-Hastings, Gibbs), and these will be taught in class.

*Prerequisite:* Strong background in statistics is required.

## STA2555H – Foundations & Trends in Causal Inference

In this course we will study techniques and algorithms for creating effective data visualizations based on principles from graphic design, visual art, perceptual psychology, and cognitive science. This course is targeted both towards students interested in using visualization in their own work, as well as students interested in building better visualization tools and systems.

## STA2600H – Teaching Statistics

This course provides an introduction to a scholarly approach to teaching statistics in higher education. Emphasis is placed on the use of statistics education research, effective communication of fundamental statistical concepts typically encountered in introductory statistics, alignment of learning outcomes, course activities and assessments, recognition of common misconceptions and how to address them, and effective integration of educational and statistical technologies. No prior teaching experience is necessary.

## STA2700H – Comput Inference & Graphical Models

This is a reading course primarily meant to sequentially follow a modular course offered in the Department. Its purpose is to offer further supervised study of an advanced topic covered for the ambitious student.

## STA3000Y – Advanced Theory of Statistics

**Please note that STA3000Y F & S can only be taken by PhD students in the Department of Statistical Sciences.**

This is the Department’s core graduate course in statistical theory. It covers the basic principles of statistical inference, their application to a variety of statistical models, and some generalizations to more complex settings.

*Prerequisite:*

- STA2112H and STA2212H or equivalent. (STA2111H and STA2211H may be co-requisites).
- Some familiarity with measure theory is very useful. The text includes some supplementary material on this.

## STA3431H – Monte Carlo Methods

This course will explore Monte Carlo computer algorithms, which use randomness to perform difficult high-dimensional computations. Different types of algorithms, theoretical issues, and practical applications will all be considered. Particular emphasis will be placed on Markov chain Monte Carlo (MCMC) methods. The course will involve a combination of methodological investigations, mathematical analysis, and computer programming.

*Prerequisite:* Knowledge of statistical inference and probability theory at the advanced undergraduate level, and familiarity with basic computer programming techniques.

## STA4002H – Advanced Special Topics: Communication and Dissemination in Statistical Sciences

This graduate course is designed to improve verbal and written communication skills of PhD students in statistical sciences. The students are expected to attend regularly (at least 80%) the weekly research seminars of the Department of Statistical Sciences, other relevant seminars such as the Fields Distinguished Lecture Series in Statistical Sciences, as well as special seminars or workshops organized by the department on communication skills. The students are expected to select two research talks and submit a written report on each of them. The reports are supposed to be concise (2-3 pages; 1000-1500 words), summarizing the talks, and critiquing the research and/or presentations if possible. Based on their reports, the students in the class are expected to produce one 15-minute presentation for the class. The presentation can either discuss both reports or only one of the reports in more detail.

*Evaluation:*

- participation in the departmental seminars and special seminars on communication and dissemination (40%)
- two short (2-3 pages) written reports on (self-selected) research talks of personal interest (30%)
- 15-minute presentation of the reports (30%).

## STA4246H – Research Topics in Mathematical Finance

This course focuses on advanced theory and modeling of financial derivatives. The topics include, but are not limited to: HJM interest rate models, LFM and LSM market models; foreign exchange options; defaultable bonds; credit default swaps, equity default swaps and collateralized debt obligations; intensity and structural based models; jump processes and stochastic volatility; commodity models. As well, students are required to complete a project, write a report and present a topic of current research interest.

*Prerequisite:* STA 2503 or equivalent knowledge.

## STA4247H – Point Processes, Noise and Stochastic Analysis

Introduction to the theory of point processes – Poisson and compound processes, point processes with repulsion and attraction. Brownian motion, white noise. Stochastic integration and stochastic differential equations.

*Prerequisite:* Consult the instructor concerning necessary background for this course

## STA4273HS/CSC2547HS – Topics Stats Machine Learning

This is a full semester course.

Recently, new inference methods have allowed us to train generative latent-variable models. These models let us generate novel images and text, find meaningful latent representations of data, take advantage of large unlabeled datasets, and even let us do analogical reasoning automatically. However, most generative models such as GANs and variational autoencoders currently have pre-specified model structure, and represent data using fixed-dimensional continuous vectors. This seminar course will develop extensions to these approaches to learn model structure, and represent data using mixed discrete and continuous data structures such as lists of vectors, graphs, or even programs. The class will have a major project component.

*Prerequisites:* This course is designed to bring students to the current state of the art, so that ideally, their course projects can make a novel contribution. A previous course in machine learning such as CSC321, CSC411, CSC412, STA414, or ECE521 is strongly recommended. However, the only hard requirements are linear algebra, basic multivariate calculus, basics of working with probability, and basic programming skills.

*More information:* Course site

## STA4412H – Topics in Theoretical Statistics Modular Courses

This course will introduce students to the topics under discussion during the thematic program on Statistical Inference in Big Data, with a mix of background lectures and guest lectures. The goal is to prepare students, postdoctoral fellows, and other interested participants to benefit from upcoming workshops in the thematic program, and to provide a venue for further discussion of keynote presentations after the workshops.

These courses will be streamed using FieldsLive, and students are welcome to attend online. Students interested in obtaining credit for these courses need to arrange with their home department to have them approved as reading or research courses. We will make the timetable and requirements for the course available by September 2014.

## STA4500H – Statistical Dependence: Copula Models and Beyond

The course discusses modern developments in modeling statistical dependence. Emphasis will be placed on copula models, particularly on conditional copula models that can be used in regression settings.

Tentative topics include:

- Random Effects
- Copula Models for Continuous Data
- Dependence measures
- Types of Dependence
- Conditional copulas for Continuous Data
- Copula Models for Discrete/Mixed Data
- Conditional Copula Models for Discrete/Mixed Data
- Vines

*Course credit:* 0.25 FCE

## STA4501H – Functional Data Analysis and Related Topics

Functional data analysis (FDA) has received substantial attention in recent years, with applications arising from various disciplines, such as engineering, public health, and finance. In general, FDA approaches focus on nonparametric underlying models that often assume the data are observed from realizations of stochastic processes with smooth trajectories. This course will cover general issues in functional data analysis, such as functional principal component analysis, functional regression models, curve clustering and classification. An introduction to smoothing methods will also be included at the beginning of class to provide a basic view of nonparametric regression (kernel and spline types) and serve as the basis of FDA approaches. The course will involve some computing and data analysis using R or MATLAB.

*Course credit:* 0.25 FCE

## STA4502H – Topics in Stochastic Processes

This course will focus on convergence rates and other mathematical properties of Markov chains on both discrete and general state spaces. Specific methods to be covered will include coupling, minorization conditions, spectral analysis, and more. Applications will be made to card shuffling and to MCMC algorithms.

*Course credit:* 0.25 FCE

## STA4503H – Advanced Monte Carlo Methods and Applications

This course will examine how advanced Monte Carlo methods can be applied to problems in statistical inference. Methods discussed may include use of auxiliary variables, use of Hamiltonian dynamics for Markov chain Monte Carlo updates, use of tempering or annealing to handle multimodal distributions and estimate normalizing constants, and ways of exploiting parallel computation. Practical issues such as verifying that a method has been implemented correctly, assessing convergence of Markov chain methods, assessing the error in estimates, and tuning the parameters of methods will also be discussed. Assignments will involve both the use of available software packages for Monte Carlo estimation and programming of custom methods for particular statistical problems.

*Course credit:* 0.25 FCE

## STA4504H – An Introduction to Bootstrap Methods

The course gives an introduction to some modern methods of nonparametric inference with special emphasis on bootstrap methods. Through a series of data analysis problems involving the bootstrap, students are exposed to methods for density estimation and for robust and flexible regression. Many fundamental concepts in mathematical statistics are revisited and viewed through the lens of bootstrap simulation, providing an important experimental perspective. The course is computationally intensive and requires knowledge of the programming environment R, or some equivalent language. A dominant theme of the course is the expansion of the toolbox of the statistical practitioner through the use of computation in technically complex problems.

*Course credit:* 0.25 FCE

## STA4505H – Applied Stochastic Control: High Frequency and Algorithmic Trading

With the availability of high frequency financial data, new areas of research in stochastic modeling and stochastic control have opened up. This six-week course will introduce students to the basic concepts, questions and methods that arise in this domain. We will begin with the classical market microstructure models, understand different theories of price formation and price discovery, identify different types of market participants, and then move on to reduced form models. Next, we will investigate some of the typical algorithmic trading strategies employed in industry for different asset classes. Finally, we will develop stochastic optimal control problems for solving optimal liquidation and high frequency market making problems, and demonstrate how to solve those problems using the principles of dynamic programming leading to Hamilton-Jacobi-Bellman equations. Students will also have a chance to work with historical limit order book data, develop Monte Carlo simulations and gain a working knowledge of the models and methods.

Tentative topics include:

- Market Microstructure
- Overview of Stochastic Calculus
- Dynamic Programming & HJB
- Dynamics of LOB
- Optimal Liquidation
- Market Making
- Risk Measures

*Course credit:* 0.25 FCE

## STA4506H – Non-stationary Time Series Analysis

The course will cover modeling, estimation and inference of non-stationary time series. In particular, we will deal with statistical inference of trends, quantile curves, time-varying spectra and functional linear models related to non-stationary time series. With the recent advances in various fields, a systematic account of non-stationary time series analysis is needed.

*Course credit:* 0.25 FCE

## STA4507H – Extreme Value Theory and Applications

Modeling the behaviour of extreme values is important in a variety of disciplines, from finance to environmental science, since catastrophes almost inevitably arise from extreme conditions. This course will cover both theoretical and applied aspects of extreme value modeling. Some of the topics to be covered are: extreme value types, point process methodology, the Hill and other estimators of the tail index, estimating extreme quantiles, multivariate extremes, estimators of tail dependence.

*Course credit:* 0.25 FCE

## STA4508H – Topics in Likelihood Inference

Inference based on the likelihood function has a prominent role in both theoretical and applied statistics. This course will introduce some of the more recent developments in likelihood-based inference, with an emphasis on adaptations developed for models with complex structure or large numbers of nuisance parameters. Special emphasis will be given to the theoretical and applied aspects of composite likelihood, and to the use of quasi-likelihood and generalized estimating equations. Tentative topics to be covered include: review of likelihood inference and asymptotic results; adjustments to profile likelihood; misspecified models (composite likelihood); partially specified models (quasi-likelihood); properties and limitations of penalized likelihood.

*Course credit:* 0.25 FCE

## STA4509H – Insurance Risk Models I

The aim of this course is to provide an introduction to advanced insurance risk theory. This course covers frequency and severity models, aggregate losses and compound distributions, the EM algorithm, and model selection and estimation.

*Course credit:* 0.25 FCE

## STA4510H – Insurance Risk Models II

This course covers topics in ruin theory, including the classical compound Poisson risk model, ruin probabilities, surplus prior to ruin and deficit at ruin, connection to queuing models, Gerber-Shiu discounted penalty function, associated integro-differential equation and defective renewal equation, risk models with dividend barrier and dividend strategies.

*Course credit:* 0.25 FCE

## STA4511H – Statistical Issues in Number Theory

We will provide a broad overview of selected areas in analytic number theory (mainly without proofs) leading to a variety of problems and questions involving probability and statistics. The statistical and probabilistic properties of the family of zeta distributions will be discussed in detail. Known and conjectured results about the distribution of (high) primes will be discussed with reference to testing Poisson-ness, and applications to testing pseudorandom number generators. The probabilistic heuristic introduced by Cramér (as well as its limitations) will be explored, using such results as the “zero-one laws” and the law of the iterated logarithm. If there is time, statistical properties and modeling of the high zeros of the zeta function will be considered in conjunction with time series and point process methods.

*Course credit:* 0.25 FCE

## STA4512H – Logical Foundations of Statistical Inference

The general mathematics and logical foundations for statistical inference: geometric, algebraic and topological symmetries that arise naturally in the solution to the inference problem, including rigorous comparison of the Bayesian and frequentist approaches, and the group theoretic considerations of invariance (algebraic and logical symmetry), both on the sample space as well as on the parameter space (and both either implicit or manifest) that must be taken into account in the analysis. Unusual for the development, but fundamental to the inherent logic of such considerations, the finite-finite case is given special attention in respect of both sample space and parameter space.

*Course credit:* 0.25 FCE

## STA4513H – Statistical Models of Networks, Graphs, and Other Relational Structures

Our understanding of graph- and network-valued data has undergone a dramatic shift in the past decade. We now understand there to be fundamentally different regimes that relate to the prevalence of edges. The best understood is the dense regime, where, informally speaking, we expect to see edges among vertices chosen uniformly at random from a large graph. The mathematical foundations of this area can be traced back to work by Aldous and Hoover in the early 1980s, but work in graph theory over the past decade has enriched our understanding considerably. Most existing statistical methods, especially Bayesian ones, work implicitly in the dense regime. Real-world networks, however, are not dense. A growing community is now focused on the structure of large sparse graphs. The sparse regime, however, is not well understood: key mathematical notions continue to be identified. We will work through key papers in probability, statistics, and graph theory in order to gain the broader perspective necessary to identify opportunities to contribute to our understanding of statistical methods on graphs and networks.

## STA4514H – Modelling and Analysis of Spatially Correlated Data

This is an advanced course in models and methods for spatial data, with an emphasis on data which are not normally distributed. The course will cover different types of random spatial processes and how to incorporate them into mixed effects models for normal and non-normal data, with maximum likelihood and Bayesian inference used for the two types of data respectively. Spatial point processes, where the data are random locations rather than measurements at fixed locations, will be dealt with extensively. Following the course, students will be able to undertake a variety of analyses on spatially dependent data, understand which methods are appropriate for various research questions, and interpret and convey results in the light of the original questions posed.

*Course credit:* 0.25 FCE

## STA4515H – Multiple Hypothesis Testing and its Applications

A central issue in many current large-scale scientific studies is how to assess statistical significance while taking into account the inherent multiple hypothesis testing issue. This graduate course will provide an in-depth understanding of the topic in the context of data science, with a focus on statistical “omics”. We start with an insightful revisit of single hypothesis testing, the building block of multiple hypothesis testing. We then study the fundamental elements of multiple hypothesis testing, including the control of the family-wise error rate and the false discovery rate. We will also touch upon various more advanced topics such as data integration, selective inference and the fallacy of p-values. The course will provide both analytical arguments and empirical evidence.

Students are evaluated based on class participation and one final research report on a suggested or self-selected project related to multiple hypothesis testing.

*Course credit:* 0.25 FCE

## STA4516H – Nonstandard Analysis and Applications to Statistics and Probability

Basic concepts in nonstandard analysis, including infinitesimal and infinite numbers, and descriptions of basic concepts like continuity and integration in terms of these notions. Advanced topics, including Loeb measure theory. Applications to stochastic processes and statistics.

*Course credit:* 0.25 FCE

## STA4517H – Information Visualization

This course introduces the research area of causal inference at the intersection of statistics, social science and artificial intelligence. A central theme of this course will be that without a formal theory of causation, intuition alone can be misleading for drawing causal conclusions. Topics include: potential outcomes and counterfactuals, measures of treatment effects, causal graphical models, confounding adjustment, instrumental variables, principal stratification, mediation and interference. Concepts will be illustrated with applications in a wide range of subjects, such as computer science, social science and biomedical data science.

## STA4518H - Robust Statistical Methods

This course will give an overview of robust statistical methods, that is, methods that are insensitive to outliers or other data contamination. Topics will include theoretical notions such as qualitative robustness and breakdown point, robust estimation of location (minimax variance and bias) and scale parameters, robust estimation in regression and multivariate analysis, and applications (including in computer vision).

*Prerequisite:*

- STA2112H
- permission

*Course credit:* 0.25 FCE

## STA4519H - Optimal Transport: Theory & Algorithms

Optimal transport is a vast subject and has deep connections with analysis, probability and geometry. In recent years optimal transport has found widespread applications in data science (a notable example is the Wasserstein GAN). In this course we offer a balanced treatment featuring both the theory and applications of the subject. After laying down the theoretical foundation including the Kantorovich duality, we turn to numerical methods and their applications to data science. Possible topics include entropic regularization, dynamic formulations, gradient flows, statistical divergences and the W-GAN. Our main reference is the recent book Computational Optimal Transport by Gabriel Peyré and Marco Cuturi.

*Prerequisite:*

- STA2111H – Graduate Probability I
- STA2211H – Graduate Probability II
- (or permission by the instructor)

*Course credit:* 0.25 FCE

## STA4522H – The Measurement of Statistical Evidence

The concept of statistical evidence is central to the field of statistics. In spite of many references to “the evidence” in statistical applications, it is fair to say that there is no definition of this that achieves broad support in the sense of serving as the core of a theory of statistics. The course will examine the various attempts made to measure evidence in the statistical literature and why these are not entirely satisfactory. We will then discuss a proposal to base the theory of statistical inference on a particular measure, the relative belief ratio, and how this fits into a general theory of statistical reasoning.

## STA4525H – Demographic Methods

This course provides an overview of the core areas of demography (fertility, mortality and migration) and the techniques to model such processes.

The course will cover life table analysis, measures of fertility and nuptiality, mortality and migration models, and statistical methods commonly used in demography, such as Poisson regression, survival analysis, and Bayesian hierarchical models.

The goal of the course is to equip students with a range of demographic techniques to use in their own research.

## STA4526H - Stochastic Control & Applications in Finance

The course will introduce students to the basic theory of stochastic optimal control. We will cover both the analytic approach, including an introduction to viscosity solution theory, and the probabilistic approach, which is based on backward stochastic differential equations (BSDEs) and the stochastic maximum principle. Applications to portfolio optimization and contract theory will be discussed. Prerequisites for this course include (measure-theoretic) probability theory and stochastic calculus.

*Prerequisite:* (measure-theoretic) probability theory and stochastic calculus

*Course credit:* 0.25 FCE

## STA4527H - Random Matrix Theory & Its Applications

Random matrix theory is now a large subject with applications in many disciplines of science, engineering and statistics. This course will cover the fundamental concepts, principles and theory of random matrices, oriented towards the needs and interests of statistics. Applications to big data analytics and geometric data analysis will be discussed.

*Course credit:* 0.5 FCE

## Master’s Research Project Course

A limited number of Supervised Research Project courses, normally taken as half-courses, will also be made available, based on faculty availability. These courses will provide students with a first exposure to research-level topics and thinking. Students will normally be required to write a substantial report about their work, plus perhaps give a brief oral presentation. Projects may be proposed either by faculty or by students; information about faculty-proposed projects will be provided in September.

To enroll in such a course, a student must first obtain permission from the supervising faculty member and from the Associate Chair, Graduate Studies. There is no guarantee that enrollment can be provided for all interested students. For further details, please consult the Associate Chair, Graduate Studies.