# Student Publication Highlights

Chair’s Note on Student Publication Highlights

I am extremely pleased and honoured to share some student publication highlights with you. This initiative is meant to showcase the output of high-level research from our graduate students. Those who browse the papers in this collection will note that the topics span a wide area of applied, theoretical and computational statistics. This breadth is not an accident, but rather a reflection of our Department’s ongoing efforts to build interdisciplinary bridges while making seminal contributions to the core methodology of Statistics. The Department’s intellectual environment shines through each of these works and demonstrates the outstanding quality of our students. The students you see published here are a representative sample of our graduate “family”, which grows every year and includes future leaders in academia, industry, and public service.

We dedicate this Student Publication Highlight to the memory of Professor Donald A. S. Fraser, OC, FRSC (1925-2020), a giant of Statistics, the first Chair of our Department and an inspiration to the many graduate students and colleagues who had the chance to know him.

### Al-Aradi, Ali

Ali completed his PhD in mathematical finance in October 2020 under the supervision of Professor Sebastian Jaimungal. His main research interests are in stochastic portfolio theory, stochastic optimal control and portfolio selection problems.

By Ali Al-Aradi and Sebastian Jaimungal

Active and Passive Portfolio Management with Latent Factors

Quantitative Finance | 2021 (accepted)

Abstract: We address a portfolio selection problem that combines active (outperformance) and passive (tracking) objectives using techniques from convex analysis. We assume a general semimartingale market model where the assets' growth rate processes are driven by a latent factor. Using techniques from convex analysis we obtain a closed-form solution for the optimal portfolio and provide a theorem establishing its uniqueness. The motivation for incorporating latent factors is to achieve improved growth rate estimation, an otherwise notoriously difficult task. To this end, we focus on a model where growth rates are driven by an unobservable Markov chain. The solution in this case requires a filtering step to obtain posterior probabilities for the state of the Markov chain from asset price information, which are subsequently used to find the optimal allocation. We show the optimal strategy is the posterior average of the optimal strategies the investor would have held in each state assuming the Markov chain remains in that state. Finally, we implement a number of historical backtests to demonstrate the performance of the optimal portfolio.

Layman Summary: If you asked a prospective investor what they hope to achieve by investing, the most common (and obvious) answer would be "to make money!" Goals such as making as much money as possible, or making enough money to pay off a mortgage within a certain timeframe, can be sensible. However, they are goals that are absolute in nature - meaning they do not involve any external random factors that determine whether or not a certain outcome is deemed a success. An alternative (and also very common) goal would be to "beat the market." A goal of this nature would be considered a relative goal because, regardless of how well an investment strategy does, the investor would be disappointed if their strategy did not make them as much as a passive investment in the market. In this day and age, the average retail investor is savvy enough to be able to invest their money easily in a broad market index and gain whatever returns the market generates. As such, they would expect a professional money manager to be able to outperform that benchmark. Benchmarks also serve as an expression of the investor's risk tolerance, in that they do not wish to invest in a way that subjects them to a level of volatility that is vastly different from that of the benchmark. In other words, they may be opposed to deviating too much from the market - a notion that is commonly referred to as tracking. In short, the combination of all these facets of the problem illustrates that performance can be tied to an external process (the benchmark) that evolves randomly. The main goal of this paper is to be able to solve a wide range of problems that fall under both of these categories (absolute and relative) and may involve elements of outperformance (active investing) and tracking (passive investing).

Successfully solving this problem ultimately hinges on our ability to estimate the rate at which asset prices grow - which can itself vary through time as assets go through different phases of booms and busts. The approach that we take is to use so-called "latent factors" to model asset prices. Intuitively, this means that we assume that an unobservable variable - a "hidden truth", so to speak - dictates how assets grow through time. Often this variable is interpreted as the state of the economy. Under some states, all assets perform well; under others, all assets perform poorly; and there are certain states still that favour some assets over others. The mathematical complexity here lies in the fact that we must figure out which states are good/bad for which assets and that the state of the economy is not something we can observe; instead, it must be inferred from what we can observe, namely the asset prices themselves. Despite the added complexity, using latent factors provides us with the flexibility to better capture the behavior of asset prices and ultimately leads to better investment results.
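The filtering idea described above can be illustrated with a minimal discrete-time sketch. It is not the paper's continuous-time semimartingale model: the two-state chain, the Gaussian return model, and every number below (`P`, `mus`, `sigma`, `state_weights`) are assumptions chosen purely for illustration. The sketch updates the posterior probability of each hidden state from observed returns, then averages hypothetical per-state portfolios using those probabilities.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution, used as the return likelihood."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def filter_step(belief, P, mus, sigma, ret):
    """One predict-update step of a hidden Markov chain filter."""
    k = len(belief)
    # Predict: push the current belief through the transition matrix.
    predicted = [sum(belief[i] * P[i][j] for i in range(k)) for j in range(k)]
    # Update: reweight each state by the likelihood of the observed return.
    unnorm = [predicted[j] * gaussian_pdf(ret, mus[j], sigma) for j in range(k)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def posterior_average(belief, state_weights):
    """Average the per-state portfolios using the posterior state probabilities."""
    n_assets = len(state_weights[0])
    return [
        sum(belief[s] * state_weights[s][a] for s in range(len(belief)))
        for a in range(n_assets)
    ]

# Hypothetical inputs (assumptions, not taken from the paper).
P = [[0.95, 0.05], [0.10, 0.90]]          # transition probabilities: state 0 = boom, 1 = bust
mus = [0.01, -0.02]                       # mean return in each state
sigma = 0.02                              # return volatility, same in both states
state_weights = [[0.8, 0.2], [0.3, 0.7]]  # portfolio held if the state were known

belief = [0.5, 0.5]
for ret in [0.012, 0.015, -0.03, -0.025]:
    belief = filter_step(belief, P, mus, sigma, ret)

weights = posterior_average(belief, state_weights)
print("posterior state probabilities:", belief)
print("posterior-averaged portfolio:", weights)
```

After two negative returns the belief shifts toward the second state, and the averaged portfolio moves toward that state's allocation.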

### Bilodeau, Blair

Blair is a third-year PhD candidate in statistical sciences at the University of Toronto, supervised by Professor Daniel Roy. His research is supported by an NSERC Doctoral Canada Graduate Scholarship and the Vector Institute. His research focuses on combining techniques from statistics and computer science to obtain theoretical performance guarantees for decision making.

By Blair Bilodeau, Dylan J. Foster, and Daniel M. Roy

In Proceedings of the 37th International Conference on Machine Learning | 2020

Abstract: We consider the classical problem of sequential probability assignment under logarithmic loss while competing against an arbitrary, potentially nonparametric class of experts. We obtain tight bounds on the minimax regret via a new approach that exploits the self-concordance property of the logarithmic loss. We show that for any expert class with (sequential) metric entropy $O(\gamma^{-p})$ at scale $\gamma$, the minimax regret is $O(n^{\frac{p}{p+1}})$, and that this rate cannot be improved without additional assumptions on the expert class under consideration. As an application of our techniques, we resolve the minimax regret for nonparametric Lipschitz classes of experts.

Layman Summary: We study the general problem of probabilistic forecasting: the forecaster is tasked with specifying probabilities over the possible outcomes of an upcoming event (e.g., the probability of rain; or of cardiac failure; or that an adjacent driver will change lanes). Probabilistic forecasts are scored based on how much probability they assign to the actual observed outcome. We suppose that, in addition to covariate information (e.g., current weather; or patient medical history; or the car's sensor readings), the forecaster has access to a class of benchmark forecasters, which we refer to as experts. These experts may exist in the real world in the form of domain-specific opinions, or may simply be tools for the forecaster, such as a collection of statistical forecasting models. Irrespective of their form, the experts' predictions are available to the forecaster to use when producing their own forecast.

The goal of the forecaster is to perform nearly as well as the expert who ends up performing best once all the outcomes have been recorded. The goal of our research is to quantify how well the best forecaster can do at this task. One way to understand the best forecaster’s performance is to assume that the data will follow some statistical relationship, but this exposes the forecaster to potentially poor performance if reality diverges from these assumptions. In contrast, we characterize the limits of forecasting performance without making any assumptions on the data. Specifically, we show that a particular geometric notion of the “complexity” of a class of experts characterizes the performance limits for very general and large classes.

Our work partially resolves a decades-long line of inquiry by showing that a specific measure of complexity can precisely characterize probabilistic forecasting performance for general expert classes simultaneously. Additionally, an important implication of our results is that probabilistic forecasting is fundamentally different from the related problem of regression (i.e., forecasting the average event), since different complexity measures must be used to characterize the limits of performance for these two tasks. Finally, our results are stated abstractly in terms of the experts, and consequently can be broadly applied across domains where forecasting future events must account for multiple sources of uncertainty.
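To make the forecasting setup concrete, here is a minimal sketch of the classical baseline for a finite class of experts: a Bayesian mixture forecaster scored by logarithmic loss. This is not the paper's method, which handles much larger nonparametric classes. For N experts, the mixture's regret against the best expert is at most log N, which the sketch checks numerically. The three constant experts and the outcome sequence are hypothetical.

```python
import math

def log_loss(prob_of_one, outcome):
    """Negative log of the probability assigned to the observed binary outcome."""
    p = prob_of_one if outcome == 1 else 1.0 - prob_of_one
    return -math.log(p)

def run_mixture(expert_predictions, outcomes):
    """Forecast with a Bayesian mixture of experts and report its regret."""
    n_experts = len(expert_predictions[0])
    weights = [1.0 / n_experts] * n_experts  # uniform prior over experts
    forecaster_loss = 0.0
    expert_losses = [0.0] * n_experts
    for probs, y in zip(expert_predictions, outcomes):
        # The forecast is the weighted average of the experts' probabilities.
        forecast = sum(w * p for w, p in zip(weights, probs))
        forecaster_loss += log_loss(forecast, y)
        # Probability each expert assigned to what actually happened.
        liks = [p if y == 1 else 1.0 - p for p in probs]
        for i, lik in enumerate(liks):
            expert_losses[i] -= math.log(lik)
        # Bayes update: experts that predicted well gain weight.
        unnorm = [w * lik for w, lik in zip(weights, liks)]
        total = sum(unnorm)
        weights = [u / total for u in unnorm]
    regret = forecaster_loss - min(expert_losses)
    return forecaster_loss, expert_losses, regret

# Hypothetical data: three experts who always predict 0.2, 0.5 and 0.8.
outcomes = [1, 1, 0, 1, 1, 1, 0, 1]
expert_predictions = [[0.2, 0.5, 0.8] for _ in outcomes]

forecaster_loss, expert_losses, regret = run_mixture(expert_predictions, outcomes)
print("regret:", regret, "bound:", math.log(3))
```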

### Casgrain, Philippe

Philippe Casgrain is currently a postdoctoral researcher appointed jointly between ETH Zürich and Princeton University. Prior to this position, Philippe was a quantitative researcher in algorithmic execution at Citadel Asset Management in New York City. He completed his PhD in the Department of Statistical Sciences at the University of Toronto in 2018, under the supervision of Sebastian Jaimungal, where his thesis focused on the integration of Mean-Field Games and Machine Learning for Algorithmic Trading. His main areas of research include Algorithmic Trading, Stochastic Control, Optimization and their intersection.

By Philippe Casgrain and Sebastian Jaimungal

Mathematical Finance 30.3 | 2020

Abstract: Even when confronted with the same data, agents often disagree on a model of the real world. Here, we address the question of how interacting heterogeneous agents, who disagree on what model the real world follows, optimize their trading actions. The market has latent factors that drive prices, and agents account for the permanent impact they have on prices. This leads to a large stochastic game, where each agent's performance criteria are computed under a different probability measure. We analyse the mean-field game (MFG) limit of the stochastic game and show that the Nash equilibrium is given by the solution to a non-standard vector-valued forward-backward stochastic differential equation. Under some mild assumptions, we construct the solution in terms of expectations of the filtered states. Furthermore, we prove the MFG strategy forms an ϵ-Nash equilibrium for the finite player game. Lastly, we present a least-squares Monte Carlo based algorithm for computing the equilibria and show through simulations that increasing disagreement may increase price volatility and trading activity.

Layman Summary: This paper considers the problem of optimally trading an asset in a competitive market involving many agents with imperfect information who may have disagreeing views on how the market evolves over time. In order to be robust to the actions of other agents, we identify strategies which form a Nash equilibrium in this setting, where agents attempt to maximize their future total book value, all while remaining conscious of price impact, transaction costs and risk exposure. Since the finite-agent version of this problem proves to be intractable, we consider the infinite-agent or 'mean-field' limit of this game. We show that Nash equilibria can be fully characterized in this regime and that these solutions also form approximate Nash equilibria when the number of agents is finite. Lastly, we test the resulting strategies on a variety of market simulations and show that disagreement empirically increases price volatility.

### Chen, Bo

Bo Chen graduated in 2019 with a PhD from the U of T Department of Statistical Sciences where he was jointly supervised by Professor Radu V. Craiu and Professor Lei Sun. His research focuses on statistical genetics, and his doctoral thesis title was "Statistical Methods for X-inclusive Genome-wide Association Study". He is currently a postdoctoral fellow at the Princess Margaret Cancer Centre.

By Bo Chen, Radu Craiu, and Lei Sun

Biostatistics | 2020

Abstract: The X-chromosome is often excluded from the so-called “whole-genome” association studies due to the differences it exhibits between males and females. One particular analytical challenge is the unknown status of X-inactivation, where one of the two X-chromosome variants in females may be randomly selected to be silenced. In the absence of biological evidence in favor of one specific model, we consider a Bayesian model averaging framework that offers a principled way to account for the inherent model uncertainty, providing model averaging-based posterior density intervals and Bayes factors. We examine the inferential properties of the proposed methods via extensive simulation studies, and we apply the methods to a genetic association study of an intestinal disease occurring in about 20% of cystic fibrosis patients. Compared with the results previously reported assuming the presence of inactivation, we show that the proposed Bayesian methods provide more feature-rich quantities that are useful in practice.

Layman Summary: Genome-wide association studies have successfully identified many genes influencing heritable and complex traits (e.g., blood pressure, breast cancer) in the past decade. However, one major gap in the current ‘whole-genome’ association analysis is the routine omission of the X-chromosome, because the available statistical association methods, developed for analyzing genes from the autosomes, are not directly applicable to the X-chromosome.

In genetic association studies, the genotype of a gene must be coded numerically so that its association with the trait of interest can be modelled statistically. For genes on the autosomes, there is only one correct coding scheme. However, a phenomenon called X-chromosome inactivation affects the genotype coding of an X-chromosomal gene for a female. Whether the X-chromosome is inactivated or not is usually unknown, which leads to two possible statistical models, resulting in different association results.

In the presence of uncertainty about the correct model, we adopt a Bayesian approach that incorporates model uncertainty when building evidence for association. This leads to considering a “weighted average” model that combines the posterior distribution and Bayes factor estimate from each model, where the weight represents the support for each model by the data. We differentiate and adapt our procedure depending on whether the outcome of interest is continuous (e.g., blood pressure) or binary (e.g., breast cancer).
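The "weighted average" idea can be sketched with the standard Bayesian model averaging arithmetic for two candidate models. This is an illustration under assumptions, not the paper's implementation: the Bayes factors (each model against a common no-association null), the prior model probabilities, and the per-model posterior means below are all hypothetical numbers.

```python
def bma_weights(bayes_factors, prior_probs):
    """Posterior model weights and the model-averaged Bayes factor.

    Each Bayes factor compares one candidate model against the same null,
    so the averaged Bayes factor is the prior-weighted sum of the
    individual Bayes factors.
    """
    unnorm = [p * bf for p, bf in zip(prior_probs, bayes_factors)]
    averaged_bf = sum(unnorm)
    weights = [u / averaged_bf for u in unnorm]
    return weights, averaged_bf

def bma_mean(weights, posterior_means):
    """Model-averaged posterior mean of the genetic effect."""
    return sum(w * m for w, m in zip(weights, posterior_means))

# Hypothetical inputs: model 0 assumes inactivation, model 1 assumes none.
bayes_factors = [12.0, 3.0]     # evidence for association under each model
prior_probs = [0.5, 0.5]        # no prior preference between the two models
posterior_means = [0.30, 0.18]  # effect estimate under each model

weights, averaged_bf = bma_weights(bayes_factors, prior_probs)
effect = bma_mean(weights, posterior_means)
print("model weights:", weights)
print("averaged Bayes factor:", averaged_bf)
print("averaged effect:", effect)
```

With these inputs the data support the first model four times as strongly as the second, so it receives weight 0.8 in the averaged quantities.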

In order to evaluate the performance of our proposed model averaging method, we use numerical experiments involving genes with different X-chromosome inactivation (XCI) status and different levels of association strength.

Finally, we apply our method to an X-chromosome association study of meconium ileus, a binary intestinal disease occurring in about 20% of the individuals with Cystic Fibrosis. The method previously used in that study assumed that all genes on the X-chromosome are inactivated. We show that the Bayesian Model Averaging method i) confirms the findings from the previous study, ii) pinpoints additional genes that merit follow-up studies, and iii) provides more feature-rich quantities that help investigate and interpret genetic association in practice.

### Chen, Sigeng

Sigeng is a second-year PhD student in statistics, supervised by Jeffrey Rosenthal. His current research is on Markov chain Monte Carlo (MCMC) methods.

By J.S. Rosenthal, A. Dote, K. Dabiri, H. Tamura, S. Chen, and A. Sheikholeslami

Computational Statistics | 2021

Abstract: We consider versions of the Metropolis algorithm which avoid the inefficiency of rejections. We first illustrate that a natural Uniform Selection algorithm might not converge to the correct distribution. We then analyze the use of Markov jump chains which avoid successive repetitions of the same state. After exploring the properties of jump chains, we show how they can exploit parallelism in computer hardware to produce more efficient samples. We apply our results to the Metropolis algorithm, to Parallel Tempering, to a Bayesian model, to a two-dimensional ferromagnetic 4×4 Ising model, and to a pseudo-marginal MCMC algorithm.

Layman Summary: The Metropolis algorithm is a method of designing a Markov chain which converges to a given target density π on a state space S. We consider versions of the Metropolis algorithm which avoid the inefficiency of rejections. We first illustrate that a natural Uniform Selection algorithm might not converge to the correct distribution. We then analyze the use of Markov jump chains which avoid successive repetitions of the same state. After exploring the properties of jump chains, we show how they can exploit parallelism in computer hardware to produce more efficient samples. We apply our results to the Metropolis algorithm, to Parallel Tempering, to a Bayesian model, to a two-dimensional ferromagnetic 4×4 Ising model, and to a pseudo-marginal MCMC algorithm.
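A minimal sketch of a rejection-free jump chain on a small discrete state space may help fix ideas. It is an illustration under assumptions rather than the paper's implementation: the five-state ring, the nearest-neighbour proposal, the target `pi` and the function name are all hypothetical. At each state the chain computes the Metropolis probability of moving to each neighbour, always moves (never rejects), and weights the current state by the expected number of iterations the ordinary Metropolis chain would have spent there, which is one over the total probability of leaving.

```python
import random

def jump_chain_estimate(pi, h, n_jumps, seed=0):
    """Estimate the mean of h under pi with a rejection-free jump chain.

    States 0..k-1 are arranged on a ring; the underlying Metropolis chain
    proposes each of the two neighbours with probability 1/2.
    """
    rng = random.Random(seed)
    k = len(pi)
    x = 0
    weighted_sum = 0.0
    total_weight = 0.0
    for _ in range(n_jumps):
        neighbours = [(x - 1) % k, (x + 1) % k]
        # Metropolis probability of actually moving to each neighbour.
        move = [0.5 * min(1.0, pi[y] / pi[x]) for y in neighbours]
        escape = sum(move)  # probability of leaving x in one iteration
        # Expected holding time at x in the ordinary Metropolis chain.
        weight = 1.0 / escape
        weighted_sum += weight * h(x)
        total_weight += weight
        # Always move: pick a neighbour proportionally to its move probability.
        x = neighbours[0] if rng.random() * escape < move[0] else neighbours[1]
    return weighted_sum / total_weight

# Hypothetical target on five states; its exact mean of h(x) = x is 2.0.
pi = [0.1, 0.2, 0.4, 0.2, 0.1]
estimate = jump_chain_estimate(pi, lambda x: float(x), n_jumps=100_000)
print("estimated mean:", estimate)
```

Without the weights, the visited states would follow a distribution proportional to the target times the escape probability rather than the target itself, which is why the reweighting step matters.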

### Jia, Tianyi

Tianyi is a PhD student supervised by Professor Sebastian Jaimungal. His research studies algorithmic trading strategies for foreign exchange and equities under model misspecification. Alongside his PhD program, he is currently a senior manager at Enterprise Model Risk Management, GRM, with the Royal Bank of Canada.

By Álvaro Cartea, Sebastian Jaimungal, and Tianyi Jia

SIAM Journal on Financial Mathematics | 2020 (accepted)

Abstract: We develop the optimal trading strategy for a foreign exchange (FX) broker who must liquidate a large position in an illiquid currency pair. To maximize revenues, the broker considers trading in a currency triplet which consists of the illiquid pair and two other liquid currency pairs. The liquid pairs in the triplet are chosen so that one of the pairs is redundant. The broker is risk-neutral and accounts for model ambiguity in the FX rates to make her strategy robust to model misspecification. When the broker is ambiguity neutral (averse) the trading strategy in each pair is independent (dependent) of the inventory in the other two pairs in the triplet. We employ simulations to illustrate how the robust strategies perform. For a range of ambiguity aversion parameters, we find the mean Profit and Loss (P&L) of the strategy increases and the standard deviation of the P&L decreases as ambiguity aversion increases.

Layman Summary: In the foreign exchange (FX) market, there are many currency pairs. Some of them are traded in large volumes, like the ones involving the U.S. dollar (USD). Others are illiquid, such as the Australian and New Zealand dollar (AUD-NZD) pair. An FX broker needs to trade with her clients to meet their FX needs, and hence she needs to manage her FX exposures. Indeed, because client trading activity in illiquid pairs is low, our FX broker can easily accumulate non-zero positions in them. In addition, unlike the equity market, given three currencies, there are so-called "triangle relations" among their exchange rates. For example, when someone wants to convert Canadian dollars (CAD) to Euros (EUR), he or she can directly convert CAD to EUR, or convert CAD to USD first and then USD to EUR. Ideally, ignoring transaction fees, the cost in CAD of obtaining 1 unit of EUR should be the same for both approaches. In our work, we develop trading strategies utilizing the triangle relations to manage our FX broker's inventories in the illiquid currency pairs. To the best of our knowledge, our work is the first to show how FX brokers manage large positions in currency pairs.
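The triangle relation in the CAD, USD and EUR example can be checked with a few lines of arithmetic. The exchange rates and function names below are hypothetical and chosen only to make the numbers easy to follow; fees and bid-ask spreads are ignored, as in the text.

```python
def implied_cross_rate(usd_per_cad, eur_per_usd):
    """EUR obtained per 1 CAD when converting through USD."""
    return usd_per_cad * eur_per_usd

def cad_cost_of_one_eur(eur_per_cad):
    """CAD needed to obtain 1 EUR at a given EUR-per-CAD rate."""
    return 1.0 / eur_per_cad

# Hypothetical quotes (assumptions for illustration only).
usd_per_cad = 0.80
eur_per_usd = 0.85
eur_per_cad_direct = 0.68  # direct quote consistent with the triangle relation

eur_per_cad_via_usd = implied_cross_rate(usd_per_cad, eur_per_usd)
cost_direct = cad_cost_of_one_eur(eur_per_cad_direct)
cost_via_usd = cad_cost_of_one_eur(eur_per_cad_via_usd)

print("CAD cost of 1 EUR, direct: ", round(cost_direct, 6))
print("CAD cost of 1 EUR, via USD:", round(cost_via_usd, 6))
```

When the direct quote differs from the implied cross rate, the two costs diverge, and that link between the three pairs is what lets a broker work off a position in the illiquid pair through the two liquid ones.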

### Lalancette, Michaël

Michaël has been a PhD candidate in the Department of Statistical Sciences, University of Toronto, since September 2017, where he is supervised by Professor Stanislav Volgushev. Prior to joining U of T, he completed a BSc in Mathematics and an MSc in Statistics in the Department of Mathematics and Statistics, Université de Montréal.

By Michaël Lalancette, Sebastian Engelke, and Stanislav Volgushev

Annals of Statistics | 2021 (accepted)

Abstract: Multivariate extreme value theory is concerned with modelling the joint tail behavior of several random variables. Existing work mostly focuses on asymptotic dependence, where the probability of observing a large value in one of the variables is of the same order as observing a large value in all variables simultaneously. However, there is growing evidence that asymptotic independence is equally important in real-world applications. Available statistical methodology in the latter setting is scarce and not well understood theoretically. We revisit non-parametric estimation and introduce rank-based M-estimators for parametric models that simultaneously work under asymptotic dependence and asymptotic independence, without requiring prior knowledge on which of the two regimes applies. Asymptotic normality of the proposed estimators is established under weak regularity conditions. We further show how bivariate estimators can be leveraged to obtain parametric estimators in spatial tail models, and again provide a thorough theoretical justification for our approach.

Layman Summary: Extreme value theory is a branch of statistics that seeks to extrapolate outside the range of available data in order to model extreme events and forecast their severity. For instance, in this paper we use a data set containing daily rainfall measurements at 92 different locations across southern Australia over a period of 50 years. One may wonder how severe the most extreme rainfall in the next 100 years will be at each location, but also how dependent those extremes are on each other, and whether they are likely to occur simultaneously.

To describe our main contributions, first consider a bivariate context where only two variables of interest (such as rainfall at two locations) are measured. Each data point now consists of two measurements. In classical extreme value theory, an observation is considered extreme if at least one of the two variables exceeds a high threshold. Only those observations are then used to model the extremal behavior. However, this approach can fail in many environmental applications, such as rainfall at two selected locations in the aforementioned example. The trouble is that some environmental data sets tend to exhibit a property termed tail independence: for most observations where one variable is extreme, the other is not. This poses difficulties for accurate modelling and can lead to a considerable underestimation of the probability that the two variables are extreme simultaneously. To remedy this issue, the usual solution when the variables of interest are believed to be tail independent relies on restricting the sample to observations where both variables are simultaneously large. In summary, available methods for modelling data with extremal dependence and independence differ considerably, and there exists no approach that works in both scenarios.
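The difference between tail dependence and tail independence can be seen with a simple rank-based diagnostic: among observations where one variable exceeds a high quantile, what fraction also has the other variable above that quantile? The sketch below is only this elementary diagnostic, not the M-estimator proposed in the paper. The simulated data, the 95% level and the function names are assumptions for illustration.

```python
import random

def ranks_to_uniform(values):
    """Replace each value by its rank divided by n + 1."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u

def empirical_chi(x, y, q):
    """Fraction of observations with x extreme for which y is also extreme."""
    u, v = ranks_to_uniform(x), ranks_to_uniform(y)
    x_extreme = sum(1 for a in u if a > q)
    both_extreme = sum(1 for a, b in zip(u, v) if a > q and b > q)
    return both_extreme / x_extreme

rng = random.Random(1)
n = 5000

# Independent pair: extremes of one variable say nothing about the other.
x_ind = [rng.gauss(0, 1) for _ in range(n)]
y_ind = [rng.gauss(0, 1) for _ in range(n)]

# Dependent pair: both variables share one heavy-tailed component.
shared = [rng.paretovariate(2.0) for _ in range(n)]
x_dep = [s + rng.gauss(0, 0.1) for s in shared]
y_dep = [s + rng.gauss(0, 0.1) for s in shared]

chi_indep = empirical_chi(x_ind, y_ind, q=0.95)
chi_dep = empirical_chi(x_dep, y_dep, q=0.95)
print("independent pair:", chi_indep)
print("shared heavy tail:", chi_dep)
```

For the independent pair the fraction is close to 0.05, the chance level at this quantile, whereas the pair with a shared heavy-tailed component gives a value near 1.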

In this paper, we offer new insights on bivariate extreme value theory under tail dependence and independence. We introduce a way to model bivariate distributions based on the simultaneous behavior of their extremes without prior knowledge of whether tail dependence or independence is appropriate in a particular application. The main practical advantage of our approach over existing ones is that it is agnostic to the presence of tail independence, making it applicable in a wide range of settings.

We further demonstrate that the results obtained in the bivariate setting are also useful in the spatial context where many locations are considered simultaneously. Here our method can be used to fit parametric models for spatial processes (in the earlier example, the spatial distribution of extreme rainfall over the whole region of interest). The strategy consists of considering a sufficiently large number of pairs of locations and treating each one as a bivariate data set. Our method may then be used to learn the extremal behavior of each pair, and we demonstrate how this knowledge can be leveraged to infer the extremal behavior of the entire spatial process.

Along the way, we provide new theoretical results about certain mathematical objects called empirical tail processes that form a fundamental building block for the analysis of most estimation procedures dealing with multivariate extremes.

### Levi, Evgeny

Dr. Evgeny Levi is a Model Risk Specialist at Bank of Montreal. He studied Mathematics and Statistics at the University of Toronto (BS 2009, MS 2013) and received a PhD from the Department of Statistical Sciences at the University of Toronto in 2019. His supervisor was Professor Radu Craiu. His main research interests are in computational methods in statistics, especially Markov chain Monte Carlo (MCMC) and Approximate Bayesian Computation (ABC) algorithms, dependence models, and model selection procedures for copula models.

During his PhD studies, he published four articles in the journals Bayesian Analysis, Computational Statistics & Data Analysis, and Journal of Computational and Graphical Statistics. He was a recipient of the Andrews Academic Achievement Award, the Teaching Assistant Award, the Queen Elizabeth II Graduate Scholarship and the Ontario Graduate Scholarship.

By Evgeny Levi and Radu Craiu

Bayesian Analysis | 2021 (accepted)

Abstract: With larger data at their disposal, scientists are emboldened to tackle complex questions that require sophisticated statistical models. It is not unusual for the latter to have likelihood functions that elude analytical formulations. Even under such adversity, when one can simulate from the sampling distribution, Bayesian analysis can be conducted using approximate methods such as Approximate Bayesian Computation (ABC) or Bayesian Synthetic Likelihood (BSL). A significant drawback of these methods is that the number of required simulations can be prohibitively large, thus severely limiting their scope. In this paper we design perturbed MCMC samplers that can be used within the ABC and BSL paradigms to significantly accelerate computation while maintaining control on computational efficiency. The proposed strategy relies on recycling samples from the chain’s past. The algorithmic design is supported by a theoretical analysis while practical performance is examined via a series of simulation examples and data analyses.

Layman Summary: Bayesian computation currently encounters two types of challenges. One concerns the issue of computationally expensive likelihoods that usually arise from large data. The second, which is relevant for this paper, occurs because modern statistical models are often complex enough to yield intractable likelihoods. Classical Bayesian inference relies entirely on the posterior distribution which, in turn, is explored using Markov chain Monte Carlo (MCMC) simulation methods. The MCMC sampling algorithms are iterative and require at each iteration the calculation of the likelihood. Clearly, when the likelihood is not available analytically, the usual MCMC samplers required to study the posterior cannot be implemented. A remedy is proposed by the so-called approximate Bayesian computation (ABC) algorithms which rely on the ability to generate, given values for the model parameters, data from the model. Such an assumption is satisfied, for instance, by model emulation experiments, and is met by models increasingly used in scientific domains such as Astronomy and Astrophysics, Hydrology and Genetics, to name a few. The algorithms are still iterative and each iteration requires simulating data like the ones observed, so their success depends on the ability to do this hundreds or thousands of times. This paper considers new strategies to mitigate the cost incurred when the data generation is computationally expensive. The approach proposed relies on recycling samples, thus minimizing, at each iteration, the need to produce multiple new data replicates. We consider implementations in the case of ABC and another approximate Bayesian method, called Bayesian Synthetic Likelihood (BSL). By using past samples, we create another approximation of the algorithm that one would have used in the ABC/BSL sampling. This compels us to demonstrate theoretically that the approximation introduced does not prove too costly in terms of statistical efficiency for the resulting estimators. The numerical experiments show that this is indeed the case, while the computational costs are significantly smaller.
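To show why simulation cost matters, here is a minimal sketch of the plain ABC rejection sampler, the simplest member of the ABC family. It is not the perturbed MCMC sampler with recycling that the paper proposes. The toy model (normal data with unknown mean and unit variance), the uniform prior, the sample-mean summary and the tolerance `eps` are all assumptions for illustration. Every proposed parameter value requires a fresh simulated data set, and most are discarded.

```python
import random
import statistics

def simulate(theta, n, rng):
    """Generate a data set from the toy model: normal with mean theta, sd 1."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def abc_rejection(observed_mean, n_obs, n_draws, eps, rng):
    """Keep prior draws whose simulated summary lands close to the observed one."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)       # draw from the prior
        fake = simulate(theta, n_obs, rng)   # one full simulation per draw
        if abs(statistics.fmean(fake) - observed_mean) < eps:
            accepted.append(theta)
    return accepted

rng = random.Random(42)
observed = simulate(1.5, 50, rng)            # stand-in for real data
obs_mean = statistics.fmean(observed)

n_draws = 5000
samples = abc_rejection(obs_mean, n_obs=50, n_draws=n_draws, eps=0.1, rng=rng)
print("simulations run:", n_draws)
print("draws accepted: ", len(samples))
print("approximate posterior mean:", statistics.fmean(samples))
```

In this toy setting only a small fraction of the simulations yields an accepted draw. When each simulation is expensive, that waste is the cost that sample-recycling strategies aim to reduce.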

By Evgeny Levi and Radu V. Craiu

In: La Rocca M., Liseo B., Salmaso L. (eds) Nonparametric Statistics. ISNPS 2018. Springer Proceedings in Mathematics & Statistics, vol 339. Springer, Cham.

Abstract: The paper considers the problem of establishing data support for the simplifying assumption (SA) in a bivariate conditional copula model. It is known that SA greatly simplifies the inference for a conditional copula model, but standard tools and methods for testing SA in a Bayesian setting tend to not provide reliable results. After splitting the observed data into training and test sets, the method proposed will use a flexible Bayesian model fit to the training data to define tests based on randomization and standard asymptotic theory. Its performance is studied using simulated data. The paper’s supplementary material also discusses theoretical justification for the method and implementations in alternative models of interest, e.g. Gaussian, Logistic and Quantile regressions.

Layman Summary: Let us imagine a situation in which two dependent response variables are measured simultaneously along with a number of covariates. Traditional regression models will specify functional connections between each response variable and the covariates. In addition to marginal effects, one may be interested in understanding the dependence structure between the variables and how it changes with covariates.

Copulas are distribution functions that bind, in a mathematically coherent way, continuous marginal distributions to form a multivariate distribution. Any copula will correspond to a specific dependence structure, independently from the marginal models, thus yielding rich classes of multivariate distributions that can capture a large variety of dependence patterns.

In a regression setting, the effect of covariates on the dependence between response variables is captured by a conditional copula model which allows the copula to evolve dynamically as covariates change. Estimation for conditional copula models can rarely be done in closed form, thus requiring numerical procedures for optimization or integration. This can increase the computational cost, especially when there is a cascade of estimations to be done, as in the case of vine-type factorizations. However, if each copula is constant, a condition known as the simplifying assumption (SA), then estimation is a lot simpler to perform since the parameter space of the model is significantly reduced. Various diagnostics have been proposed to identify whether the SA holds or not, but many suffer from lack of power to identify SA when it holds. In this paper we propose a permutation based diagnostic procedure that outperforms the competing methods in terms of identifying the correct structure.

### Shrivats, Arvind

Arvind completed his PhD in April 2021, under the supervision of Professor Sebastian Jaimungal. During his time in the Department of Statistical Sciences, Arvind was part of the Mathematical Finance Group, with his work supported by the Ontario Graduate Scholarship. His research uses tools from stochastic control and mean-field games to study emissions markets (such as cap-and-trade or renewable energy certificate markets).

By Arvind Shrivats and Sebastian Jaimungal

Applied Mathematical Finance | 2020

Abstract: SREC markets are a relatively novel market-based system to incentivize the production of energy from solar means. A regulator imposes a floor on the amount of energy each regulated firm must generate from solar power in a given period and provides them with certificates for each generated MWh. Firms offset these certificates against the floor and pay a penalty for any lacking certificates. Certificates are tradable assets, allowing firms to purchase/sell them freely. In this work, we formulate a stochastic control problem for generating and trading in SREC markets from a regulated firm’s perspective. We account for generation and trading costs, the impact both have on SREC prices, provide a characterization of the optimal strategy and develop a numerical algorithm to solve this control problem. Through numerical experiments, we explore how a firm who acts optimally behaves under various conditions. We find that an optimal firm’s generation and trading behaviour can be separated into various regimes, based on the marginal benefit of obtaining an additional SREC, and validate our theoretical characterization of the optimal strategy. We also conduct parameter sensitivity experiments.

Layman summary: In this work, we consider optimal behaviour within a sub-class of environmental markets known as Solar Renewable Energy Certificate (SREC) markets. SREC markets can be thought of as an inverse to the more common carbon cap-and-trade markets. They are designed to incentivize the production of energy from solar means. A regulator imposes a floor on the amount of energy each regulated firm must generate from solar power in a given period and provides them with certificates for each generated MWh. Firms offset these certificates against the floor and pay a penalty for any lacking certificates. Certificates are tradable assets, allowing firms to purchase/sell them freely.
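The compliance rule described above can be sketched in a few lines of Python. This is a single-period illustration only: the function name, argument names and numbers are hypothetical, and the sketch ignores the generation costs, trading costs and price impact that the paper accounts for.

```python
def compliance_penalty(requirement, generated, purchased, sold, penalty_per_srec):
    """Penalty owed by a regulated firm at the end of one compliance period.

    The firm holds generated + purchased - sold certificates and pays
    penalty_per_srec for every certificate by which it falls short of the
    regulator's requirement (the floor). Holding a surplus costs nothing here.
    """
    held = generated + purchased - sold
    shortfall = max(requirement - held, 0.0)
    return penalty_per_srec * shortfall


if __name__ == "__main__":
    # Floor of 100 SRECs; the firm generated 60 and bought 30, so it is 10 short.
    print(compliance_penalty(100, generated=60, purchased=30, sold=0,
                             penalty_per_srec=50))
```

The kink at zero shortfall is what makes the marginal benefit of one more certificate depend on how close the firm is to the floor.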

Firms within these markets face the obvious question of how to navigate the market optimally. To that end, we propose a stochastic model of the market environment that firms face, and use techniques from stochastic control to find the optimal strategy. In doing so, we account for numerous costs and real-life complexities that these firms face, in particular generation and trading costs, as well as the possible price impact that firms have within this market. This work was among the first to provide a thorough and rigorous analysis of optimal behaviour within SREC markets (and indeed, carbon cap-and-trade markets) at the firm level, while accounting for these complexities. Prior work often focused on social optimality from the perspective of a fictitious omniscient social planner who can control everyone's behaviour. This work is instead more realistic, as it focuses on how individual firms would behave when faced with an exogenous market, which is closer to what actually happens.

After characterizing the optimal strategy theoretically, we develop a numerical algorithm to solve for it, and conduct many experiments to explore the characteristics and market implications of optimal firm behaviour. We also conduct parameter sensitivity experiments.

### Slater, Justin

Justin is a second-year PhD candidate in statistical sciences at the University of Toronto, supervised by Patrick Brown and Jeff Rosenthal. His work is currently focused on Bayesian applications to modelling challenges in COVID-19. He is currently working on spatio-temporal models for COVID-19 that utilize population movement data, and Bayesian infection-fatality rate estimation in Canada. Read more.

By Justin J. Slater, Patrick E. Brown, and Jeffrey S. Rosenthal

Stat, Volume 10, Issue 1 | 2021

Abstract: As of October 2020, the death toll from the COVID‐19 pandemic has risen over 1.1 million deaths worldwide. Reliable estimates of mortality due to COVID‐19 are important to guide intervention strategies such as lockdowns and social distancing measures. In this paper, we develop a data‐driven model that accurately and consistently estimates COVID‐19 mortality at the regional level early in the epidemic, using only daily mortality counts as the input. We use a Bayesian hierarchical skew‐normal model with day‐of‐the‐week parameters to provide accurate projections of COVID‐19 mortality. We validate our projections by comparing our model to the projections made by the Institute for Health Metrics and Evaluation and highlight the importance of hierarchicalization and day‐of‐the‐week effect estimation.

Layman summary: As of October 2020, the death toll from the COVID‐19 pandemic has risen over 1.1 million deaths worldwide. Reliable estimates of mortality due to COVID‐19 are important to guide intervention strategies such as lockdowns and social distancing measures. We focused on mortality as opposed to case counts because case counts were mostly a function of how much testing was being done, while COVID-19-related death counts were more reliable because it was unlikely that someone would die of COVID-19 without being diagnosed.

At the time of publication, one of the more popular websites used to track the pandemic was that of the Institute for Health Metrics and Evaluation (IHME). The model they were employing at the time was simply a Normal distribution, which doesn’t really fit the shape of epidemic curves. Typically, there is exponential growth at first, followed by a slow decline in death counts as people start to distance more, lockdowns are implemented, etc. The models that we used in this paper (so-called skew-normal models) capture this sharp increase and slow decline, and have further statistical features that allow for more accurate forecasts at the small-area level. In particular, our model performs well when a region’s death counts are low because it implicitly borrows information from nearby regions. Additionally, our model accounts for day-of-the-week effects, allowing for more accurate day-to-day forecasts. The projections made by the IHME would be more pessimistic or optimistic depending on what day of the week you checked them.

In short, we compared our model to the IHME’s in the first wave of the epidemic and showed that our model forecast first-wave COVID-19 mortality substantially better than the IHME’s. Nearing the end of the project, many countries were entering their second wave, and both our model and the IHME’s started performing worse. The IHME have since changed their methodology. Our model has since been extended to account for multiple waves. Modelling the COVID-19 pandemic is a constantly evolving challenge and models will need to continue to adapt in order to accurately capture the rise and fall of this deadly disease.
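The "sharp increase, slow decline" shape of a skew-normal curve can be illustrated with a short Python sketch. It shows the curve shape only: the hierarchical Bayesian structure and day-of-the-week parameters of the paper are not reproduced, and the function names and all parameter values are made up for illustration.

```python
import math


def skew_normal_pdf(x, location, scale, shape):
    """Skew-normal density: (2 / scale) * phi(z) * Phi(shape * z), z = (x - location) / scale."""
    z = (x - location) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 / scale * phi * big_phi


def expected_daily_deaths(day, total_deaths, location, scale, shape):
    """Epidemic curve: total deaths in the wave spread over days by a skew-normal shape."""
    return total_deaths * skew_normal_pdf(day, location, scale, shape)


if __name__ == "__main__":
    # A positive shape parameter gives a fast rise followed by a slower decline.
    curve = [expected_daily_deaths(d, 1000.0, 20.0, 25.0, 5.0) for d in range(121)]
    peak = max(range(121), key=lambda d: curve[d])
    print(peak, curve[peak - 15], curve[peak + 15])
```

Setting `shape` to zero recovers the symmetric Normal curve, which rises and falls at the same speed.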

### Stringer, Alex

Alex is a PhD candidate in Statistics at the University of Toronto, supervised by Professors Patrick Brown and Jamie Stafford. He develops novel methods for fitting complex models to large datasets using Bayesian inference. In his doctoral work, he has introduced a novel class of models, Extended Latent Gaussian Models (ELGMs), and demonstrated their application in spatial epidemiology, astrophysics, and other areas. His work is funded by an NSERC Postgraduate Fellowship - Doctoral, and he has been awarded the Faculty of Arts and Science's Doctoral Excellence Scholarship. Read more.

By Alex Stringer, Patrick Brown, and Jamie Stafford

Biometrics | 2020

Abstract: A case‐crossover analysis is used as a simple but powerful tool for estimating the effect of short‐term environmental factors such as extreme temperatures or poor air quality on mortality. The environment on the day of each death is compared to the one or more “control days” in previous weeks, and higher levels of exposure on death days than control days provide evidence of an effect. Current state‐of‐the‐art methodology and software (integrated nested Laplace approximation [INLA]) cannot be used to fit the most flexible case‐crossover models to large datasets, because the likelihood for case‐crossover models cannot be expressed in a manner compatible with this methodology. In this paper, we develop a flexible and scalable modeling framework for case‐crossover models with linear and semiparametric effects which retains the flexibility and computational advantages of INLA. We apply our method to quantify nonlinear associations between mortality and extreme temperatures in India. An R package implementing our methods is publicly available.

Layman summary: We developed methods for estimating associations between mortality/morbidity (death/illness) risk and environmental factors like temperature or air pollution. Doing so involves comparing exposure levels for each subject on the date of death/illness and several previous days, and determining risk as a function of exposure from such data requires advanced statistical techniques. Our method accommodates very general types of exposure/risk associations and scales to large datasets, which are both of substantial practical concern in problems of this type. We applied the method to study the association between temperature and mortality in India using data from the Indian Million Deaths Study, a large and high-quality source of individual mortality data.
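As a rough illustration of the comparison between the death day and control days, the sketch below computes the standard conditional-logistic likelihood contribution for one subject. It assumes the simplest possible case, a linear effect `beta` of a single exposure; the paper's framework also handles semiparametric effects and uses a different fitting approach, neither of which is shown. Function names and data values are hypothetical.

```python
import math


def case_day_probability(beta, exposure_case_day, exposures_control_days):
    """Probability that the event fell on the case day rather than on a control day,
    given that it fell on one of them, under a log-linear risk exp(beta * exposure)."""
    exposures = [exposure_case_day] + list(exposures_control_days)
    # Subtract the largest linear predictor before exponentiating, for stability.
    top = max(beta * x for x in exposures)
    weights = [math.exp(beta * x - top) for x in exposures]
    return weights[0] / sum(weights)


def log_likelihood(beta, subjects):
    """Sum of log contributions over subjects, each given as (case exposure, control exposures)."""
    return sum(math.log(case_day_probability(beta, case, controls))
               for case, controls in subjects)


if __name__ == "__main__":
    # Temperatures on the day of death and on three earlier control days.
    subjects = [(40.0, [30.0, 31.0, 29.0]), (38.0, [33.0, 35.0, 30.0])]
    print(log_likelihood(0.0, subjects), log_likelihood(0.1, subjects))
```

With `beta = 0` every day in a subject's set is equally likely; a positive `beta` is favoured when exposure tends to be higher on death days than on control days.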

### Tang, Yanbo

Yanbo Tang is currently a fourth-year PhD student at the Department of Statistical Sciences under the joint supervision of Nancy Reid and Daniel Roy. His primary research interest lies in asymptotic theory and, more specifically, in the performance of inferential and approximation procedures in high-dimensional problems. Read more.

By Yanbo Tang and Nancy Reid

Journal of the Royal Statistical Society: Series B | 2020

Abstract: We examine a higher order approximation to the significance function with increasing numbers of nuisance parameters, based on the normal approximation to an adjusted log‐likelihood root. We show that the rate of the correction for nuisance parameters is larger than the correction for non‐normality, when the parameter dimension p is O(n^α) for α < 1/2. We specialize the results to linear exponential families and location–scale families and illustrate these with simulations.

Layman summary: Although there are many predictive procedures in high dimensions whose performances are well characterized, such as LASSO and its many variants, the problem of inference in high dimensions is not as well studied. In this work we examine the use of the modified likelihood root for inference on a scalar parameter of interest when the number of nuisance parameters increases with the number of observations. The modified likelihood root can be expressed as the likelihood root statistic (a statistic obtained by taking the signed square root of the likelihood ratio statistic) with an additional adjustment factor. The adjustment factor can be further broken down into an adjustment for the non-normality of the statistic and an adjustment for the influence of the nuisance parameters. We show that for general models, the adjustment for the nuisance parameters is larger than the non-normality adjustment. This suggests that the main issue with the use of the unadjusted likelihood root in high dimensions is not a failure to converge to a normal limit, but rather the possible presence of a location or scale bias. We also present specialized results for the linear exponential and location–scale families. Simulation studies are performed, demonstrating that the normal approximation to the modified likelihood root is superior to the unmodified version in the logistic and Weibull regression models.
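For readers unfamiliar with the likelihood root, the following Python sketch computes the unadjusted statistic in a deliberately simple setting: an exponential model with a single rate parameter and no nuisance parameters. The model choice, function names and data are assumptions made for illustration; the adjustment factor that turns this into the modified likelihood root is not shown.

```python
import math


def exp_log_likelihood(rate, data):
    """Log-likelihood of an i.i.d. exponential sample with the given rate."""
    return len(data) * math.log(rate) - rate * sum(data)


def likelihood_root(rate, data):
    """Signed square root of the likelihood ratio statistic for H0: rate = given value."""
    mle = len(data) / sum(data)
    lr_stat = 2.0 * (exp_log_likelihood(mle, data) - exp_log_likelihood(rate, data))
    sign = 1.0 if mle >= rate else -1.0
    # Guard against tiny negative values caused by floating-point rounding.
    return sign * math.sqrt(max(lr_stat, 0.0))


if __name__ == "__main__":
    data = [0.5, 1.5, 1.0, 2.0, 1.0]
    for rate in (0.3, 5.0 / 6.0, 2.0):
        print(rate, likelihood_root(rate, data))
```

The statistic is zero at the maximum likelihood estimate, positive for hypothesised rates below it and negative above it; in regular low-dimensional problems it is compared with a standard normal distribution.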

### Tseung, Chau Lung Ngan Spark

Chau Lung Ngan Spark Tseung is a third-year PhD student in the Department of Statistical Sciences, advised by Professors Andrei Badescu and Sheldon Lin. His research focus is in actuarial science. His academic goal is to apply novel machine learning and data science methods to solving actuarial problems such as pricing and claims reserving. Read more.

By Spark C. Tseung, Andrei L. Badescu, Tsz Chai Fung, and X. Sheldon Lin

Annals of Actuarial Science | 2021

Abstract: This paper introduces a new julia package, LRMoE, a statistical software tailor-made for actuarial applications, which allows actuarial researchers and practitioners to model and analyse insurance loss frequencies and severities using the Logit-weighted Reduced Mixture-of-Experts (LRMoE) model. LRMoE offers several new distinctive features which are motivated by various actuarial applications and mostly cannot be achieved using existing packages for mixture models. Key features include a wider coverage on frequency and severity distributions and their zero inflation, the flexibility to vary classes of distributions across components, parameter estimation under data censoring and truncation and a collection of insurance ratemaking and reserving functions. The package also provides several model evaluation and visualisation functions to help users easily analyse the performance of the fitted model and interpret the model in insurance contexts.

Layman summary: We build a statistical software package that allows actuaries to better model insurance losses. The package implements a theoretically flexible modelling framework, which is shown to outperform some classical tools currently used by practitioners. The package is written in Julia, a relatively new programming language which is considerably faster than traditional languages designed for statistical analysis, facilitating the analysis of large datasets commonly encountered in practice.

### Zhang, Lin

Lin is a PhD candidate in the Department of Statistical Sciences at the University of Toronto, working with Professor Lei Sun. Her thesis focuses on developing statistical models for robust association testing with applications in genetics. She is also a trainee of the CANSSI-Ontario STAGE training program at the University of Toronto. Read more.

By Lin Zhang and Lei Sun

Biometrics | 2021