Publications: Statistical Theory & Methodology

A generalized Levene's scale test for variance heterogeneity in the presence of sample correlation and group uncertainty

by David Soave and Lei Sun

Biometrics | 2017 | 73(3):960-971

Short Summary: Why were we interested in generalizing Levene's test? It can be used to indirectly detect gene-environment (G×E) interactions when the environmental variable E is missing!


A new look at F-tests

by McCormack, A., Reid, N., Sartori, N. and Theivendran, S.-A.

Short Summary: We show that the directional tests recently developed by Fraser, Reid, Sartori and Davison can be explicitly computed in a number of classical models, including normal theory linear regression, where the test reduces to the usual F-test.


Adaptive Huber regression

by Qiang Sun, Wen-Xin Zhou, and Jianqing Fan

Journal of the American Statistical Association | 2018

Short Summary: We proposed the concept of tail-robustness, which is evidenced by better finite-sample performance than nonrobust methods in the presence of heavy-tailed data. To achieve this form of robustness, we proposed the adaptive Huber regression. The key difference between this and its classical counterpart, Huber regression, is that the robustification parameter needs to adapt to the sample size, dimensionality and unknown moments of the data, so that an optimal tradeoff between the effect of heavy-tailedness and statistical bias can be achieved.
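As a rough illustration of the idea in this summary, the sketch below fits a Huber regression by iteratively reweighted least squares and picks the robustification parameter from the sample size and dimension. The function names and the specific rule `tau = c * sigma * sqrt(n / (d + log n))` are assumptions made for this example only; the paper's own tuning procedure may differ.

```python
import numpy as np

def huber_regression(X, y, tau, n_iter=100, tol=1e-8):
    """Minimise the Huber loss with threshold tau by iteratively
    reweighted least squares (illustrative, not the authors' code)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from ordinary least squares
    for _ in range(n_iter):
        r = y - X @ beta
        # Huber weights: 1 when |r| <= tau, tau/|r| otherwise
        w = np.minimum(1.0, tau / np.maximum(np.abs(r), 1e-12))
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

def adaptive_tau(residuals, n, d, c=1.0):
    """Hypothetical rule: the threshold grows with n and shrinks with d.
    The constant c and the log(n) term are assumptions of this sketch."""
    sigma = np.sqrt(np.mean(np.asarray(residuals) ** 2))
    return c * sigma * np.sqrt(n / (d + np.log(n)))

def adaptive_huber(X, y, c=1.0):
    """Two-step sketch: pilot least-squares fit, then Huber fit with adapted tau."""
    n, d = X.shape
    pilot = np.linalg.lstsq(X, y, rcond=None)[0]
    tau = adaptive_tau(y - X @ pilot, n, d, c)
    return huber_regression(X, y, tau), tau
```

Because the threshold grows with the sample size, fewer observations are down-weighted as n increases, which is one way to trade robustness against the bias that truncation introduces.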


Data-dependent PAC-Bayes priors via differential privacy

by G. K. Dziugaite, D. M. Roy

Advances in Neural Information Processing Systems | 2018 (to appear)

Short Summary: The Probably Approximately Correct (PAC) Bayes framework (McAllester, 1999) can incorporate knowledge about the learning algorithm and data distribution through the use of distribution-dependent priors, yielding tighter generalization bounds on data-dependent posteriors. Using this flexibility, however, is difficult, especially when the data distribution is presumed to be unknown. We show how a differentially private prior yields a valid PAC-Bayes bound, and then show how non-private mechanisms for choosing priors obtain the same generalization bound provided they converge weakly to the private mechanism.


Distributed inference for quantile regression processes

by Stanislav Volgushev, Shih-Kang Chao and Guang Cheng

Annals of Statistics | 2018 (to appear)

Short Summary: We provide novel approaches to quantile regression for big (massive) data and show one of the first examples where the failure of a popular computational approach, divide and conquer, can be characterized explicitly. The paper also provides new approaches to inference that explicitly use the divide and conquer framework for fast and simple inference.
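The divide and conquer idea can be sketched in its simplest form, the intercept-only case, where quantile regression reduces to estimating a single quantile: split the data into blocks, estimate the quantile on each block, and average the block estimates. The function name `dc_quantile` and the equal-size splitting are assumptions of this example, not the paper's procedure.

```python
import numpy as np

def dc_quantile(y, tau, n_blocks):
    """Divide-and-conquer estimate of the tau-th quantile of y:
    average of the per-block sample quantiles (illustrative sketch)."""
    blocks = np.array_split(np.asarray(y, dtype=float), n_blocks)
    block_estimates = [np.quantile(b, tau) for b in blocks]
    return float(np.mean(block_estimates))
```

With a few large blocks the averaged estimate stays close to the full-sample quantile. With very many small blocks each block-level estimate carries its own bias, and averaging does not remove it, which is the kind of failure the summary refers to.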


I-LAMM for sparse learning: Simultaneous control of algorithmic complexity and statistical error

by Jianqing Fan, Han Liu, Qiang Sun, and Tong Zhang

The Annals of Statistics | 2018 | 46(2), 814-841

Short Summary: Nonconvex optimization has attracted much interest recently in both statistics and machine learning. This is possibly due to the popularity of big data, which enables the use of complex and nonconvex learning tools in practice. This paper shows that, by taking model structures and randomness into account, finding the global optima with a polynomial-time algorithm in nonconvex problems becomes possible, at least in the problem of nonconvex sparse regression. We propose such an algorithm and characterize its statistical and computational tradeoffs.


Quantile spectral analysis for locally stationary time series

by Stefan Birr, Stanislav Volgushev, Tobias Kley, Holger Dette and Marc Hallin

Journal of the Royal Statistical Society: Series B | 2017 | 79, 1619-1643

Short Summary: We develop new methods for time series analysis that make it possible to describe the non-linear dynamics of non-stationary processes, and we show that many models routinely applied to time series are unable to capture the true dynamics of observed data.


Sampling and Estimation for (Sparse) Exchangeable Graphs

by V. Veitch, D. M. Roy

Annals of Statistics | 2016 (to appear)

Short Summary: We develop the graphex framework (Veitch and Roy, 2015) as a tool for statistical network analysis by identifying the sampling scheme that is naturally associated with the models of the framework, and by introducing a general consistent estimator for the parameter (the graphex) underlying these models. Our results may be viewed as a generalization of consistent estimation via the empirical graphon from the dense graph regime to also include sparse graphs.


Vine Copulas for Imputation of Monotone Non-Response

by Caren Hasler, Radu V. Craiu and Louis-Paul Rivest

International Statistical Review | 2018 | 86: 488-511

Short Summary: Multiple imputations for sample surveys from copula models.


When should modes of inference disagree? Some simple but challenging examples.

by Fraser, D.A.S., Reid, N. and Lin, W.

Annals of Applied Statistics | 2018 | 12, 750-770

Short Summary: This paper addresses eight illustrative problems that David Cox outlined for a recent conference. Each illustration raises difficulties for different theoretical approaches to inference. We discuss these from the viewpoint of our work on higher-order asymptotics.
