Title: Optimal Penalized Function-on-Function Regression under a Reproducing Kernel Hilbert Space Framework

Abstract: Many scientific studies collect data in which both the response and the predictor variables are functions of time, location, or some other covariate. Understanding the relationship between these functional variables is a common goal in such studies. Motivated by two real-life examples, we propose a new function-on-function regression model for analyzing this kind of functional data. Our estimator of the 2D coefficient function is the optimizer of a penalized least squares criterion in which the penalty enforces a certain level of smoothness on the estimator. Our first result is a Representer theorem, which states that the exact optimizer of the penalized least squares resides in a data-adaptive finite-dimensional subspace, even though the optimization problem is defined on an infinite-dimensional function space. This theorem allows Gaussian quadrature to be incorporated easily into the optimization of the penalized least squares, which can then be carried out through standard numerical procedures. We also show that our estimator achieves the minimax convergence rate in mean prediction under the framework of function-on-function regression. Extensive simulation studies demonstrate the numerical advantages of our method over existing ones. The proposed method is then applied to our motivating examples: the benchmark Canadian weather data and a histone regulation study.
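
To make the role of Gaussian quadrature concrete, the sketch below approximates an integral of the form ∫ β(s, t) x(s) ds with a Gauss-Legendre rule. The linear integral form of the model, the interval [0, 1], and the names `gauss_legendre`, `integral_term`, `beta`, and `x` are illustrative assumptions; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np

def gauss_legendre(a, b, n):
    """Return n Gauss-Legendre nodes and weights rescaled from [-1, 1] to [a, b]."""
    nodes, weights = np.polynomial.legendre.leggauss(n)
    s = 0.5 * (b - a) * nodes + 0.5 * (b + a)
    w = 0.5 * (b - a) * weights
    return s, w

def integral_term(beta, x, t, a=0.0, b=1.0, n=20):
    """Approximate the integral of beta(s, t) * x(s) over s in [a, b].

    beta and x are hypothetical callables standing in for the 2D
    coefficient function and a predictor curve.
    """
    s, w = gauss_legendre(a, b, n)
    return np.sum(w * beta(s, t) * x(s))

# Toy check with a known answer: beta(s, t) = s * t and x(s) = s**2,
# so the integral over [0, 1] at t = 2 equals 2 * (1/4) = 0.5.
beta = lambda s, t: s * t
x = lambda s: s ** 2
approx = integral_term(beta, x, t=2.0)
```

Because an n-point Gauss-Legendre rule integrates polynomials of degree up to 2n - 1 exactly, the toy integrand above is recovered to machine precision; smooth non-polynomial integrands are only approximated.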


Title: Weighted Leverage Score for Model-free Statistical Learning

Abstract: In the past few decades, high-dimensional data have arisen in areas such as genomics, tumor classification, image processing, and Internet search. How to extract useful information from such data has become a key issue. In high-dimensional data, the "large p, small n" problem poses many challenges to statistical analysis. Despite the urgent need for statistical tools to deal with such data, few methods fully address the high-dimensional problem. Motivated by sliced inverse regression and the leverage score, we propose a novel feature screening method named the weighted leverage score (WLS) under the framework of sufficient dimension reduction. Unlike linear stepwise regression, the WLS screening procedure does not impose a specific form on the relationship between the response variable and the predictors, and it can identify all relevant predictors consistently, as demonstrated in our theoretical analysis. The WLS not only possesses selection consistency but also performs competitively in empirical studies. We also apply the proposed method to a breast cancer data set generated by spatial transcriptomics and identify a group of marker genes for cancer stages.
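
For readers unfamiliar with leverage scores, the sketch below computes plain, unweighted column leverage scores of a centered design matrix from its thin SVD. The weighting scheme that defines WLS is not given in the abstract, so this shows only the basic ingredient; the function name `column_leverage_scores`, the tolerance, and the simulated data are assumptions made for illustration.

```python
import numpy as np

def column_leverage_scores(X, tol=1e-10):
    """Unweighted leverage score of each column (predictor) of X.

    With the thin SVD Xc = U D V^T of the centered matrix, the score of
    predictor j is the squared norm of the j-th row of V, restricted to
    singular vectors whose singular values are numerically nonzero.
    """
    Xc = X - X.mean(axis=0)                      # center each predictor
    _, d, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = d > tol * d.max()                     # drop numerically zero directions
    V = Vt[keep].T                               # shape (p, r)
    return np.sum(V ** 2, axis=1)

# Simulated "large p, small n" design: n = 20 samples, p = 50 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
scores = column_leverage_scores(X)
top = np.argsort(scores)[::-1][:5]               # indices of the 5 largest scores
```

Each score lies between 0 and 1, and the scores sum to the numerical rank of the centered matrix (here n - 1 = 19), which gives a quick sanity check on the computation.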


Title: Integrating Model Uncertainty in Statistical Inference: A Bayesian Approach

Abstract: In practice, when faced with multiple candidate models, the popular approach is to perform model selection using criteria such as AIC, BIC, or MSE, and then to carry out statistical inference based solely on the selected model as if it were the true model. The uncertainty in the model selection step is therefore not reflected, and false discoveries may result from the "over-confident" statistical inference. In this project, I take a Bayesian point of view and propose a mixture prior that accounts for model selection uncertainty, so that it is reflected in the final statistical inference. A noninformative prior and an adapted Gibbs sampler for the proposed model will be developed. I will illustrate the proposed model by combining cubic splines and thin-plate splines when applying smoothing spline methods to two-dimensional data.


Title: Modeling High-dimensional Realized Covariance Matrix via Combination of VAR Model and Regularized Variable Selection Method

Abstract: Large covariance matrix estimation has become a fundamental problem in multivariate analysis, with applications in financial econometrics. We consider regularized estimators to reduce the dimensionality, and then investigate the sources of the driving dynamics as well as the performance of portfolios constructed from the realized covariance matrices forecast by the proposed model.


