Statistical Machine Learning, Statistical Modeling, Big Data Analytics, Data Visualization, and Bioinformatics.
Time series, stochastic processes, risk analysis, artificial intelligence, econometrics
Factor models, Gaussian process, high-dimensional data, large contingency tables
My methodological work currently centers on developing statistical methods within the frameworks of semiparametric and functional regression, on understanding the structure of variability, and on latent variables, especially when important variables are measured with error and subject to excess zeros. My main application interest lies in problems of nutrition and physical activity, at both the molecular and the individual level. This has recently led me to consider problems of gene-environment interactions (where nutrition is the environment) and their effect on cancer, the nature of dietary intake patterns in humans and the effect of those patterns on a host of diseases, and, most recently, whether increasing physical activity increases the mean and decreases the variance of sleep efficiency.
My research interests broadly lie at the interface of semi-parametric inference, high dimensional statistics and statistical learning in semi-supervised settings and weakly supervised settings, with applications in the statistical analysis of large and complex observational datasets arising in modern studies, especially biomedical studies such as electronic medical records (EMR). Some of my specific research interests are listed below.
Methodology: Semi-supervised inference; Semi-parametric inference with high dimensional data; Missing data and causal inference; High dimensional inference; Regularized estimation; Non-asymptotic performance guarantees.
Applications: Discovery research using EMR data; Automated phenotyping; Personalized medicine (treatment selection, treatment effects estimation, risk prediction, comparative effectiveness etc.)
Others: Concentration inequalities and tail bounds; Empirical processes; Debiasing and sample-splitting; Model misspecification; Non-parametric regression; Sufficient dimension reduction.
Long memory time series, econometrics
My research interests fall under the general heading “applied probability and stochastic processes”. Put simply, I like to study the probabilistic structure and properties of processes that statisticians and others want to observe and analyze. For example, nonlinear time series models are being applied to many observed time series, using statistical methods that assume some type of stability; my work looked at how one can verify such stability for specific models. More recently, colleagues from computer science and I have been investigating properties of dynamic random networks, with an interest in optimizing some of those properties. I am also interested in questions of real analysis that are related to or arise from statistical problems. This has included looking at the distributions of heavy-tailed random variables, which is relevant for risk theory and extreme values of data, and at the relationship between discontinuous functions and their Fourier transforms, which was of interest for function estimation.
Statistics and online education, consulting, regression, and intro theory
Statistics education and applications to epidemiology/health care data
Microarrays, bioinformatics, classification methods, statistical education
High-dimensional data analysis, machine learning, multivariate analysis, computational statistics, statistical methods for analyzing biological data
My interests include nonparametric function estimation, hypothesis testing in complex settings, time series analysis and Bayesian methods. Recently I have focused attention on inference problems involving a large number of small data sets. Suppose, for example, that the distribution of data within different small data sets is the same up to location and scale, with location and scale differing randomly from one data set to the next. I am interested in methods for estimating the distribution common to all data sets, and also the distribution of location and scale across data sets. I have considered both frequentist (minimum distance) and Bayesian methods for tackling this problem.
Another problem involving a large number of small data sets is testing the equality of distributions across all data sets. This is like the classical k-sample problem, but with the key difference that instead of fixing k and allowing sample sizes to increase without bound, I let k tend to infinity with sample sizes fixed. Doing so leads to different and interesting asymptotics in this and other inference problems.
I am also interested in simulation methods that involve generating many different models randomly rather than generating large numbers of data sets from just a few models. I call this approach BayesSim since the distribution from which models are selected is analogous to the prior distribution in Bayesian methodology.
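The idea of drawing many models at random, rather than many data sets from a few models, can be sketched roughly as follows. This is only an illustrative sketch, not the author's BayesSim procedure: the function name `random_model_simulation`, the normal "prior" on the location, the uniform "prior" on the scale, and the mean-versus-median comparison are all assumptions made for the example.

```python
import random
import statistics


def random_model_simulation(n_models=4000, n_obs=5, seed=0):
    """Compare two location estimators over many randomly drawn models.

    Each iteration draws a new model (a normal distribution whose mean and
    standard deviation come from an assumed prior-like distribution) and
    generates a single small data set from it.
    """
    rng = random.Random(seed)
    se_mean = 0.0
    se_median = 0.0
    for _ in range(n_models):
        # Draw one model at random (hypothetical priors, chosen for illustration).
        mu = rng.gauss(0.0, 3.0)
        sigma = rng.uniform(0.5, 2.0)
        # Generate one small data set from that model.
        data = [rng.gauss(mu, sigma) for _ in range(n_obs)]
        # Accumulate squared errors of both estimators of the location.
        se_mean += (statistics.mean(data) - mu) ** 2
        se_median += (statistics.median(data) - mu) ** 2
    return {"mse_mean": se_mean / n_models, "mse_median": se_median / n_models}


if __name__ == "__main__":
    print(random_model_simulation())
```

The averages returned here summarize performance over the whole assumed distribution of models, not at a few hand-picked parameter settings.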
Statistics education, consulting
Statistical Learning, Machine Learning, and Statistical Education
Nonparametric and semiparametric methods, statistical function estimation using polynomial splines, statistical methods for longitudinal data/panel data, multivariate/functional data analysis, survival analysis, duration data, event history analysis, statistics application in business
My current methodological research interests focus on problems related to Bayesian hypothesis testing, Bayesian variable selection in ultra-high dimensional spaces, and latent variable models for ordinal and rank data analyses. In the areas of hypothesis testing and variable selection, I am particularly interested in exploring efficiencies that can be gained through the use of non-local prior densities to specify either alternative hypotheses in hypothesis testing problems or the non-null distributions of regression coefficients in variable selection problems. My research on ordinal and rank data modeling finds application in evaluating the intelligence of non-human primate species and in educational assessment.
Applied problems that currently interest me include statistical studies of the impacts of college admission policies on diversity of college campuses and graduation rates and developing more meaningful instruments to evaluate undergraduate and graduate teaching. More generally, I am interested in studying the impact that assessment procedures have on educational processes.
Finally, I am intrigued by the problem of performing inference in agent models used in sociology and psychology.
My current research interests include statistical analysis of high-throughput genomic datasets, variable selection in large p, small n problems, multi-regional clinical trials, and functional data analysis. I am also interested in Bayesian subspace estimation and sufficient dimension reduction.
My research interests are mainly in Bayesian spatial statistics, with applications in the environmental sciences. As remote-sensing instruments mounted on satellites have made it possible to collect massive amounts of data on a global scale, much of my research focuses on the development of complex, flexible spatial methods that can be applied to big global datasets in a computationally feasible way. For example, I work with collaborators at NASA and NCAR on combining measurements from several satellites measuring CO2 on a global scale, on how to run related algorithms in parallel in modern distributed-computing environments, and on the real-time analysis of massive, streaming spatio-temporal datasets that are important for forecasting severe rainfall
Programming efficiency, Data Quality
Spatial statistics, statistics education, consulting
Hwa Chi Liang
Linear Models, Statistical Education, Undergraduate Research
Collaborative research to provide in-depth statistical analysis across various fields and work with multidisciplinary scientific staff
My research is collaborative, conducted with faculty members in entomology, animal science, and numerous other departments throughout the Texas A&M campus. My contributions consist of helping researchers design experiments, determine sample sizes, decide on appropriate models, select statistical methodology to analyze their data, and produce insightful graphs. The aim is to achieve efficient designs, more powerful tests, and improved explanations of the results.
Bayesian hierarchical modeling, nonparametric regression and classification, bioinformatics, spatio-temporal modeling, machine learning, functional data analysis, Bayesian nonparametrics, petroleum reservoir characterization, uncertainty analysis of computer model outputs
H. Joseph Newton
Time series analysis, computational statistics
Time Series; Econometrics and Finance; Nonparametric and Semiparametric Models; Multivariate, Functional and High-Dimensional Data; Biostatistics and Bioinformatics
Methodology: Graphical models, Bayesian nonparametrics, big data computation, machine learning, random networks, variable selection, clustering and feature allocation, classification.
Science: Gene/protein networks reconstruction, integrative genomics, brain connectomics, clinical trial design, tumor heterogeneity, precision medicine, biomarker detection, genetics, neuroscience, electronic health records
My research involves developing Bayesian methods for complex objects including high-dimensional sparse vectors, matrices, shapes of non-Euclidean objects and large graphs. I am also interested in studying Bayesian model selection consistency under complex settings. Modeling the distributions of objects contained within images motivated some of my collaborative work, e.g., in applications of tumor tracking in targeted radiation therapy. More recently, I have become interested in building models for discovering patterns in large networks and for predicting cognition from connectomics data.
My research is focused on modeling dependence (covariances) in multivariate and time series data using covariates. The goal is to develop machinery for covariance matrices just like the powerful generalized linear models (GLM) for the mean vector developed over two centuries. The key ideas and tools I rely on are from prediction theory, time series analysis and theory of stochastic processes. My interest in applications includes financial data analysis, analysis of longitudinal and panel data, data mining, classification and clustering, fMRI and high-dimensional data
Bayesian statistics with a focus on spatial and spatio-temporal statistics
Methodological research: missing data techniques, measurement error, splines. Bayesian methods: parametric and nonparametric methods. Applications: epidemiology, genetic epidemiology
Suhasini Subba Rao
Time series, nonstationary processes, nonlinear processes, recursive online algorithms, spatio-temporal models
General Bayesian Statistical Methodology; Statistical Modeling and Simulation of High Frequency Biomedical and Environmental Data; Wavelets, Multiscale Statistical Modeling, Denoising, Assessing of Scaling in Signals and Images; Biostatistics, Clinical Trials, Bioinformatics; Statistics of Turbulent Flows, Environmental Statistics
Biostatistical inferences, missing and mis-measured data modeling and analysis, non- and semi-parametric methodology, resampling methods, small sample asymptotics, survey sampling
Degree Growth and Index Estimation for Network Models. Explore the limit distribution for the degree growth of a fixed node in preferential attachment network models. Provide theoretical justifications for a widely used tail index estimation method for network models
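As a rough illustration of tail index estimation, the sketch below implements the Hill estimator. The entry above does not name the estimation method it refers to, so the choice of the Hill estimator is an assumption, as are the function name `hill_tail_index` and the cutoff `k`. The example applies it to an i.i.d. Pareto sample, not to node degrees from a preferential attachment network, where the data are dependent.

```python
import math
import random


def hill_tail_index(sample, k):
    """Hill estimate of the tail index alpha from the k largest observations.

    With order statistics X_(1) >= ... >= X_(n), the Hill statistic is
    H = (1/k) * sum_{i=1..k} log(X_(i) / X_(k+1)), and alpha is estimated by 1/H.
    """
    xs = sorted(sample, reverse=True)
    if not 0 < k < len(xs):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    threshold = xs[k]  # the (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the threshold observation must be positive")
    h = sum(math.log(x / threshold) for x in xs[:k]) / k
    return 1.0 / h


if __name__ == "__main__":
    rng = random.Random(0)
    # Pareto sample with true tail index 2.0 (illustrative choice).
    sample = [rng.paretovariate(2.0) for _ in range(5000)]
    print(hill_tail_index(sample, k=500))
```

The estimate depends on the choice of `k`; in practice it is common to inspect the estimate over a range of cutoffs instead of relying on a single value.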
Stochastic models, directional data, mathematical statistics, nonparametric function estimation
Raymond Ka Wai Wong
Nonparametric and semi-parametric modeling; Regularization methods; Statistical applications to astronomy, brain imaging, computer experiments and recommender systems; Statistical learning
High dimensional statistics, Econometrics, Functional data analysis, and Time series analysis
Statistical methodology and application in bioinformatics, nutrition and epidemiology, functional/longitudinal data analysis
Theory: mathematical statistics, Bayesian methodology, variable selection, stochastic optimization, nonlinear expectation, iterative algorithms. Applications: statistical genetics, clinical trials, genomics, mathematical finance