TrungTin Nguyen


Hi there and welcome! From January 2022 I will be a Postdoctoral Research Fellow at the Inria Grenoble-Rhône-Alpes Research Centre, where I am very fortunate to be mentored by Senior Researcher Florence Forbes, ARC DECRA Research Fellow and Senior Lecturer Hien Duy Nguyen, and Associate Researcher Julyan Arbel. Since 2018, I have been a Ph.D. student in Statistics and Data Science at Normandie Univ, UNICAEN, CNRS, LMNO, Caen, France, where I am very fortunate to be advised by Professor Faicel Chamroukhi. During my Ph.D. research, I have also been very fortunate to collaborate with Professor Geoff McLachlan, focusing on mixture models. I received a Visiting Ph.D. Fellowship at the Inria Grenoble-Rhône-Alpes Research Centre, where I worked with Senior Researcher Florence Forbes and Associate Researcher Julyan Arbel in the Statify team on the LANDER project (September 2020 to January 2021).

A central theme of my research focuses on Data Science, at the interface of:

  • Statistical learning: supervised and unsupervised learning, visualization of high-dimensional data, model selection in clustering and regression for functional and heterogeneous data, statistical convergence of deep hierarchical mixtures of experts (MoE), approximate Bayesian computation.
  • Machine learning: deep generative models (variational autoencoders, generative adversarial networks), reinforcement learning.
  • Optimization: robust and effective optimization algorithms for deep neural networks (stochastic gradient descent, Adam, …), for deep hierarchical MoE (generalized expectation–maximization (EM) algorithm, EM algorithm, MM algorithm, …), and the difference-of-convex (DC) algorithm.
  • Biostatistics: statistical learning and machine learning for large biological data sets (omics data), e.g., genomics, transcriptomics, and proteomics.
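As a toy illustration of the EM-type algorithms listed above, here is a single EM iteration for a two-component univariate Gaussian mixture. This is a minimal hypothetical sketch for illustration only, not code from any of my papers:

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def em_step(data, pi, mu, sigma):
    """One EM iteration for a two-component Gaussian mixture.

    pi, mu, sigma are length-2 lists of weights, means, and standard deviations.
    Returns updated (pi, mu, sigma).
    """
    # E-step: posterior responsibility of each component for each point.
    resp = []
    for x in data:
        w = [pi[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
        s = sum(w)
        resp.append([wk / s for wk in w])
    # M-step: responsibility-weighted maximum-likelihood updates.
    n = len(data)
    nk = [sum(r[k] for r in resp) for k in range(2)]
    new_pi = [nk[k] / n for k in range(2)]
    new_mu = [sum(r[k] * x for r, x in zip(resp, data)) / nk[k] for k in range(2)]
    new_sigma = [
        math.sqrt(sum(r[k] * (x - new_mu[k]) ** 2 for r, x in zip(resp, data)) / nk[k])
        for k in range(2)
    ]
    return new_pi, new_mu, new_sigma

# Toy data from two well-separated clusters.
random.seed(0)
data = [random.gauss(-3, 1) for _ in range(200)] + [random.gauss(3, 1) for _ in range(200)]
pi, mu, sigma = [0.5, 0.5], [-1.0, 1.0], [2.0, 2.0]
for _ in range(50):
    pi, mu, sigma = em_step(data, pi, mu, sigma)
```

Each iteration increases the observed-data log-likelihood, and on well-separated clusters the means converge to the cluster centers.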


  • Data Science
  • Statistics
  • Statistical Learning
  • Machine Learning
  • Optimization


  • Ph.D. Student in Statistics and Data Science, 2018-2021

    Université de Caen Normandie, France

  • M.S. in Applied Mathematics, 2017-2018

    Université d'Orléans, France

  • B.S. Honors Program in Mathematics and Computer Science, 2013-2017

    Vietnam National University-Ho Chi Minh University of Science, Vietnam


A non-asymptotic model selection in block-diagonal mixture of polynomial experts models

Model selection via penalized likelihood criteria is a standard task in many statistical inference and machine learning problems. It has led to criteria with asymptotic consistency results and an increasing emphasis on non-asymptotic criteria. We focus on the problem of modeling non-linear relationships in regression data with potential hidden graph-structured interactions between the high-dimensional predictors, within the mixture of experts modeling framework. In order to deal with such a complex situation, we investigate a block-diagonal localized mixture of polynomial experts (BLoMPE) regression model, which is constructed upon an inverse regression and block-diagonal structures of the Gaussian expert covariance matrices. We introduce a penalized maximum likelihood selection criterion to estimate the unknown conditional density of the regression model. This model selection criterion allows us to handle the challenging problem of inferring the number of mixture components, the degree of the polynomial mean functions, and the hidden block-diagonal structures of the covariance matrices, which reduces the number of parameters to be estimated and leads to a trade-off between complexity and sparsity in the model. In particular, we provide a strong theoretical guarantee: a finite-sample oracle inequality satisfied by the penalized maximum likelihood estimator with a Jensen-Kullback-Leibler type loss, to support the introduced non-asymptotic model selection criterion. The penalty shape of this criterion depends on the complexity of the considered random subcollection of BLoMPE models, including the relevant graph structures, the degree of the polynomial mean functions, and the number of mixture components.
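To illustrate the general shape of penalized-likelihood model selection, here is a schematic sketch that scores candidate models by a BIC-like criterion, minus log-likelihood plus a penalty growing with model dimension. The candidate values below are hypothetical, and this simple penalty is only a stand-in for the non-asymptotic penalty derived in the paper:

```python
import math

def select_model(log_likelihoods, dims, n, kappa=0.5):
    """Pick the model minimizing -loglik + kappa * dim * log(n),
    a BIC-like penalized criterion (a simplified stand-in for the
    paper's non-asymptotic penalty). Returns the index of the winner."""
    crit = [-ll + kappa * d * math.log(n) for ll, d in zip(log_likelihoods, dims)]
    return min(range(len(crit)), key=crit.__getitem__)

# Hypothetical fitted candidates: K = 1..4 mixture components, with
# maximized log-likelihoods that plateau after K = 2 (the "true" K).
n = 500
logliks = [-1450.0, -1210.0, -1205.0, -1202.0]
dims = [3, 7, 11, 15]  # parameter count grows with K
best = select_model(logliks, dims, n)
print("selected model index:", best)  # prints 1, i.e. K = 2
```

The penalty discourages the marginal likelihood gains of the over-parameterized candidates, so the criterion picks the parsimonious model at the elbow.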

Approximate Bayesian computation with surrogate posteriors

A key ingredient in approximate Bayesian computation (ABC) procedures is the choice of a discrepancy that describes how different the simulated and observed data are, often based on a set of summary statistics when the data cannot be compared directly. Unless discrepancies and summaries are available from experts or prior knowledge, which seldom occurs, they have to be chosen, and this choice can affect the approximations. Their choice is an active research topic, which to date has mainly considered data discrepancies requiring samples of observations or distances between summary statistics. In this work, we introduce a preliminary learning step in which surrogate posteriors are built from finite Gaussian mixtures using an inverse regression approach. These surrogate posteriors are then used in place of summary statistics and compared using metrics between distributions in place of data discrepancies. Two such metrics are investigated: a standard $L_2$ distance and an optimal transport-based distance. The whole procedure can be seen as an extension of the semi-automatic ABC framework to functional summary statistics. The resulting ABC quasi-posterior distribution is shown to converge to the true one, under standard conditions. Performance is illustrated on both synthetic and real data sets, where it is shown that our approach is particularly useful when the posterior is multimodal.
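To give a feel for the idea, here is a heavily simplified rejection-ABC sketch in which each data set is replaced by a single-Gaussian surrogate posterior (the paper uses Gaussian mixtures fitted by inverse regression) and the discrepancy is the closed-form $L_2$ distance between the two surrogates. All names and numbers are hypothetical:

```python
import math
import random

def l2_gauss(m1, s1, m2, s2):
    """Closed-form squared L2 distance between N(m1, s1^2) and N(m2, s2^2)."""
    def cross(a, sa, b, sb):
        v = sa ** 2 + sb ** 2
        return math.exp(-((a - b) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
    return cross(m1, s1, m1, s1) + cross(m2, s2, m2, s2) - 2 * cross(m1, s1, m2, s2)

def surrogate(data):
    """Crude Gaussian surrogate posterior for the mean of N(mu, 1): N(xbar, 1/n)."""
    n = len(data)
    return sum(data) / n, 1 / math.sqrt(n)

random.seed(1)
true_mu, n_obs = 2.0, 50
observed = [random.gauss(true_mu, 1) for _ in range(n_obs)]
m_obs, s_obs = surrogate(observed)

accepted = []
for _ in range(2000):
    theta = random.uniform(-5, 5)                      # draw from the prior
    sim = [random.gauss(theta, 1) for _ in range(n_obs)]
    m_sim, s_sim = surrogate(sim)
    if l2_gauss(m_obs, s_obs, m_sim, s_sim) < 0.5:     # accept if surrogates are close
        accepted.append(theta)
```

The accepted parameters form a quasi-posterior sample concentrated near the true mean; the point of the paper's construction is that such distribution-level comparisons replace hand-crafted summary statistics.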

An $l_1$-oracle inequality for the Lasso in mixture-of-experts regression models

Mixture-of-experts (MoE) models are a popular framework for modeling heterogeneity in data, for both regression and classification problems in statistics and machine learning, due to their flexibility and the abundance of statistical estimation and model choice tools. Such flexibility comes from allowing the mixture weights (or gating functions) in the MoE model to depend on the explanatory variables, along with the experts (or component densities). This permits the modeling of data arising from more complex data generating processes, compared to classical finite mixtures and finite mixtures of regression models, whose mixing parameters are independent of the covariates. The use of MoE models in a high-dimensional setting, when the number of explanatory variables can be much larger than the sample size (i.e., $p \gg n$), is challenging from a computational point of view, and in particular from a theoretical point of view, where the literature still lacks results on the curse of dimensionality, in both statistical estimation and feature selection. We consider the finite mixture-of-experts model with soft-max gating functions and Gaussian experts for high-dimensional regression on heterogeneous data, and its $l_1$-regularized estimation via the Lasso. We focus on the Lasso estimation properties rather than its feature selection properties. We provide a lower bound on the regularization parameter of the Lasso function that ensures an $l_1$-oracle inequality satisfied by the Lasso estimator according to the Kullback-Leibler loss.
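The objective being penalized can be written down directly. The sketch below evaluates (but does not minimize) the Lasso-penalized negative log-likelihood of a MoE model with soft-max gating and univariate Gaussian experts; the tiny data set and coefficient values are hypothetical:

```python
import math

def moe_penalized_nll(X, y, gating_w, expert_w, sigmas, lam):
    """Lasso-penalized negative log-likelihood of a K-expert MoE
    with soft-max gating and univariate Gaussian experts.

    X: list of feature vectors; y: responses.
    gating_w[k], expert_w[k]: coefficient lists (intercept last).
    """
    def affine(w, x):
        return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]

    K = len(sigmas)
    nll = 0.0
    for x, yi in zip(X, y):
        logits = [affine(gating_w[k], x) for k in range(K)]
        z = max(logits)                                   # stabilized soft-max
        gates = [math.exp(l - z) for l in logits]
        gsum = sum(gates)
        lik = 0.0
        for k in range(K):
            mu = affine(expert_w[k], x)
            dens = math.exp(-0.5 * ((yi - mu) / sigmas[k]) ** 2) / (
                sigmas[k] * math.sqrt(2 * math.pi))
            lik += (gates[k] / gsum) * dens
        nll -= math.log(lik)
    # l1 penalty on all gating and expert coefficients.
    penalty = lam * sum(abs(w) for ws in (gating_w + expert_w) for w in ws)
    return nll + penalty

# Tiny hypothetical instance: 2 experts, 1 covariate (+ intercept).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.1, 1.1, 3.9, 6.2]
gw = [[0.5, 0.0], [-0.5, 0.0]]
ew = [[1.0, 0.0], [2.0, 0.0]]
obj0 = moe_penalized_nll(X, y, gw, ew, [1.0, 1.0], lam=0.0)
obj1 = moe_penalized_nll(X, y, gw, ew, [1.0, 1.0], lam=0.1)
```

With $\lambda > 0$ the objective grows by exactly $\lambda$ times the $l_1$ norm of the coefficients; the paper's lower bound on $\lambda$ is what guarantees the oracle inequality for the resulting estimator.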


Recent & Upcoming Talks


3rd International Summer School on Deep Learning

Featured courses: Deep Generative Models by Aaron Courville, University of Montréal, Canada; Dive into Deep Learning by Alex Smola, Amazon, USA; Mathematics of Deep Learning by René Vidal, Johns Hopkins University, USA.
See certificate

Deep Learning Specialization

This specialization comprises five courses: Neural Networks and Deep Learning; Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization; Structuring Machine Learning Projects; Convolutional Neural Networks; and Sequence Models.
See certificate

Machine Learning