TrungTin Nguyen is an applied statistician and mathematician whose research focuses on elucidating the mechanisms and interactions between natural and artificial intelligence, with applications including the construction of virtual biological cells and the development of trustworthy and robust foundational artificial intelligence systems. His current work involves developing genotype-phenotype mapping methods for high-dimensional, heterogeneous biological data, leveraging artificial intelligence techniques (such as modular deep learning, including mixture of experts, large language models, and physics-/biologically informed neural networks) alongside statistical learning approaches (including Bayesian inference, model selection, handling of missing data, and probabilistic graphical models). His research also focuses on developing mathematical and statistical frameworks for calibrating large-scale scientific machine learning models using experimental data, as well as advancing methods for the integration, interpretation, and application of multi-omics data.
Hello and welcome! My Vietnamese name is Nguyễn Trung Tín, so I use “TrungTin Nguyen” or “Trung Tin Nguyen” in my English publications. My first name can be shortened to “Tín” or “Tin”. This is my personal website. For my academic profile, please refer to Dr TrungTin Nguyen's Institutional Website.
I am currently a MACSYS Postdoctoral Research Fellow (Applied Statistics) at the Queensland University of Technology, in the School of Mathematical Sciences and the ARC Centre of Excellence for the Mathematical Analysis of Cellular Systems (MACSYS). I have held this position since February 2025 and am very fortunate to be mentored by Christopher Drovandi.
I was a Postdoctoral Research Fellow at The University of Queensland in the School of Mathematics and Physics from December 2023 to December 2024, where I was very fortunate to be mentored by Hien Duy Nguyen and Xin Guo. Before moving to Queensland, I was a Postdoctoral Research Fellow at the Inria centre at the University Grenoble Alpes in the Statify team, where I was very fortunate to be mentored by Florence Forbes and Julyan Arbel, and to collaborate with Hien Duy Nguyen as part of the international project team WOMBAT. I completed my Ph.D. degree in Statistics and Data Science at Normandie Univ in December 2021, where I was very fortunate to be advised by Faicel Chamroukhi. During my Ph.D. research, I was grateful to collaborate with Hien Duy Nguyen and Geoff McLachlan. I also received a four-month Visiting PhD Fellowship at the Inria centre at the University Grenoble Alpes in the Statify team, as part of the LANDER project.
A central theme of my research is data science, at the intersection of:
Statistical learning: Model selection (minimal penalties and slope heuristics, non-asymptotic oracle inequalities), simulation-based inference (approximate Bayesian computation, Bayesian synthetic likelihood, method of moments), Bayesian nonparametrics (Gibbs-type priors, Dirichlet process mixture), high-dimensional statistics (variable selection via Lasso and penalization, graphical models), uncertainty estimation, missing data (imputation methods, likelihood-based approaches with missing data).
Machine learning: Supervised learning (deep hierarchical mixture of experts (DMoE), deep neural networks), unsupervised learning (clustering via mixture models, dimensionality reduction via principal component analysis, deep generative models via variational autoencoders, generative adversarial networks, and normalizing flows), reinforcement learning (partially observable Markov decision processes), structured prediction (probabilistic graphical models), scientific machine learning (physics-informed/biologically informed neural networks).
Optimization: Robust and effective optimization algorithms for mixture models (MM algorithm, expectation–maximization, variational Bayesian inference, Markov chain Monte Carlo methods), difference-of-convex algorithm, optimal transport (Wasserstein distance, Voronoi loss function).
Applications: Natural language processing (large language models), remote sensing (planetary science, e.g., retrieval of Mars surface physical properties from hyperspectral images), signal processing (sound source localization), biological data (genomics, transcriptomics, proteomics, cellular systems), computer vision (image segmentation), quantum chemistry, drug discovery, and materials science (supervised and unsupervised learning on molecular modeling).