# Biography

Hi there and welcome! I am currently a Postdoctoral Fellow at Inria Grenoble-Rhône-Alpes, where I am very fortunate to be mentored by Senior Researcher Florence Forbes, Senior Lecturer Hien Duy Nguyen, and Associate Researcher Julyan Arbel. I completed my Ph.D. degree in Statistics and Data Science at Normandie Univ, UNICAEN, CNRS, LMNO, Caen, France in December 2021, where I was very fortunate to be advised by Professor Faicel Chamroukhi and Senior Lecturer Hien Duy Nguyen. During my Ph.D. research, I was also very fortunate to collaborate with Professor Geoff McLachlan, focusing on mixture models. I also received a Visiting Ph.D. Fellowship at the Inria Grenoble-Rhône-Alpes Research Centre, working with Senior Researcher Florence Forbes and Associate Researcher Julyan Arbel in the Statify team under the LANDER project (from September 2020 to January 2021).

My research centers on Data Science, at the interface of:

• Statistical learning: supervised and unsupervised learning, visualization of high-dimensional data, model selection in clustering and regression for functional and heterogeneous data, statistical convergence for deep hierarchical mixtures of experts (MoE), approximate Bayesian computation, Bayesian nonparametrics.
• Machine learning: deep generative models (variational autoencoders, generative adversarial networks), reinforcement learning, optimal transport (Wasserstein distance).
• Optimization: robust and effective optimization algorithms for deep neural networks (stochastic gradient descent, Adam, …), deep hierarchical MoE (expectation–maximization (EM) algorithm, generalized EM algorithm, variational Bayesian EM algorithm, majorization-minimization (MM) algorithm), DC algorithm.
• Biostatistics: statistical learning and machine learning for large biological data sets (omics data), e.g., genomics, transcriptomics and proteomics.

### Interests

• Data Science
• Statistics
• Statistical Learning
• Machine Learning
• Optimization

### Education

• Ph.D. in Statistics and Data Science, 2018-2021

Université de Caen Normandie, France

• M.S. in Applied Mathematics, 2017-2018

Université d'Orléans, France

• B.S. Honors Program in Mathematics and Computer Science, 2013-2017

Vietnam National University-Ho Chi Minh University of Science, Vietnam

# Publications

(2022). Summary statistics and discrepancy measures for approximate Bayesian computation via surrogate posteriors. Statistics and Computing.

(2022). A non-asymptotic approach for model selection via penalization in high-dimensional mixture of experts. Electronic Journal of Statistics.

(2022). Mixture of expert posterior surrogates for approximate Bayesian computation. 53èmes Journées de Statistique de la Société Française de Statistique (SFdS).

(2022). Model selection by penalization in mixture of experts models with a non-asymptotic approach. 53èmes Journées de Statistique de la Société Française de Statistique (SFdS).

(2022). Approximation of probability density functions via location-scale finite mixtures in Lebesgue spaces. Communications in Statistics - Theory and Methods.

(2021). Approximations of conditional probability density functions in Lebesgue spaces via mixture of experts models. Journal of Statistical Distributions and Applications.

(2021). A non-asymptotic model selection in block-diagonal mixture of polynomial experts models. arXiv preprint arXiv:2104.08959.

(2020). An l1-oracle inequality for the Lasso in mixture-of-experts regression models. arXiv preprint arXiv:2009.10622. Under revision, ESAIM: Probability and Statistics.

(2020). Approximation by finite mixtures of continuous density functions that vanish at infinity. Cogent Mathematics & Statistics.

# Recent & Upcoming Talks

### Model selection by penalization in mixture of experts models with a non-asymptotic approach

This study is devoted to the problem of model selection among a collection of Gaussian-gated localized mixtures of experts models …

### A non-asymptotic approach for model selection via penalization in high-dimensional mixture of experts models

Mixture of experts (MoE) models are a popular class of statistical and machine learning models that have gained attention over the years due …

### A non-asymptotic approach for model selection via penalization in mixture of experts models

Mixture of experts (MoE), originally introduced as a neural network, is a popular class of statistical and machine learning models that …

### A non-asymptotic model selection in mixture of experts models

Mixture of experts (MoE), originally introduced as a neural network, is a popular class of statistical and machine learning models that …

### Model Selection and Approximation in High-dimensional Mixtures of Experts Models: From Theory to Practice

Mixtures of experts (MoE) models are a ubiquitous tool for the analysis of heterogeneous data across many fields including statistics, …

### Model Selection and Approximation in High-dimensional Mixtures of Experts Models: From Theory to Practice

Mixtures of experts (MoE) models are a ubiquitous tool for the analysis of heterogeneous data across many fields including statistics, …

### Approximation and non-asymptotic model selection in mixture of experts models

Mixtures of experts (MoE) models are a ubiquitous tool for the analysis of heterogeneous data across many fields including statistics, …

### Approximate Bayesian computation with surrogate posteriors

A key ingredient in approximate Bayesian computation (ABC) procedures is the choice of a discrepancy that describes how different the …

### A non-asymptotic model selection in mixture of experts models

Mixture of experts (MoE) is a popular class of models in statistics and machine learning that has sustained attention over the years, …

### Approximate Bayesian computation with surrogate posteriors

A key ingredient in approximate Bayesian computation (ABC) procedures is the choice of a discrepancy that describes how different the …

### Distance-based ABC procedures

Approximate Bayesian computation (ABC) has become an essential part of the Bayesian toolbox for addressing problems in which the …

### Non-asymptotic penalization criteria for model selection in mixture of experts models

Mixture of experts (MoE) is a popular class of models in statistics and machine learning that has sustained attention over the years, …

### Approximate Bayesian computation with surrogate posteriors

A key ingredient in approximate Bayesian computation (ABC) procedures is the choice of a discrepancy that describes how different the …

# Achievements

#### 3rd International Summer School on Deep Learning

Hours of lectures: 39.

Featured courses:

• Deep Generative Models by Aaron Courville, University of Montréal, Canada.

• Dive into Deep Learning by Alex Smola, Amazon, USA.

• Mathematics of Deep Learning by Rene Vidal, Johns Hopkins University, USA.

See certificate

#### Learn French online

Accomplishment: 19 French courses, from the Newcomer course to the Advanced course, which is based on levels B2-C1 of the Common European Framework of Reference for Languages (CEFR).

Description: Learn French online helps you to strengthen your French skills, whatever your motivation to learn. Select a guided learning path based on your skill level, or choose a work-, travel- or culture-focused course.

See certificate

#### Conference Reviewing - Program Committee

Description: The Research School on Statistics and Data Science 2019 (RSSDS2019) provides a series of lectures and posters delivered by experienced academic and industry members on cutting-edge statistics and data science applications and theory. Aside from being considered among the most desirable jobs of the 21st century, the real-world value of data science and skilled data scientists cannot be overstated. The most successful data science professionals are those who can harness the power of statistics, programming, domain knowledge and data visualisation. The workshops at RSSDS2019 will empower current and upcoming data scientists (and enthusiasts) with new knowledge, understanding and exploration into many varied and interesting data domains, alongside the opportunity to network with many others involved at various levels in the exciting data science scene.

See certificate

#### French Language Support (Soutien linguistique en français, SLF)

Level attained: A1+ according to the Common European Framework of Reference for Languages (CEFR).

Description: Soutien linguistique en français is a semester-long program of evening classes for learning French and for French language support for academic purposes. It is mainly offered to students and faculty researchers enrolled at the Université de Caen Normandie, as a complement to their disciplinary studies and/or their laboratory research. The program is also open to individuals whose professional activities and/or family situation are not compatible with the schedules of our intensive DUEF-type courses. Participants must hold a diploma or qualification giving access to higher education in France.

See certificate

#### Ph.D. in Statistics and Data Science

Co-advisor: Senior Lecturer Hien Duy Nguyen (University of Queensland).

Thesis title: Model selection and approximation in high-dimensional mixtures of experts models: from theory to practice.

Manuscript: PDF.

Slide: PDF.

Code: GitHub.

See certificate

#### Deep Learning Specialization

Instructor: Andrew Ng (Stanford University).

There are 5 Courses in this Specialization:

• Neural Networks and Deep Learning (Grade: 100%).

• Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization (Grade: 100%).

• Structuring Machine Learning Projects (Grade: 98.3%).

• Convolutional Neural Networks (Grade: 98.9%).

• Sequence Models (Grade: 100%).

Description: The Deep Learning Specialization is a foundational program that will help you understand the capabilities, challenges, and consequences of deep learning and prepare you to participate in the development of leading-edge AI technology. In this Specialization, you will build and train neural network architectures such as Convolutional Neural Networks, Recurrent Neural Networks, LSTMs, Transformers, and learn how to make them better with strategies such as Dropout, BatchNorm, Xavier/He initialization, and more. Get ready to master theoretical concepts and their industry applications using Python and TensorFlow and tackle real-world cases such as speech recognition, music synthesis, chatbots, machine translation, natural language processing, and more. AI is transforming many industries. The Deep Learning Specialization provides a pathway for you to take the definitive step in the world of AI by helping you gain the knowledge and skills to level up your career. Along the way, you will also get career advice from deep learning experts from industry and academia.

See certificate

#### Master of Science, Technology and Health

Master 2 Program of Mathematics. Major in Applied Mathematics.

GPA: $18.0/20.0$.

Mention: Très Bien.

See certificate

#### The International English Language Testing System (IELTS)

Overall Band Score: $7.0/9.0$.

CEFR Level: C1.

Description: IELTS – the International English Language Testing System – is the world’s most popular English language test. It is developed by some of the world’s leading experts in language assessment and evaluates all of your English skills — reading, writing, listening and speaking. The test reflects how you’ll use English to study, work and live in an English speaking environment. You can take the test at any of our official test centres across the world.

See certificate

#### The Test of English for International Communication (TOEIC) Listening and Reading Test

Total Score: $795/990$ (Listening: $410/495$ + Reading: $385/495$).

Description: The TOEIC Listening and Reading Test measures listening and reading skills for beginner to advanced levels of English.

See certificate

#### Machine Learning

Instructor: Andrew Ng (Stanford University).

Description: Machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome. Machine learning is so pervasive today that you probably use it dozens of times a day without knowing it. Many researchers also think it is the best way to make progress towards human-level AI. In this class, you will learn about the most effective machine learning techniques, and gain practice implementing them and getting them to work for yourself. More importantly, you’ll not only learn about the theoretical underpinnings of learning, but also gain the practical know-how needed to quickly and powerfully apply these techniques to new problems. Finally, you’ll learn about some of Silicon Valley’s best practices in innovation as it pertains to machine learning and AI. This course provides a broad introduction to machine learning, data mining, and statistical pattern recognition. The course will also draw from numerous case studies and applications, so that you’ll also learn how to apply learning algorithms to building smart robots (perception, control), text understanding (web search, anti-spam), computer vision, medical informatics, audio, database mining, and other areas.

Topics include:

(i) Supervised learning (parametric or non-parametric algorithms, support vector machines, kernels, neural networks).

(ii) Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning).

(iii) Best practices in machine learning (bias/variance theory; innovation process in machine learning and AI).

See certificate

#### Bachelor of Science

Honors Program in Mathematics and Computer Science. Major in Probability and Statistics, minor in Numerical Analysis.

GPA: $9.17/10.0$.

Rank: $2/1557$. Summa Cum Laude.

See certificate