Approximation and non-asymptotic model selection in mixture of experts models



Mixtures of experts (MoE) models are a ubiquitous tool for the analysis of heterogeneous data across many fields, including statistics, bioinformatics, pattern recognition, economics, and medicine. They provide conditional constructions for regression in which the mixture weights, along with the component densities, are explained by the predictors, allowing for flexible modeling of data arising from complex generating processes. In this work, we consider the Gaussian-gated localized MoE (GLoME) regression model for modeling heterogeneous data. This model poses challenging statistical estimation and model selection problems, from both computational and theoretical points of view. We study the problem of estimating the number of components of the GLoME model in a penalized maximum likelihood estimation framework. We provide a lower bound on the penalty that ensures a weak oracle inequality is satisfied by our estimator. In particular, these results provide a strong theoretical guarantee: a finite-sample oracle inequality, satisfied by the penalized maximum likelihood estimator with a Jensen–Kullback–Leibler type loss, that supports the slope heuristic criterion in a finite-sample setting, in contrast to classical asymptotic criteria. This allows the calibration of penalty functions, which are known only up to a multiplicative constant, and adapts to the complexity of the considered random collection of MoE models, including the number of mixture components. To support our theoretical results, we perform numerical experiments on simulated and real data, illustrating the performance of our finite-sample oracle inequality.
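As a rough illustration of the penalized maximum likelihood model selection described above, the following Python sketch selects the number of mixture components by maximizing a penalized log-likelihood over a small collection of candidate models. It uses scikit-learn's `GaussianMixture` as a simple stand-in for the GLoME model (not the actual model from the talk), and the penalty constant `kappa` is a hypothetical placeholder; calibrating that multiplicative constant is precisely what the slope heuristic addresses.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Simulate heterogeneous regression data with two latent regimes
# (a hypothetical example, not data from the talk).
rng = np.random.default_rng(0)
n = 600
x = rng.uniform(-1, 1, size=n)
z = (x + rng.normal(0, 0.3, size=n) > 0).astype(int)  # latent component label
y = np.where(z == 0, 2 * x + 1, -3 * x) + rng.normal(0, 0.2, size=n)
data = np.column_stack([x, y])

# Penalized maximum likelihood over a collection of models K = 1..6.
# Penalty: kappa * (model dimension) * log(n); the constant kappa is
# an assumption here -- in practice it is calibrated by the slope heuristic.
kappa = 0.5
scores = {}
for K in range(1, 7):
    gm = GaussianMixture(n_components=K, random_state=0).fit(data)
    loglik = gm.score(data) * n       # score() is per-sample average log-likelihood
    dim = (K - 1) + K * (2 + 3)       # free params: weights + means + covariances (d=2)
    scores[K] = loglik - kappa * dim * np.log(n)

K_hat = max(scores, key=scores.get)
print("selected number of components:", K_hat)
```

The same loop structure applies to any penalized criterion: only the likelihood computation and the model dimension change with the model family.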

Sep 30, 2021 — Oct 1, 2021
INSA Rouen Normandie, Rouen, France
TrungTin Nguyen

A central theme of my research is Data Science, at the interface of Statistical Learning, Machine Learning, and Optimization.