MIDI Seminar: Faïcel CHAMROUKHI
March 16, 2023 | 10:00 - 11:00
Title: Principled and interpretable learning via new mixtures for heterogeneous high-dimensional and distributed data.
Faïcel CHAMROUKHI, IRT SystemX and Université de Caen.
Abstract: Modern machine learning algorithms deal with real-world data problems that arise in complex scenarios, including unlabeled heterogeneous data, prediction with high-dimensional or functional predictors, and massive or distributed data. In this framework, we present a new family of mixture-of-experts (ME) models that enjoy denseness properties, statistical estimation guarantees, and easier interpretation, designed for heterogeneous data with high-dimensional and functional inputs and for distributed scenarios. First, we present ME models for high-dimensional predictors, or for predictors that are potentially noisy observations of entire functions, together with Lasso-like regularizations and EM-Lasso optimization that provide sparse and interpretable representations. Then, we consider the situation in which the data are potentially massive and may be distributed for computational purposes, or are distributed by nature. We present a distributed learning approach for these ME models, with an aggregation strategy based on optimal transport that combines local estimators fitted in parallel into a reduced estimator that enjoys statistical guarantees and is computationally efficient.
Bio: Faïcel Chamroukhi has been scientific lead for data science and artificial intelligence at IRT SystemX since September 2022, on secondment from Université de Caen, where he has been professor of statistics and data science since September 2016. His primary research interests include statistical inference and machine learning, latent variable models, and unsupervised learning in high-dimensional and large-scale scenarios.