MadLearning

Deep Learning Mathematics: From Theory to Applications

Preview

François Malgouyres, Professor, Institut de Mathématiques de Toulouse and Université de Toulouse

The MadLearning project explores the geometry of neural networks and its impact on the optimization landscape and on the regularization of learned functions. It analyzes how the properties of the objective function influence the trajectories of stochastic algorithms and of the straight-through estimator. The theoretical results are compared with practice and inform the design of efficient, high-performing State-Space Models (SSMs), particularly for applications in computer vision and time series modeling.

Keywords: Neural network geometry, Optimization landscape, Implicit regularization, Stochastic gradient algorithm, Straight-through estimator, State-Space Models, Quantized neural networks

Missions

Our research

Geometric study of neural networks

Study the local dimension of the image of a sample under a neural network as the network parameters vary. This local dimension characterizes both the regularity of the learned function and the geometry of the objective function.

Analyze this local dimension for different architectures, such as State-Space Models (SSMs), Transformers, and ResNet.

Highlight the specific properties of neural networks with low local dimension.
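As a toy illustration of the local dimension discussed above, the sketch below computes the rank of the Jacobian of the map from parameters to network outputs on a fixed sample, for a one-hidden-layer ReLU network with scalar input and output. Using this rank as the local dimension, as well as the network shape, the sample, and every function name, are assumptions made for the example; this is not the project's definition or code.

```python
def relu(z):
    return z if z > 0.0 else 0.0

def jacobian(theta, xs):
    """Jacobian of theta -> (f_theta(x) for x in xs).

    theta is a list of hidden neurons (w, b, v) and
    f_theta(x) = sum_j v_j * relu(w_j * x + b_j).
    Each row is a sample point; columns are d/dw_j, d/db_j, d/dv_j.
    """
    rows = []
    for x in xs:
        row = []
        for (w, b, v) in theta:
            active = 1.0 if w * x + b > 0.0 else 0.0
            row += [v * x * active, v * active, relu(w * x + b)]
        rows.append(row)
    return rows

def rank(matrix, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        pivot = max(range(r, n_rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) <= tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            for k in range(c, n_cols):
                m[i][k] -= factor * m[r][k]
        r += 1
    return r

sample = [-2.0, -1.0, 0.5, 1.0, 2.0, 3.0]

# Both neurons are active on part of the sample.
theta_generic = [(1.0, 0.5, 2.0), (-1.0, 1.5, -1.0)]
# The second neuron is inactive on the whole sample.
theta_dead = [(1.0, 0.5, 2.0), (-1.0, -3.0, -1.0)]

dim_generic = rank(jacobian(theta_generic, sample))  # 4
dim_dead = rank(jacobian(theta_dead, sample))        # 2
print(dim_generic, dim_dead)
```

In this example the rank is below the parameter count (6) even in the generic case, because rescaling a neuron's input weights by a positive factor and its output weight by the inverse leaves the function unchanged. It drops further when a neuron is inactive on the whole sample.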

Study of stochastic algorithms

Study the behavior of different stochastic algorithms when the objective function has varied structures, particularly flat valleys whose bottoms are composed of local minima.
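To make the setting concrete, here is a minimal sketch of stochastic gradient descent on an objective whose minimizers form a continuous valley. The objective (x*y - 1)^2 / 2, the noisy-target model of stochasticity, the step size, and all names are assumptions chosen for illustration; they are not taken from the project.

```python
import random

def sgd_on_valley(x, y, step=0.05, noise=0.1, n_steps=2000, seed=0):
    """Run SGD on f(x, y) = 0.5 * (x * y - 1)**2.

    The set of minimizers is the hyperbola x * y = 1. Stochasticity is
    modeled (as an assumption) by a noisy target 1 + eps at each step.
    """
    rng = random.Random(seed)
    for _ in range(n_steps):
        target = 1.0 + rng.gauss(0.0, noise)
        residual = x * y - target
        # Stochastic gradient of 0.5 * (x * y - target)**2 w.r.t. (x, y).
        x, y = x - step * residual * y, y - step * residual * x
    return x, y

x_start, y_start = 2.0, 0.2
x_end, y_end = sgd_on_valley(x_start, y_start)

print("distance to the valley:", abs(x_end * y_end - 1.0))
print("imbalance before:", abs(x_start**2 - y_start**2))
print("imbalance after: ", abs(x_end**2 - y_end**2))
```

With these updates, each step multiplies x^2 - y^2 by 1 - step^2 * residual^2, so the noise slowly moves the iterate along the valley toward the balanced point while it stays close to x * y = 1.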

Study of the straight-through estimator

The straight-through estimator is the preferred algorithm for optimizing the weights of neural networks when they are constrained to quantized values. This approach is essential for designing efficient and/or embeddable models. However, its performance and behavior remain poorly understood.

Analyze this estimator under different assumptions concerning the properties of the objective function in order to shed light on its mechanisms.
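The following sketch shows one common form of the straight-through estimator on a toy problem: weights are quantized to {-1, +1} in the forward pass, and the gradient computed at the quantized weights is applied directly to real-valued latent weights. The linear model, the sign quantizer, the data, and the function names are assumptions made for illustration, not the project's setting.

```python
def quantize(w):
    """Quantize each latent weight to -1.0 or +1.0 (sign, with 0 mapped to +1)."""
    return [1.0 if w_i >= 0.0 else -1.0 for w_i in w]

def loss_and_grad(q, xs, ys):
    """Mean squared loss of the linear model x -> <q, x> and its gradient in q."""
    n = len(xs)
    loss = 0.0
    grad = [0.0] * len(q)
    for x, y in zip(xs, ys):
        err = sum(q_i * x_i for q_i, x_i in zip(q, x)) - y
        loss += 0.5 * err * err / n
        for i, x_i in enumerate(x):
            grad[i] += err * x_i / n
    return loss, grad

def train_ste(w, xs, ys, step=0.1, n_steps=20):
    """Straight-through updates: forward with quantize(w), update latent w."""
    for _ in range(n_steps):
        q = quantize(w)
        _, grad_q = loss_and_grad(q, xs, ys)
        # The quantizer is treated as the identity in the backward pass,
        # so the gradient with respect to q is used as the gradient for w.
        w = [w_i - step * g_i for w_i, g_i in zip(w, grad_q)]
    return w

w_star = [1.0, -1.0, 1.0]  # hypothetical quantized target weights
xs = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 0.0)]
ys = [sum(a * b for a, b in zip(w_star, x)) for x in xs]

w_latent = train_ste([-0.25, -0.2, 0.1], xs, ys)
q_final = quantize(w_latent)
final_loss, _ = loss_and_grad(q_final, xs, ys)
print(q_final, final_loss)
```

Here the first latent weight starts with the wrong sign and crosses zero after three updates, after which the quantized weights match the target and the gradient vanishes.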

Application to SSMs

State-Space Models (SSMs) are neural network architectures that can solve certain tasks more efficiently than competing architectures, while offering reduced algorithmic complexity.

Adapt this architecture to applications in computer vision and time series modeling and processing.
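For readers unfamiliar with SSMs, the sketch below shows the linear recurrence at the core of such a layer, h_t = A h_{t-1} + B u_t and y_t = C h_t, together with its equivalent convolution form with kernel K_k = C A^k B. The diagonal state matrix, the scalar input and output, and all names are simplifying assumptions for this example; practical SSM architectures add learned parametrizations, nonlinearities, and several layers.

```python
def ssm_recurrent(a, b, c, u):
    """Run a diagonal linear SSM as a recurrence, in time linear in len(u).

    a, b, c hold the diagonal of A and the entries of B and C.
    """
    h = [0.0] * len(a)
    ys = []
    for u_t in u:
        h = [a_i * h_i + b_i * u_t for a_i, b_i, h_i in zip(a, b, h)]
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return ys

def ssm_kernel(a, b, c, length):
    """Convolution kernel K_k = sum_i c_i * a_i**k * b_i for k < length."""
    return [
        sum(c_i * (a_i ** k) * b_i for a_i, b_i, c_i in zip(a, b, c))
        for k in range(length)
    ]

def ssm_convolution(a, b, c, u):
    """Same input-output map, written as a causal convolution with K."""
    kernel = ssm_kernel(a, b, c, len(u))
    return [
        sum(kernel[j] * u[t - j] for j in range(t + 1))
        for t in range(len(u))
    ]

a = [0.9, 0.5]    # diagonal of A (hypothetical values)
b = [1.0, 2.0]
c = [0.5, -1.0]
u = [1.0, 0.0, -1.0, 2.0, 0.5]

y_rec = ssm_recurrent(a, b, c, u)
y_conv = ssm_convolution(a, b, c, u)
print(max(abs(p - q) for p, q in zip(y_rec, y_conv)))
```

The recurrent form processes a sequence one step at a time with a fixed-size state, while the convolution form exposes the whole sequence at once; the two give the same outputs up to rounding.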

Consortium

Université de Toulouse (UT), Ecole d’Economie et de science sociale de Toulouse (TSE), Université Grenoble Alpes (UGA), CNRS, Université de Lille, IRT Saint-Exupéry, Brown University


Other projects

Géné-Pi

Mathematics of generative models

MacLeOD

Machine learning on geometries and distributions

MAGICALL

Mathematics of generative models: an interdisciplinary analysis of loss function landscapes

PERSNET

PERsistent Structures in Neural NETworks

PRODIGE-AI

PRObability, ranDom matrIx theory, Geometry and gEneralization for generative-AI

TENSOR4ML

TENSOR methods FOR mastering modern Machine Learning

THEOREM

Theory for more efficient generative models
