16-17 Sept. 2021, Fontainebleau (France)
A Statistical Learning View of Simple Kriging. Connection with Kernel Ridge Regression
Emilia Siviero 1, Emilie Chautru 2, Stéphan Clémençon 3, 4
1 : Télécom Paris
LTCI, Télécom ParisTech
19 Place Marguerite Perey 91120 Palaiseau -  France
2 : Mines ParisTech, centre de Géosciences, équipe Géostatistique
MINES ParisTech, PSL Research University
3 : Telecom Paris
Télécom ParisTech
4 : Laboratoire Traitement et Communication de l'Information [Paris] (LTCI)
Télécom ParisTech, CNRS : UMR5141
CNRS LTCI Télécom ParisTech 46 rue Barrault F-75634 Paris Cedex 13 -  France

Over the last decades, machine learning practice has advanced considerably with the design of many efficient algorithms (e.g. boosting methods, SVM, deep neural networks) for tasks such as classification, regression or clustering. It is supported by a sound probabilistic theory that relies essentially on the theory of empirical processes, i.e. collections of averages of independent and identically distributed random variables. In the Big Data era, however, massive datasets often contain geolocated, spatially dependent observations. In this context, the usual theory of statistical learning provides no theoretical guarantee on the generalization capacity of rules learnt from data. We consider here the simple kriging task, the flagship problem in geostatistics: the values of a square integrable random field $X=\{X_s\}_{s\in S}$, $S\subset \mathbb{R}^2$, with unknown covariance structure are to be predicted with minimum quadratic risk, based on a single realization of the spatial process observed at a finite number of locations. The connection between this minimization problem and kernel ridge regression is highlighted, as are the difficulties faced when trying to establish the generalization capacity of empirical risk minimizers. Particular attention is paid to a seminal example: the isotropic stationary Gaussian case. There, data are assumed to be collected at every point of a regular grid spanning the spatial domain $S$, which is assumed compact (in-fill setup). In this specific context, nonasymptotic bounds are proved for the excess risk of a plug-in predictive rule mimicking the true minimizer. Numerical experiments illustrate their validity and the role of each technical assumption. Though these assumptions may be restrictive, the result shows that simple kriging can be considered not only through the common parametric geostatistical modelling approach, but also from a predictive perspective, within a sound validity framework. This paves the way for further theoretical developments in statistical learning based on spatial data.
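To make the connection with kernel ridge regression concrete, here is a minimal sketch rather than the authors' method. It assumes a zero-mean field whose covariance function $c$ is known, whereas the abstract treats the covariance as unknown and studies a plug-in rule. Under that assumption, the simple kriging predictor at a site $s$ is $c(s)^\top \Sigma^{-1} X$, where $\Sigma$ is the covariance matrix of the observations and $c(s)$ is the vector of covariances between $X_s$ and the observed values. This has the same algebraic form as the kernel ridge regression solution when the kernel is taken to be $c$ and the ridge parameter plays the role of a nugget. The exponential covariance, the grid size, the nugget value and all function names below are illustrative assumptions.

```python
import numpy as np


def exponential_cov(a, b, variance=1.0, length_scale=0.3):
    """Isotropic stationary covariance c(h) = variance * exp(-|h| / length_scale).

    Illustrative choice; the abstract does not specify a covariance model.
    """
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return variance * np.exp(-dist / length_scale)


def simple_kriging(sites, values, targets, cov, nugget=0.0):
    """Simple kriging of a zero-mean field with known covariance (an assumption)."""
    sigma = cov(sites, sites) + nugget * np.eye(len(sites))
    # Kriging weights solve sigma @ w = c(sites, target), one column per target.
    weights = np.linalg.solve(sigma, cov(sites, targets))
    return weights.T @ values


def kernel_ridge(sites, values, targets, kernel, ridge):
    """Kernel ridge regression: f(s) = k(s, sites) @ (K + ridge * I)^{-1} @ values."""
    dual = np.linalg.solve(kernel(sites, sites) + ridge * np.eye(len(sites)), values)
    return kernel(targets, sites) @ dual


rng = np.random.default_rng(0)

# Regular 6 x 6 grid of observation sites on the unit square.
axis = np.linspace(0.0, 1.0, 6)
sites = np.stack(np.meshgrid(axis, axis), axis=-1).reshape(-1, 2)

# A single realization of a zero-mean Gaussian field at the sites (Cholesky sampling).
chol = np.linalg.cholesky(exponential_cov(sites, sites))
values = chol @ rng.standard_normal(len(sites))

# Unobserved locations where the field is to be predicted.
targets = rng.uniform(0.0, 1.0, size=(5, 2))

nugget = 0.1  # hypothetical value; acts as the ridge parameter in the KRR view
pred_kriging = simple_kriging(sites, values, targets, exponential_cov, nugget=nugget)
pred_krr = kernel_ridge(sites, values, targets, exponential_cov, ridge=nugget)

# The two predictors should agree up to floating-point error.
print(np.allclose(pred_kriging, pred_krr))
```

With the nugget set to zero, the kriging predictor interpolates the observed values at the sampling sites. In the KRR view this corresponds to letting the ridge parameter vanish.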

