posted on 2024-05-09, 16:51, authored by Carlos Misael Madrid Padilla
This thesis examines statistical methods for analyzing multidimensional data, focusing on nonparametric function estimation through Trend Filtering. It applies this approach to temporal and spatial datasets, with relevance to domains such as climatology, finance, neuroscience, and agriculture. Two intertwined themes run through the work: the development of computationally efficient algorithms and the establishment of strong statistical guarantees for estimators across a broad range of frameworks.
In the first chapter, we outline the motivation, theoretical background, and main challenges addressed by this thesis, providing a common framework for the in-depth studies that follow.
The second chapter focuses on the estimation of a nonparametric regression function for data with simultaneous temporal and spatial dependence. In this context, we study Trend Filtering, a nonparametric estimator introduced by [33] and [41]. To the best of our knowledge, this estimator has not previously been examined in such a setting. In the univariate setting, the signals we consider are assumed to have a kth weak derivative with bounded total variation, allowing for a general degree of smoothness. In the multivariate setting, we study a K-Nearest Neighbor fused lasso estimator as in [35], computed with an ADMM algorithm and suitable for signals of bounded variation that satisfy a piecewise Lipschitz continuity criterion. We validate the minimax optimality of both the univariate and multivariate estimators by matching their rates with lower bounds. Our analysis also reveals a phase transition phenomenon not previously observed in Trend Filtering studies. Simulation studies and real data applications show that our method outperforms established techniques from the existing literature.
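As an illustration of the multivariate estimator mentioned above, the following is a minimal sketch of a K-Nearest Neighbor fused lasso fitted by ADMM. It is not the implementation used in the thesis: it assumes independent observations, the objective (1/2)||y - theta||^2 + lam * ||D theta||_1 with D the edge incidence matrix of the K-NN graph, and a standard scaled-form ADMM. The function names (`knn_edges`, `incidence_matrix`, `knn_fused_lasso`) and the parameters `k`, `lam`, `rho`, and `n_iter` are hypothetical and chosen only for this example.

```python
import numpy as np


def knn_edges(X, k):
    """Undirected edge list of the K-NN graph built on the rows of X."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Brute-force pairwise squared distances; adequate for small n only.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]
    # Symmetrize: keep edge (i, j) if j is a neighbor of i or vice versa.
    edges = {(min(i, int(j)), max(i, int(j))) for i in range(n) for j in nbrs[i]}
    return sorted(edges)


def incidence_matrix(edges, n):
    """Edge incidence matrix D, so that (D @ theta)[e] = theta[i] - theta[j]."""
    D = np.zeros((len(edges), n))
    for row, (i, j) in enumerate(edges):
        D[row, i], D[row, j] = 1.0, -1.0
    return D


def knn_fused_lasso(X, y, k=5, lam=1.0, rho=1.0, n_iter=500):
    """Assumed objective: 0.5 * ||y - theta||^2 + lam * ||D theta||_1."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    D = incidence_matrix(knn_edges(X, k), n)
    A = np.eye(n) + rho * D.T @ D  # system matrix of the theta-update
    z = np.zeros(D.shape[0])       # auxiliary copy of D @ theta
    u = np.zeros(D.shape[0])       # scaled dual variable
    for _ in range(n_iter):
        # theta-update: ridge-type linear system.
        theta = np.linalg.solve(A, y + rho * D.T @ (z - u))
        # z-update: soft-thresholding at level lam / rho.
        v = D @ theta + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        # Dual update on the constraint D @ theta = z.
        u = v - z
    return theta
```

A call such as `knn_fused_lasso(X, y, k=5, lam=0.5)` returns one fitted value per observation. A dense incidence matrix and a fixed iteration count keep the sketch short; a practical implementation would likely use sparse matrices, a cached factorization of the system matrix, and a residual-based stopping rule.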
Finally, the third chapter summarizes the contributions of this thesis and provides possible directions for future work.