Statistical models specification under a kernel-based approach

This workshop will address topics of current interest in the specification of statistical models under a kernel-based approach.
The activity is organized by:
- Grupo Modelos de Optimización, Decisión, Estatística e Aplicacións (MODESTYA), Department of Statistics, Mathematical Analysis and Optimization, Faculty of Mathematics (USC)
- Centro de Investigación e Tecnoloxía Matemática de Galicia (CITMAga)
Date: 21 March 2025, from 10:00 to 13:00.
Venue: Salón de Graos, Faculty of Mathematics (USC).
Program:
Time | Speaker | Title | Abstract |
10:00-11:00 | Pedro Galeano (Universidad Carlos III de Madrid) | A generalized distance covariance approach for testing linearity and independence in scalar-on-function regression with missing at random responses | "Scalar-on-function regression models are used to predict a scalar response from a functional covariate. We propose a test for simultaneously testing the linearity of the relationship between a scalar response and a functional covariate, and the independence between the functional covariate and the error term of the model, in the presence of Missing At Random (MAR) responses. The test statistic is just the generalized distance covariance between the functional covariate and the residuals from a linear model fit. To handle MAR responses, we consider three functional principal component (FPC)-based slope estimation methods: the simplified method, which discards observations with missing responses, potentially losing valuable information; the imputed method, which utilizes more data by filling in missing responses using the simplified slope estimate; and the inverse probability weighted method, which goes beyond imputation by accounting for the missing data mechanism, assigning weights to observations based on their missingness probability. We use cross-validation to select optimal FPCs for each method. The distribution of the test statistic under the null hypothesis is calibrated using residual bootstrap. Monte Carlo simulations indicate that the simplified method yields less powerful tests, while imputation-based methods perform significantly better. We illustrate our approach by analyzing a model explaining average daily temperatures using the average number of sunny days at Spanish meteorological stations." |
11:00-11:30 | Coffee break | | |
11:30-12:00 | María Vidal García (Universidade de Santiago de Compostela) | A kernel-based specification test for the regression function | "Over the last decade there has been considerable progress in the development of statistical methods relying on the kernel approach. The kernel approach arises in the field of Machine Learning when dealing with classification problems, and has become an important tool in Statistics. This methodology has been applied to problems such as the two-sample problem, testing independence (using the associated notions of energy distance and distance correlation) and goodness-of-fit tests for the distribution function. This work proposes a kernel-based approach to the goodness-of-fit test for the regression function in the context of scale-location models. Following previous literature, it relies on expressing the original contrast as a contrast over the distribution of the residuals. These ideas have already been successfully applied in two proposals based on a Kolmogorov-Smirnov type of statistic and a Cramér-von Mises type of statistic respectively. The goal is to provide a new alternative using a different notion of distance, the maximum mean discrepancy, which arises when embedding the initial observations into a Reproducing Kernel Hilbert Space. A review of the advantages of such a distance and of the main assumptions required for this construction to work is first provided. Next, the new proposal is presented along with its theoretical properties. A final section is included to explain the issues regarding implementation. As is often the case, an implementation based on the theoretical asymptotic distribution leads to poor computational results, so resampling methods are preferred. Both residual bootstrap and smooth residual bootstrap accounting for heteroskedasticity are implemented. A simulation study is presented to assess the performance of the proposed method and compare it with alternatives in the literature." |
12:00-12:30 | Laura Freijeiro González (Universidad de Oviedo) | New proposals of specification tests for synchronous and asynchronous functional concurrent models | "Functional concurrent models are a special case of functional regression models where both the response (Y) and the covariates (X) are functions of the same argument t and the relation between them is concurrent, or point-by-point. For these, one can distinguish between synchronous (SFCM) and asynchronous (AFCM) designs. This depends on whether the observed curves for Y(t) and X(t) are measured at the same or at distinct time instants for each individual, respectively. In the existing literature, it is often assumed that a specific structure (linear, additive, etc.) exists between Y(t) and X(t) to estimate the regression model. However, this assumption can be quite restrictive and hard to verify in practice. Additionally, estimation of the model can be a tough problem when there are multiple covariates. To address this issue, we propose new specification tests for dimensionality reduction in the general FCM formulation. For SFCM, we extend the martingale difference divergence (MDD) approach, resulting in novel specification tests for the conditional expectation of Y given X. We also provide some insights for future extension to the quantile structure. Concerning the AFCM, we adapt the conditional distance covariance (CDC) ideas of Wang et al. (2015) to develop new specification tests for the general formulation of the AFCM." Shao, X. and Zhang, J. (2014). Martingale difference correlation and its use in high-dimensional variable screening. Journal of the American Statistical Association, 109(507):1302–1318. Wang, X., Pan, W., Hu, W., Tian, Y., and Zhang, H. (2015). Conditional distance correlation. Journal of the American Statistical Association, 110(512):1726–1734. |
12:30-13:00 | Daniel Diz Castro (Universidade de Santiago de Compostela) | A new model-free covariate significance test for weak dependent data | "Significance tests are invaluable tools for detecting redundant covariates in regression, enabling the proposal of simpler, more parsimonious models. This is particularly relevant in nonparametric regression with high-dimensional or functional data due to the effects of the curse of dimensionality on estimation and inference procedures. A new kernel-based test, designed to be robust against the curse of dimensionality, is presented in this contribution. We provide some results establishing the asymptotic behavior of the proposed test statistic under weak dependence assumptions and we also illustrate the finite sample performance of the test through simulations." |
Registration (click here).
Virtual connection (click here).
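For readers unfamiliar with the maximum mean discrepancy (MMD) mentioned in the 11:30 talk, the following is a minimal sketch of the biased (V-statistic) estimate of the squared MMD between two samples with a Gaussian kernel. It is an illustration only and not the speaker's test: the function names `gaussian_gram` and `mmd2_biased`, the fixed `bandwidth`, and the simulated data are assumptions made for this example, and no calibration step (such as the bootstrap schemes named in the abstract) is shown.

```python
import numpy as np


def gaussian_gram(a, b, bandwidth):
    """Gaussian kernel matrix k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 h^2))."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Treat 1-D inputs as n observations of a scalar variable.
    a = a.reshape(a.shape[0], -1)
    b = b.reshape(b.shape[0], -1)
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    kxx = gaussian_gram(x, x, bandwidth)
    kyy = gaussian_gram(y, y, bandwidth)
    kxy = gaussian_gram(x, y, bandwidth)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    same_a = rng.normal(size=200)           # N(0, 1)
    same_b = rng.normal(size=200)           # N(0, 1), independent draw
    shifted = rng.normal(loc=2.0, size=200)  # N(2, 1)
    print("same distribution:   ", mmd2_biased(same_a, same_b))
    print("shifted distribution:", mmd2_biased(same_a, shifted))
```

The estimate is close to zero when both samples come from the same distribution and grows as the distributions move apart. In a goodness-of-fit setting one of the two samples would typically be a set of residuals, and the bandwidth would be chosen by some data-driven rule rather than fixed at 1.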