Feature Screening with Large Scale and High Dimensional Censored Data
Conference
64th ISI World Statistics Congress
Format: IPS Abstract
Keywords: large, scale, survival
Session: IPS 78 - Statistical Challenges in Computational Advertising
Monday 17 July 4 p.m. - 5:25 p.m. (Canada/Eastern)
Abstract
Very large data sets pose great challenges for modeling, inference, and computation. In handling big data, much attention has been directed to “large p, small n” settings, and relatively less work has addressed problems in which both p and n are large, even though such data are now more accessible than before. To carry out valid statistical analysis, it is imperative to screen out noise variables that have no predictive value for the outcome variable. In this talk, we present a screening method for large-scale survival data, where the sample size n is large and the dimension p of the covariates is of non-polynomial order in the sample size. We rigorously establish theoretical results for the proposed method and conduct numerical studies to assess its performance. Our research extends existing work in several directions and enlarges the scope of high-dimensional data analysis. The proposed method capitalizes on the connections among useful regression settings and offers a computationally efficient screening procedure.