Analysis of Large Data Sets
Laura Freijeiro Gonzalez (Santiago de Compostela University)
Seminar date:
Friday, 9 April 2021, 15:30
LASSO regression has been widely used in the high-dimensional linear regression model because of its ability to reduce the dimension of the problem. However, rigid assumptions on the covariate matrix and the sample size, as well as sparsity of the coefficient vector, are needed to guarantee good behavior of the algorithm. LASSO also has drawbacks related to bias and to the correct selection of covariates. All these characteristics have been studied assuming independence, but not under dependence structures among covariates. To fill this gap, examples of these drawbacks are shown by means of an extensive simulation study that makes use of different dependence scenarios. Moreover, a broad comparison with LASSO derivatives and alternatives is carried out, resulting in some guidance about which procedures perform best depending on the nature of the data.
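As a rough illustration of the kind of experiment the abstract describes, the following minimal sketch fits LASSO by coordinate descent on simulated data with correlated covariates and counts correct and spurious selections. It is not the speaker's simulation design: the Toeplitz correlation structure, the sample size, the dimension, the sparsity level, the penalty value, and the helper names `soft_threshold`, `lasso_cd`, and `simulate` are all assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n  # per-column scale X_j'X_j / n
    resid = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (coordinate j removed)
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            # Keep the residual in sync with the updated coefficient
            resid += X[:, j] * (beta[j] - new_bj)
            beta[j] = new_bj
    return beta


def simulate(n, p, s, rho, rng):
    """Hypothetical scenario: Toeplitz correlation rho**|i-j|, first s coefficients equal to 1."""
    idx = np.arange(p)
    sigma = rho ** np.abs(idx[:, None] - idx[None, :])
    X = rng.multivariate_normal(np.zeros(p), sigma, size=n)
    beta = np.zeros(p)
    beta[:s] = 1.0
    y = X @ beta + rng.standard_normal(n)
    return X, y, beta


if __name__ == "__main__":
    rng = np.random.default_rng(2021)
    for rho in (0.0, 0.5, 0.9):  # independence versus increasing dependence
        X, y, beta = simulate(n=100, p=50, s=5, rho=rho, rng=rng)
        est = lasso_cd(X, y, lam=0.1)  # arbitrary penalty, not tuned
        selected = est != 0
        true_pos = int(np.sum(selected & (beta != 0)))
        false_pos = int(np.sum(selected & (beta == 0)))
        print(f"rho={rho}: true positives={true_pos}, false positives={false_pos}")
```

A fuller study would tune the penalty (for example by cross-validation) and average over many replications; a single draw with a fixed penalty, as here, only hints at how selection can change as the dependence grows.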