The glmnet
package uses a range of LASSO
tuning parameters lambda
scaled from the maximal lambda_max
under which no predictors are selected. I want to find out how glmnet
computes this lambda_max
value. For example, in a trivial dataset:
set.seed(1)
library("glmnet")
x <- matrix(rnorm(100*20),100,20)
y <- rnorm(100)
fitGLM <- glmnet(x,y)
max(fitGLM$lambda)
# 0.1975946
The package vignette (http://www.jstatsoft.org/v33/i01/paper) describes in section 2.5 that it computes this value as follows:
sx <- as.matrix(scale(x))
sy <- as.vector(scale(y))
max(abs(colSums(sx*sy)))/100
# 0.1865232
Which clearly is close but not the same value. So, what causes this difference? And in a related question, how could I compute lambda_max
for a logistic regression?
To get the same result you need to standardize the variables using a standard deviation with n
instead of n-1
denominator.
mysd <- function(y) sqrt(sum((y-mean(y))^2)/length(y))
sx <- scale(x,scale=apply(x, 2, mysd))
sx <- as.matrix(sx, ncol=20, nrow=100)
sy <- as.vector(scale(y, scale=mysd(y)))
max(abs(colSums(sx*sy)))/100
## [1] 0.1758808
fitGLM <- glmnet(sx,sy)
max(fitGLM$lambda)
## [1] 0.1758808
It seems lambda_max
for a logistic regression is calculated similarly, with weights based on class proportions:
set.seed(1)
library("glmnet")
x <- matrix(rnorm(100*20),100,20)
y <- rnorm(100)
mysd <- function(y) sqrt(sum((y-mean(y))^2)/length(y))
sx <- scale(x, scale=apply(x, 2, mysd))
sx <- as.matrix(sx, ncol=20, nrow=100)
y_bin <- factor(ifelse(y<0, -1, 1))
prop.table(table(y_bin))
# y_bin
# -1 1
# 0.62 0.38
fitGLM_log <- glmnet(sx, y_bin, family = "binomial")
max(fitGLM_log$lambda)
# [1] 0.1214006
max(abs(colSums(sx*ifelse(y<0, -.38, .62))))/100
# [1] 0.1214006
For your second question, look to Friedman et al's paper, "Regularization paths for generalized linear models via coordinate descent". In particular, see equation (10), which is equality at equilibrium. Just check under what conditions the numerator $S(\cdot,\cdot)$ is zero for all parameters.
According to help("glmnet")
the maximal lambda value is "the smallest value for which all coefficients are zero":
sum(fitGLM$beta[, which.max(fitGLM$lambda)])
#[1] 0
sum(glmnet(x,y, lambda=max(fitGLM$lambda)*0.999)$beta)
#[1] -0.0001809804
At a quick glance the value seems to be calculated by the Fortran code called by elnet
.