smile.regression.LASSO

public class LASSO extends Object

Lasso (least absolute shrinkage and selection operator) regression. The Lasso is a shrinkage and selection method for linear regression. It minimizes the usual sum of squared errors, with a bound on the sum of the absolute values of the coefficients (i.e. L₁-regularized). It has connections to soft-thresholding of wavelet coefficients, forward stage-wise regression, and boosting methods.
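The description above states the bounded (constrained) form. As a rough illustration, the equivalent penalized form can be evaluated directly. The sketch below assumes the objective is the residual sum of squares plus `lambda` times the L₁ norm of the coefficients. Scaling conventions differ between references and libraries, and the class `LassoObjective` and its method are hypothetical names, not part of Smile.

```java
public class LassoObjective {
    /**
     * Evaluates sum_i (y_i - b0 - x_i . beta)^2 + lambda * sum_j |beta_j|.
     * Assumption: the intercept b0 is not penalized.
     */
    public static double objective(double[][] x, double[] y,
                                   double[] beta, double intercept, double lambda) {
        double rss = 0.0;
        for (int i = 0; i < x.length; i++) {
            double fitted = intercept;
            for (int j = 0; j < beta.length; j++) {
                fitted += x[i][j] * beta[j];
            }
            double r = y[i] - fitted;
            rss += r * r;                 // squared error term
        }
        double l1 = 0.0;
        for (double b : beta) {
            l1 += Math.abs(b);            // L1 penalty term
        }
        return rss + lambda * l1;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 0}, {0, 1}, {1, 1}};
        double[] y = {1, 2, 3};
        // beta = (1, 2) fits y exactly, so only the penalty 0.5 * (1 + 2) remains.
        System.out.println(objective(x, y, new double[]{1, 2}, 0.0, 0.5));
    }
}
```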

The Lasso typically yields a sparse solution, in which the parameter vector β has relatively few nonzero coefficients. In contrast, the solution of L₂-regularized least squares (i.e. ridge regression) typically has all coefficients nonzero. Because it effectively reduces the number of variables in the model, the Lasso also serves as a variable selection method.
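The contrast between the two penalties is easiest to see in the special case of an orthonormal design, where each coefficient can be treated separately. The sketch below is an illustration under that assumption only: the L₁ penalty soft-thresholds each ordinary least squares coefficient (small ones become exactly zero), while the L₂ penalty rescales it (nonzero values stay nonzero). The class and method names are hypothetical, and `t` simply stands for the threshold implied by the penalty strength.

```java
public class ShrinkageDemo {
    /** Soft-thresholding: moves b toward zero by t and clips at exactly zero. */
    public static double softThreshold(double b, double t) {
        if (Math.abs(b) <= t) {
            return 0.0;
        }
        return b > 0 ? b - t : b + t;
    }

    /** Ridge shrinkage for an orthonormal design: proportional scaling, never exactly zero. */
    public static double ridgeShrink(double b, double lambda) {
        return b / (1.0 + lambda);
    }

    public static void main(String[] args) {
        double[] ols = {3.0, -0.4, 0.2, -2.5};   // illustrative least squares coefficients
        for (double b : ols) {
            System.out.printf("ols=%6.2f  lasso=%6.2f  ridge=%6.2f%n",
                    b, softThreshold(b, 0.5), ridgeShrink(b, 0.5));
        }
    }
}
```

With these inputs the two small coefficients are zeroed by soft-thresholding but only scaled down by the ridge rule.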

For over-determined systems (more instances than variables, as is common in machine learning), we normalize the variables to mean 0 and standard deviation 1. For under-determined systems (fewer instances than variables, e.g. compressed sensing), we assume white noise (i.e. no intercept in the linear model) and do not perform normalization. Note that the solution is not unique in this case.
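The normalization described for the over-determined case can be sketched as a column-wise standardization. This is only an illustration of the idea: it assumes the sample standard deviation (dividing by n - 1), which may differ from what the library does internally, and `Standardize` is a hypothetical helper rather than a Smile class.

```java
public class Standardize {
    /**
     * Returns a copy of x whose columns have mean 0 and sample standard deviation 1.
     * Requires at least two rows; constant columns are mapped to all zeros.
     */
    public static double[][] standardize(double[][] x) {
        int n = x.length;
        int p = x[0].length;
        double[][] z = new double[n][p];
        for (int j = 0; j < p; j++) {
            double mean = 0.0;
            for (int i = 0; i < n; i++) {
                mean += x[i][j];
            }
            mean /= n;

            double ss = 0.0;
            for (int i = 0; i < n; i++) {
                double d = x[i][j] - mean;
                ss += d * d;
            }
            double sd = Math.sqrt(ss / (n - 1));

            for (int i = 0; i < n; i++) {
                z[i][j] = sd > 0 ? (x[i][j] - mean) / sd : 0.0;
            }
        }
        return z;
    }

    public static void main(String[] args) {
        double[][] z = standardize(new double[][]{{1, 10}, {2, 20}, {3, 30}});
        for (double[] row : z) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```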

There is no analytic formula or expression for the optimal solution to the L₁-regularized least squares problem, so the solution must be computed numerically. The objective function of L₁-regularized least squares is convex but not differentiable, which makes it more of a computational challenge than L₂-regularized least squares. The Lasso may be solved using quadratic programming or more general convex optimization methods, as well as by specific algorithms such as the least angle regression algorithm.
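To make "computed numerically" concrete, here is a minimal sketch of one simple iterative scheme, cyclic coordinate descent. It is a different algorithm from the methods named above, and it is not the solver this class uses. It assumes centered data (no intercept), the scaling 0.5 * ||y - Xβ||² + λ||β||₁, and a stopping rule based on the largest coefficient change rather than a duality gap. All names are hypothetical.

```java
public class LassoCoordinateDescent {
    /** Soft-thresholding operator used by each coordinate update. */
    public static double softThreshold(double z, double t) {
        if (Math.abs(z) <= t) {
            return 0.0;
        }
        return z > 0 ? z - t : z + t;
    }

    /** Minimizes 0.5 * ||y - X b||^2 + lambda * ||b||_1 by cyclic coordinate descent. */
    public static double[] fit(double[][] x, double[] y, double lambda, double tol, int maxIter) {
        int n = x.length;
        int p = x[0].length;
        double[] beta = new double[p];
        double[] residual = y.clone();          // y - X * beta with beta = 0

        double[] colNorm2 = new double[p];      // squared norm of each column
        for (int j = 0; j < p; j++) {
            for (int i = 0; i < n; i++) {
                colNorm2[j] += x[i][j] * x[i][j];
            }
        }

        for (int iter = 0; iter < maxIter; iter++) {
            double maxChange = 0.0;
            for (int j = 0; j < p; j++) {
                if (colNorm2[j] == 0.0) {
                    continue;                   // an all-zero column cannot carry a coefficient
                }
                // rho = x_j . (partial residual that excludes column j)
                double rho = colNorm2[j] * beta[j];
                for (int i = 0; i < n; i++) {
                    rho += x[i][j] * residual[i];
                }
                double updated = softThreshold(rho, lambda) / colNorm2[j];
                double delta = updated - beta[j];
                if (delta != 0.0) {
                    for (int i = 0; i < n; i++) {
                        residual[i] -= x[i][j] * delta;   // keep the residual in sync
                    }
                    beta[j] = updated;
                    maxChange = Math.max(maxChange, Math.abs(delta));
                }
            }
            if (maxChange < tol) {
                break;                          // coefficients have stopped moving
            }
        }
        return beta;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 0}, {0, 1}};
        double[] y = {3.0, 0.2};
        System.out.println(java.util.Arrays.toString(fit(x, y, 0.5, 1e-8, 100)));
    }
}
```

Each update is a one-dimensional problem with a closed-form soft-thresholding solution, which is how the sketch sidesteps the non-differentiability at zero.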

References

R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(1):267-288, 1996.
B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. Annals of Statistics, 32(2):407-499, 2004.
S.-J. Kim, K. Koh, M. Lustig, S. Boyd, and D. Gorinevsky. An interior-point method for large-scale L1-regularized least squares. IEEE Journal of Selected Topics in Signal Processing, 1(4), 2007.

Constructor Summary

| Constructor | Description |
| --- | --- |
| LASSO() | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static LinearModel | fit(Formula formula, DataFrame data) | Fits an L1-regularized least squares model. |
| static LinearModel | fit(Formula formula, DataFrame data, double lambda) | Fits an L1-regularized least squares model. |
| static LinearModel | fit(Formula formula, DataFrame data, double lambda, double tol, int maxIter) | Fits an L1-regularized least squares model. |
| static LinearModel | fit(Formula formula, DataFrame data, Properties params) | Fits an L1-regularized least squares model. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- LASSO
  
  public LASSO()
Method Details
- fit
  
  public static LinearModel fit(Formula formula, DataFrame data)
  
  Fits an L1-regularized least squares model.
  
  Parameters:
  
  formula - a symbolic description of the model to be fitted.
  
  data - the data frame of the explanatory and response variables. There is no need to include a constant column of 1s for the bias (intercept) term.
  
  Returns:
  
  the model.
- fit
  
  public static LinearModel fit(Formula formula, DataFrame data, Properties params)
  
  Fits an L1-regularized least squares model. The hyperparameters in params include:
  
  - smile.lasso.lambda: the shrinkage/regularization parameter. A larger lambda means more shrinkage. Choosing an appropriate value of lambda is important but difficult.
  - smile.lasso.tolerance: the tolerance for stopping iterations (relative target duality gap).
  - smile.lasso.iterations: the maximum number of interior-point method (IPM) Newton iterations.
  
  Parameters:
  
  formula - a symbolic description of the model to be fitted.
  
  data - the data frame of the explanatory and response variables. There is no need to include a constant column of 1s for the bias (intercept) term.
  
  params - the hyperparameters.
  
  Returns:
  
  the model.
- fit
  
  public static LinearModel fit(Formula formula, DataFrame data, double lambda)
  
  Fits an L1-regularized least squares model.
  
  Parameters:
  
  formula - a symbolic description of the model to be fitted.
  
  data - the data frame of the explanatory and response variables. There is no need to include a constant column of 1s for the bias (intercept) term.
  
  lambda - the shrinkage/regularization parameter.
  
  Returns:
  
  the model.
- fit
  
  public static LinearModel fit(Formula formula, DataFrame data, double lambda, double tol, int maxIter)
  
  Fits an L1-regularized least squares model.
  
  Parameters:
  
  formula - a symbolic description of the model to be fitted.
  
  data - the data frame of the explanatory and response variables. There is no need to include a constant column of 1s for the bias (intercept) term.
  
  lambda - the shrinkage/regularization parameter.
  
  tol - the tolerance for stopping iterations (relative target duality gap).
  
  maxIter - the maximum number of interior-point method (IPM) Newton iterations.
  
  Returns:
  
  the model.
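
For the Properties-based overload, the three keys documented above can be assembled with the standard `java.util.Properties` API. The sketch below is illustrative only: the helper class `LassoParams` is hypothetical, and the numeric values are arbitrary examples rather than the library's defaults. The Smile call itself is shown as a comment because it needs Smile on the classpath plus a `Formula` and `DataFrame` that are not constructed here.

```java
import java.util.Properties;

public class LassoParams {
    /** Builds a Properties object using the hyperparameter keys documented for LASSO.fit. */
    public static Properties lassoParams(double lambda, double tol, int maxIter) {
        Properties params = new Properties();
        params.setProperty("smile.lasso.lambda", Double.toString(lambda));
        params.setProperty("smile.lasso.tolerance", Double.toString(tol));
        params.setProperty("smile.lasso.iterations", Integer.toString(maxIter));
        return params;
    }

    public static void main(String[] args) {
        // Illustrative values only; they are not claimed to be the defaults.
        Properties params = lassoParams(0.1, 1E-4, 1000);
        for (String key : params.stringPropertyNames()) {
            System.out.println(key + " = " + params.getProperty(key));
        }

        // Assuming `formula` and `data` are already in scope, the documented overload is:
        // LinearModel model = LASSO.fit(formula, data, params);
        // The explicit-argument overload takes the same three settings directly:
        // LinearModel model = LASSO.fit(formula, data, 0.1, 1E-4, 1000);
    }
}
```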
