
Ridge Regression Command

The RIDGE_REGRESSION command estimates a linear regression model with an L2 penalty. Ridge regression is useful when explanatory variables are highly correlated, when ordinary least squares coefficients are unstable, or when the goal is to reduce overfitting.

The command compares the ridge coefficients with the ordinary least squares coefficients and reports model diagnostics: R-squared, adjusted R-squared, RMSE, MAE, the condition number, and the Durbin-Watson statistic. It also creates a ridge trace chart showing how each coefficient changes as the penalty parameter lambda increases.
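For readers who want to check the reported diagnostics by hand, the sketch below computes them from actual and fitted values using the standard textbook formulas. It is an illustration only: the function name regression_diagnostics and its arguments are hypothetical, and the command's internal calculations (for example, the degrees of freedom used for adjusted R-squared under ridge) may differ.

```python
import numpy as np

def regression_diagnostics(y, y_hat, n_predictors):
    """Textbook fit diagnostics from actual and fitted values (illustrative)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    resid = y - y_hat
    n = y.size

    ss_res = float(np.sum(resid ** 2))            # residual sum of squares
    ss_tot = float(np.sum((y - y.mean()) ** 2))   # total sum of squares

    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R-squared penalizes the number of explanatory variables.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = float(np.sqrt(ss_res / n))
    mae = float(np.mean(np.abs(resid)))
    # Durbin-Watson: values near 2 suggest little first-order autocorrelation.
    dw = float(np.sum(np.diff(resid) ** 2) / ss_res)

    return {"r2": r2, "adj_r2": adj_r2, "rmse": rmse, "mae": mae, "durbin_watson": dw}
```

Durbin-Watson ranges from 0 to 4; values well below 2 point to positive serial correlation in the residuals, and values well above 2 point to negative serial correlation.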

Example

RIDGE_REGRESSION(LOGDIFF(LIST(UMCSENT,U6RATE,UNRATE,TCU,TTLCONS)))

This example estimates a ridge regression model using the log differences of consumer sentiment, the U-6 unemployment rate, the unemployment rate, capacity utilization, and construction spending. The first series in the list (UMCSENT) is treated as the dependent variable. The remaining four series are used as explanatory variables.
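As a rough illustration of the transformation applied before the regression, a log difference is the change in the natural logarithm from one observation to the next. The Python sketch below is not the LOGDIFF implementation; the helper name log_diff is hypothetical, and dropping the first observation is an assumption about how the missing initial value is handled.

```python
import numpy as np

def log_diff(values):
    """Return ln(x[t]) - ln(x[t-1]) for t = 1..n-1 (one fewer point than the input)."""
    x = np.asarray(values, dtype=float)
    if np.any(x <= 0):
        raise ValueError("log differences require strictly positive values")
    return np.diff(np.log(x))
```

For small changes, the log difference is close to the percentage change, so a series that grows 10% per period has a log difference of about 0.0953 per period.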

Syntax

RIDGE_REGRESSION(seriesList)

The input must be a DataSeriesList. The first series is the dependent variable, and all remaining series are explanatory variables.

Lambda Selection

The command searches across a fixed grid of lambda values:

0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0

For each lambda, the model is re-estimated. The selected model is the one with the lowest RMSE. A larger lambda applies more shrinkage to the coefficients, while a smaller lambda behaves more like ordinary least squares.
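The grid search described above can be sketched in Python as follows. This is a minimal illustration, not the command's implementation: the function names are hypothetical, the closed-form estimator with an unpenalized intercept is an assumption, and because this page does not say which sample the RMSE is computed on, the evaluation data is passed in separately.

```python
import numpy as np

# Lambda grid as listed in this reference page.
LAMBDA_GRID = [0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]

def fit_ridge(X, y, lam):
    """Closed-form ridge fit on centered data; the intercept is not penalized (assumption)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    intercept = y_mean - x_mean @ beta
    return intercept, beta

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def select_lambda(X_train, y_train, X_eval, y_eval, grid=LAMBDA_GRID):
    """Re-estimate the model for each lambda and keep the one with the lowest RMSE."""
    results = []
    for lam in grid:
        intercept, beta = fit_ridge(X_train, y_train, lam)
        results.append((rmse(y_eval, intercept + X_eval @ beta), lam))
    best_rmse, best_lam = min(results)  # ties resolve to the smaller lambda
    return best_lam, best_rmse, results
```

If the same data is used for fitting and evaluation, the in-sample RMSE cannot fall as lambda grows, so the smallest lambda on the grid will be selected. Evaluating on held-out observations is one way to make the comparison informative.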

Output

Interpretation

Ridge regression stabilizes coefficient estimates by penalizing large coefficients. This is especially helpful when economic or financial series move together and ordinary regression assigns unstable weights to related variables.

The ridge trace chart is often the most useful diagnostic. If coefficients change sharply, or even switch sign, at very small lambda values, the original regression may be unstable. If the coefficient paths flatten out as lambda rises, ridge regression is reducing sensitivity to multicollinearity.
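The data behind a ridge trace can be reproduced with a short Python sketch: fit the model once per lambda and collect the coefficients. The function name ridge_path is hypothetical, and standardizing the explanatory variables first is an assumption made so that coefficients are comparable across series; this page does not state whether the command does the same.

```python
import numpy as np

def ridge_path(X, y, lambdas):
    """Return an array of shape (len(lambdas), n_features) with one coefficient row per lambda."""
    # Standardize columns so the penalty treats every series on the same scale (assumption).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    gram = Xs.T @ Xs
    identity = np.eye(X.shape[1])
    return np.array([np.linalg.solve(gram + lam * identity, Xs.T @ yc) for lam in lambdas])
```

Plotting each column of the result against the logarithm of lambda gives one line per explanatory variable. The overall size of the coefficient vector shrinks as lambda increases, although an individual coefficient can move away from zero or change sign along the way when variables are strongly correlated.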

The condition number helps identify whether the explanatory variables are close to linearly dependent. A high condition number suggests that ordinary least squares coefficients may be unreliable.
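As an illustration, one common definition of the condition number is the ratio of the largest to the smallest singular value of the standardized explanatory variables. The Python sketch below uses that definition; the function name is hypothetical, and the command may compute its figure differently (for example, from the cross-product matrix or without standardizing).

```python
import numpy as np

def condition_number(X):
    """2-norm condition number of the column-standardized explanatory variables."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # np.linalg.cond defaults to the ratio of largest to smallest singular value.
    return float(np.linalg.cond(Xs))
```

Under this definition, uncorrelated series give a value close to 1, and the value grows without bound as one series approaches a linear combination of the others.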

Typical Use Cases