Multiple stepwise regression in IDL - a code sample

Regression is a well-known family of statistical tools for quantifying the relationship between two or more variables. For multivariate analysis, stepwise regression adds one independent variable after another and tests each time whether the regression result is still statistically significant. Regressions are extremely useful, and almost every spreadsheet program contains a routine for regression analysis.
In IDL, a stepwise procedure called stepwise was implemented in older versions but was declared obsolete together with the regress routine around IDL 5.1, without being replaced by a newer version. When I needed stepwise regression to estimate the relationship between topographic variables and precipitation in my regionalisation procedure REGEOTOP, I searched discussion forums and the internet for an IDL stepwise regression routine. Given the importance of such a routine I was sure I would find several code samples, but to my amazement I did not find a single reference!
So I decided to write one myself, keeping close to the parameter structure of the old IDL routine to allow an easy transition for those who used it. Tested against the multiple stepwise regression procedure in SPSS10 with more than 20 independent variables, the results are virtually identical (within the limits of machine precision). Over the last few years many colleagues from around the world have asked for the code, so I decided to set up this page for easy retrieval of the program.

description

my_stepwise calculates a multiple linear stepwise regression. In a first step, single regressions of each independent variable against the dependent variable are calculated with the IDL routine regression. The independent variables are then sorted by their explained variance and incorporated into the regression equation in decreasing order (that is, beginning with the variable with the highest explained variance). Variables with insignificant regressions are discarded. For each variable introduced into the equation, the significance is tested by a t-test of its regression coefficient. Variables that are not significant at the 5% level are rejected.
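
The selection logic described above can be illustrated with a small sketch. Note that this is not the my_stepwise source and not IDL: it is a self-contained Python illustration written for this page. The function names (r_squared, solve_ols, forward_stepwise), the toy data and the fixed threshold t_crit = 2.0 are assumptions made for the example. The threshold only approximates a two-sided 5% t-test for moderately large samples; a real implementation would look up the critical value for the actual degrees of freedom.

```python
import math


def r_squared(x, y):
    """Explained variance (r^2) of a simple regression of y on x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def solve_ols(columns, y):
    """Least-squares fit with intercept; returns (coefficients, t_values).

    Uses the normal equations and a Gauss-Jordan inverse, which is fine for
    a small sketch but fails on collinear columns or a perfect fit.
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    aug = [xtx[a] + [1.0 if a == b else 0.0 for b in range(p)] for a in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(p):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    inv = [row[p:] for row in aug]  # (X'X)^-1
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    beta = [sum(inv[a][b] * xty[b] for b in range(p)) for a in range(p)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance
    t = [beta[a] / math.sqrt(s2 * inv[a][a]) for a in range(p)]
    return beta, t


def forward_stepwise(predictors, y, t_crit=2.0):
    """Add predictors in order of decreasing r^2; keep those with |t| > t_crit."""
    order = sorted(predictors, key=lambda k: r_squared(predictors[k], y), reverse=True)
    selected = []
    for name in order:
        trial = selected + [name]
        _, t = solve_ols([predictors[k] for k in trial], y)
        if abs(t[-1]) > t_crit:  # t-value of the newly added coefficient
            selected = trial
    beta, _ = solve_ols([predictors[k] for k in selected], y)
    return selected, beta


# Toy data (made up): y depends on x1 and x2 only; x3 is unrelated.
x1 = [float(i) for i in range(12)]
x2 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]
x3 = [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0, 2.0, 8.0, 4.0, 5.0]
noise = [0.1, 0.1, -0.1, -0.1] * 3
y = [3.0 * a + 2.0 * b + e for a, b, e in zip(x1, x2, noise)]

selected, beta = forward_stepwise({"x1": x1, "x2": x2, "x3": x3}, y)
print(selected, [round(b, 3) for b in beta])
```

In this toy data set x3 has no real relationship with y, so its coefficient should fail the t-test and the variable should be dropped, while x1 and x2 are kept with coefficients close to 3 and 2.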

Below I have added the program code, which can be copied and used for private or scientific, non-profit purposes. The comments were translated rather provisionally from German, so please feel free to ask in case of doubt.

disclaimer

Please note that the software is distributed as is without any warranty.
You are free to distribute or change the code as long as you do so with a proper citation. Please cite as: my_stepwise, IDL program code for multivariate linear stepwise regression written by Axel Thomas, Institute of Geography, Mainz University, Germany. If you have comments or find a bug, please let me know!

Download IDL source code for stepwise multivariate regression

If you have comments or questions: a.thomas@geo.uni-mainz.de