Data mining is a procedure to find relationships among a group of variables in a database. SAFE TOOLBOXES® comes with two data mining models: forward stepwise regression and backward stepwise regression. Both methods are based on running multivariate linear regression models multiple times, but they differ in how the variables are included or excluded.
Forward stepwise regression starts from an empty model, while backward stepwise regression starts from a complete model that contains every candidate variable. In both cases, variables can be added and removed as the procedure runs. The result of the process is a multivariate linear regression that supposedly contains the “best” set of explanatory variables.
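The idea behind both procedures can be sketched in a few lines of Python. This is only an illustration, not the algorithm used by SAFE TOOLBOXES: the selection criterion (BIC), the function names `bic` and `stepwise`, and the synthetic data are all assumptions made for the example. At each step the sketch tries adding every excluded variable and removing every included one, then keeps the single change that improves the criterion the most.

```python
import numpy as np

def bic(y, columns):
    """BIC of an ordinary least squares fit of y on an intercept plus `columns`."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + list(columns))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    return n * np.log(rss / n) + design.shape[1] * np.log(n)

def stepwise(y, X, start="empty"):
    """Return the sorted column indices kept by a stepwise search.

    start="empty" mimics the forward variant, start="full" the backward one.
    At every step each variable may be toggled (added if absent, removed if
    present); the single toggle that lowers BIC the most is applied.
    """
    n_vars = X.shape[1]
    selected = [] if start == "empty" else list(range(n_vars))
    score = lambda cols: bic(y, [X[:, j] for j in cols])
    best = score(selected)
    while True:
        trials = []
        for j in range(n_vars):
            if j in selected:
                trial = [c for c in selected if c != j]   # try removing j
            else:
                trial = selected + [j]                    # try adding j
            trials.append((score(trial), trial))
        trial_score, trial_cols = min(trials, key=lambda t: t[0])
        if trial_score >= best:        # no toggle improves the criterion
            return sorted(selected)
        best, selected = trial_score, trial_cols

# Synthetic data (an assumption for the demo): y depends on columns 0 and 2 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=300)

forward = stepwise(y, X, start="empty")
backward = stepwise(y, X, start="full")
print("forward keeps columns:", forward)
print("backward keeps columns:", backward)
```

The two variants share the same loop and differ only in the starting model, which is why they can end up with different final equations on the same data. Other criteria, such as p-value thresholds or adjusted R², could be swapped in for `bic` without changing the structure.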
Now, let’s illustrate this tool with an example.
Suppose that you have the following database:
    |     A |     B |     C |     D |     E |     F
  1 |     Y |    X1 |    X2 |    X3 |    X4 |    X5
  2 |  80.9 |  21.3 |  79.3 |  92.1 |  96.0 | 191.8
  3 |  46.3 |  11.9 |  46.8 | 135.8 | 100.5 |  89.9
  4 |  61.2 |  23.6 |  62.1 | 101.3 |  83.7 | 241.8
  5 |  61.5 |  25.7 |  62.8 |  91.9 |  80.9 | 231.7
  6 |  78.0 |  11.4 |  77.4 |  78.1 |  68.6 | 325.9
  7 |  58.6 |   5.5 |  57.6 | 104.0 |  65.7 | 213.1
  8 |  95.2 |  12.2 |  97.4 |  73.8 |  72.4 | 136.7
  9 |  47.0 |  21.3 |  47.7 | 148.5 |  87.2 |  77.1
 10 |  95.3 |  25.3 |  92.7 |  63.3 |  86.6 | 187.3
 11 |  48.8 |   8.5 |  50.3 | 102.2 |  79.1 | 197.0
 12 |  74.8 |  14.8 |  74.1 |  86.3 |  87.0 | 391.9
 13 |  57.7 |  10.2 |  57.0 | 141.0 |  85.6 | 207.2
 14 |  59.8 |  17.0 |  57.6 |  70.8 |  70.9 | 169.3
... |   ... |   ... |   ... |   ... |   ... |   ...
288 |  59.2 |   0.7 |  60.7 |  99.3 |  44.9 | 240.0
289 |  65.2 |  13.8 |  65.8 | 118.7 |  90.0 | 198.9
290 |  62.0 |  16.4 |  64.6 | 143.7 |  59.6 | 187.7
291 |  92.8 |  26.7 |  91.7 | 115.8 |  62.9 | 230.1
292 |  88.9 |   3.6 |  88.5 |  73.7 | 100.8 | 182.2
293 |  63.7 |   1.6 |  62.7 |  27.4 |  89.1 |  78.3
294 |  73.3 |  18.7 |  72.3 | 121.3 |  71.1 |  92.6
295 |  37.7 |   7.0 |  38.0 |  82.0 |  90.1 | 135.8
296 |  56.3 |  16.5 |  57.9 |  92.0 | 101.0 |  86.2
297 |  97.9 |   3.5 |  97.4 |  66.5 |  94.1 | 203.6
298 |  51.1 |   8.5 |  51.5 |  61.0 |  79.9 | 249.9
299 |  37.2 |  27.3 |  37.7 |  87.4 |  76.7 | 218.4
300 |  60.5 |  21.7 |  60.4 | 167.9 | 105.6 | 214.5
301 |  44.9 |  16.6 |  44.3 |  68.2 |  98.8 | 178.2
302 |       |       |       |       |       |
If you want, for instance, to run a forward stepwise regression, follow these steps:
Examining the final equation model of the regression in tab , we will find: