Model Fit Graphs
Three new graphs have been added to assist in visualizing the model fit:
- Fit by Model Decile
- Error Dispersion
- Lift Chart
A “View Model Fit Graphs” button will be displayed in the Status of Analysis panel once an analysis has completed. These graphs are automatically included in the Print Report as well.
Fit by Model Decile
This graph shows how well the modeled results fit the actual results, and can be viewed for Test Data or Training Data. The detailed data (actual target, modeled target, modeled target with curve modification) is displayed sorted in order of the modeled results. Each observation in the Test (or Training) dataset is represented by small dots: red for the actual value, blue for the modeled value, and green for the modeled value with curve modification. Because the observations are sorted by modeled result, the blue and green dots will generally appear as lines. The red circles show the average value of the actual target for each decile. The range spanned by the blue and green lines shows the amount of differentiation that the model provides.
Comparing the red circles to the blue and green lines gives a sense of how well the model performs on cohorts of similar modeled results (particularly when showing Test data). The dispersion of red dots around the blue and green lines gives a sense of the idiosyncratic variability around the modeled results. The vertical scale will not necessarily extend far enough to include the highest red dot (actual target). This allows the graph to show model differentiation and cohort accuracy without being dwarfed by the idiosyncratic results of highly skewed data.
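The decile averages behind the red circles can be approximated as in the sketch below. It is only an illustration: the function name decile_averages and the equal-count split into ten groups are assumptions, and the product's own calculation may differ.

```python
def decile_averages(actual, modeled):
    """Sort records by modeled value, split them into ten equal-count
    groups, and return (decile, mean modeled, mean actual) for each group.

    Hypothetical helper: the equal-count split is an assumption.
    """
    # Sort by modeled value only, keeping each actual paired with its modeled value.
    pairs = sorted(zip(modeled, actual), key=lambda p: p[0])
    n = len(pairs)
    result = []
    for d in range(10):
        chunk = pairs[d * n // 10:(d + 1) * n // 10]
        if not chunk:
            continue  # fewer than ten records: some deciles are empty
        mean_modeled = sum(m for m, _ in chunk) / len(chunk)
        mean_actual = sum(a for _, a in chunk) / len(chunk)
        result.append((d + 1, mean_modeled, mean_actual))
    return result


# Made-up data: 20 records whose actual value is exactly twice the modeled value.
modeled = list(range(20))
actual = [2 * m for m in modeled]
for decile, mean_modeled, mean_actual in decile_averages(actual, modeled):
    print(decile, mean_modeled, mean_actual)
```

A decile whose mean actual value sits far from its mean modeled value corresponds to a red circle lying away from the blue line.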
Error Dispersion
This graph shows the distribution of the residual errors (the difference between actual and modeled results) for either Test or Training data, and can be displayed with or without curve modification.
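As a rough illustration of what such a distribution is built from, the sketch below computes residuals and counts them in fixed-width bins. The sign convention (actual minus modeled), the bin width, and the function names are assumptions made for this example, not details taken from the product.

```python
import math
from collections import Counter


def residuals(actual, modeled):
    """Residual error per record, assumed here to be actual minus modeled."""
    return [a - m for a, m in zip(actual, modeled)]


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Made-up data for three records.
errors = residuals([3, 5, 2], [2.5, 5, 4])
print(histogram(errors, 1))
```

A histogram concentrated near zero suggests the modeled values track the actual values closely, while a long tail on one side points to skewed errors.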
Lift Chart
The lift chart illustrates how well the model orders the records. The term “Relative Net Lift” refers to how much the model improves the ordering of the records, measured between two extremes: ordering the records by exposure alone and ordering them perfectly.
The highest possible relative net lift is 1, which is achieved if the model perfectly orders the records from highest to lowest actual Target value.
A relative net lift of 0 means that the model sorts the records no better than simply using exposure would (or, if no exposure field was used, no better than a random sort).
It is possible to have a negative relative net lift, meaning that the model is worse at ordering the target than using the exposure field as the only predictive characteristic (or worse than a random sort if there is no exposure field).
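The exact formula for relative net lift is not given here. The sketch below shows one plausible formulation that behaves as described: it compares the area under the cumulative actual-target curve for the model's ordering against a baseline ordering and a perfect ordering. The function names, the cumulative-area approach, and the handling of the exposure baseline are all assumptions for illustration.

```python
def gains_area(actual, scores):
    """Area under the cumulative-actual curve when records are sorted
    by score from highest to lowest."""
    order = sorted(range(len(actual)), key=lambda i: scores[i], reverse=True)
    running = 0.0
    area = 0.0
    for i in order:
        running += actual[i]
        area += running
    return area


def relative_net_lift(actual, modeled, exposure=None):
    """Assumed definition: (model - baseline) / (perfect - baseline)."""
    n = len(actual)
    model_area = gains_area(actual, modeled)
    perfect_area = gains_area(actual, actual)
    if exposure is None:
        # Expected area for a random ordering of the records.
        baseline_area = sum(actual) * (n + 1) / 2
    else:
        # Baseline ordering uses exposure as the only predictor.
        baseline_area = gains_area(actual, exposure)
    if perfect_area == baseline_area:
        raise ValueError("baseline already orders the records perfectly")
    return (model_area - baseline_area) / (perfect_area - baseline_area)


# Made-up data: a perfect model, then a model that ranks records backwards.
actual = [1, 2, 3, 4]
print(relative_net_lift(actual, [1, 2, 3, 4]))
print(relative_net_lift(actual, [4, 3, 2, 1]))
```

Under this formulation a perfect ordering scores 1, an ordering no better than the baseline scores 0, and an ordering worse than the baseline scores below 0, matching the three cases described above.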