Credibility

 

Cognalysis MultiRate can apply credibility weighting to individual bins for a Characteristic. It can also incorporate neighboring bins of data or, for geographic data, neighboring ZIP Codes, County FIPS Codes, State FIPS Codes, or Custom Geographic Territories.

 

Application of Credibility:

 

No smoothing: The credibility factor will be applied to each bin of data individually. The impact of neighboring bins is not incorporated.

Variable Gradient: The exposure used in the credibility formula will include exposure from nearby bins.

Linear on Average: The credibility factor will be applied to the slope parameter, based on the total amount of exposure for the analysis.

Linear on Bin Number: The credibility factor will be applied to the slope parameter, based on the total amount of exposure for the analysis.

Linear on Log of Average: The credibility factor will be applied to the slope parameter, based on the total amount of exposure for the analysis.

Variable Gradient Geospatial Smoothing: The exposure used in the credibility formula will include exposure from nearby ZIP Codes, County FIPS Codes, State FIPS Codes, or Custom Geographic Territories.

 

Credibility is applied prior to Blending and Balancing.

 

 

There are two credibility methods that can be used in MultiRate: Exposure and t-Statistic. Before running an analysis, select the one to be used at the bottom of the Data and Analysis Control, below the Characteristic Fields:

 

 

Exposure Method

 

The Exposure Method assigns credibility based on the exposure amount and the selected 50% Credibility Level. As can be seen below, an exposure amount equal to the 50% Credibility Level would result in 50% credibility.

 

The formula used by Cognalysis MultiRate when this option is selected is:

          Credibility = Exposure / ( Exposure + 50% Credibility Level )

 
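As a minimal sketch of the behavior described above (an exposure equal to the 50% Credibility Level yields exactly 50% credibility), the Exposure Method can be expressed as a one-line function. The function and parameter names here are illustrative, not part of the product:

```python
def exposure_credibility(exposure: float, k_50: float) -> float:
    """Credibility under the Exposure Method, assuming the classical
    form implied by the 50% behavior: Z = E / (E + K), where K is the
    selected 50% Credibility Level. Exposure equal to K gives Z = 0.5."""
    return exposure / (exposure + k_50)

print(exposure_credibility(1000.0, 1000.0))  # exposure == k_50 -> 0.5
print(exposure_credibility(3000.0, 1000.0))  # -> 0.75
```

Credibility rises toward 1 as exposure grows, and an analysis with no exposure receives no credibility.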

t-Statistic Method

 

The t-Statistic Method assigns credibility based on the consistency of observed relationships and the selected Confidence Level, and is calculated as follows:

 

First, all of the records are sorted by the ratio of Target to Modified Exposure, smallest to largest. Records that take the same value for this ratio (many values at zero, for example) are grouped into partitions. Then, the partitions are assigned z-score boundaries from a standard normal distribution based on the amount of exposure in each partition, using the formula:

          z_i = Φ⁻¹( ( Σ_{j ≤ i} Exposure_j ) / Total Exposure )

          where Φ represents the CDF of the standard normal distribution, z_i is the upper boundary of partition i, and the numerator is the cumulative exposure through partition i.

 

Once these partition boundaries have been calculated, the average values of z and z² over each partition are calculated using the formulas below, which can be derived by integration over the standard normal CDF1:

          z̄_i = ( e^(−z_{i−1}²/2) − e^(−z_i²/2) ) / ( √(2π) · ( Φ(z_i) − Φ(z_{i−1}) ) )

          z̄²_i = 1 + ( z_{i−1}·e^(−z_{i−1}²/2) − z_i·e^(−z_i²/2) ) / ( √(2π) · ( Φ(z_i) − Φ(z_{i−1}) ) )

Here z̄_i denotes the partition average of z, and z̄²_i the partition average of z² (not the square of z̄_i). All records within a common partition (i.e. the same ratio of Target to Modified Exposure) are assigned these partition values of z̄ and z̄².
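The partition boundaries and averages described above can be sketched with the standard library's normal distribution; the function and variable names are illustrative:

```python
import math
from statistics import NormalDist

def partition_averages(partition_exposures):
    """Given the exposure in each partition (sorted by the ratio of
    Target to Modified Exposure), return (z_bar, z2_bar) per partition:
    the average z and z^2 of a standard normal restricted to the
    partition's boundaries."""
    total = sum(partition_exposures)
    nd = NormalDist()
    # Boundary z-scores at cumulative exposure shares; the outermost
    # boundaries are -inf and +inf (see footnote 1).
    cum, bounds = 0.0, [-math.inf]
    for e in partition_exposures[:-1]:
        cum += e
        bounds.append(nd.inv_cdf(cum / total))
    bounds.append(math.inf)

    def exp_term(z):        # e^(-z^2/2); goes to 0 at +/-inf
        return 0.0 if math.isinf(z) else math.exp(-z * z / 2)

    def z_exp_term(z):      # z * e^(-z^2/2); goes to 0 at +/-inf
        return 0.0 if math.isinf(z) else z * math.exp(-z * z / 2)

    out = []
    for a, b in zip(bounds, bounds[1:]):
        mass = nd.cdf(b) - nd.cdf(a)              # Phi(b) - Phi(a)
        root = math.sqrt(2 * math.pi)
        z_bar = (exp_term(a) - exp_term(b)) / (root * mass)
        z2_bar = 1 + (z_exp_term(a) - z_exp_term(b)) / (root * mass)
        out.append((z_bar, z2_bar))
    return out
```

For two partitions of equal exposure the single interior boundary falls at z = 0, so the two halves of the standard normal average to ∓√(2/π) ≈ ∓0.798, each with an average z² of exactly 1.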

 

Next, these values are averaged across records at the level at which a factor is being calculated. In the case where smoothing is not being applied, this means all records within the bin are averaged, weighted by exposure. In the case where smoothing is being applied, all records being used to calculate the factor are averaged, weighting by both the Exposure and the smoothing weights2. The calculated values of z̄ and z̄² corresponding to a specific bin are then used to calculate the standard deviation and standard error of z for the bin, and finally a t-statistic. The formulas for these calculations are as follows:

          s = √( z̄² − ( z̄ )² )

          SE = s / √n

          t = z̄ / SE

          where n is the quasi-count of the number of observations in the sample, defined as:

                    n = ( Σ w_i )² / ( Σ w_i² )

          where w_i is the weight applied to record i (the Exposure, or the Exposure times the smoothing weight when smoothing is applied).

 

Next, a critical t value, t_crit, is calculated from an inverse t distribution, using the selected Confidence Level and degrees of freedom equal to the quasi-count minus 1:

          t_crit = T⁻¹( Confidence Level, n − 1 )

The credibility is then calculated as the percentage by which the absolute value of the t-statistic exceeds the critical t, as seen below:

          Credibility = max( 0, ( |t| − t_crit ) / |t| )

Finally, if the observed relationship and the z-score have the same direction (i.e. z̄ < 0 and raw factor < 1, or z̄ > 0 and raw factor > 1), the credibility above is used. However, if they have opposite directions, a credibility of 0 is assigned.
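The bin-level steps above can be sketched as follows. The critical t value is passed in as an input here, since an inverse t distribution is not in the Python standard library (in practice it could come from, e.g., scipy.stats.t.ppf); the credibility form 1 − t_crit/|t|, floored at zero, follows the description above:

```python
import math

def bin_credibility(z_bar, z2_bar, quasi_count, t_crit, raw_factor):
    """Sketch of the t-Statistic Method at the bin level.
    z_bar, z2_bar: weighted bin averages of z and z^2.
    quasi_count:   the quasi-count n of observations.
    t_crit:        critical t at the chosen Confidence Level, n - 1 df.
    raw_factor:    the bin's raw factor, used for the direction check."""
    sd = math.sqrt(z2_bar - z_bar ** 2)    # standard deviation of z
    se = sd / math.sqrt(quasi_count)       # standard error of the mean
    t = z_bar / se                         # t-statistic
    # Opposite direction between z_bar and the raw factor -> 0 credibility.
    if (z_bar < 0) != (raw_factor < 1):
        return 0.0
    if abs(t) <= t_crit:
        return 0.0
    return 1 - t_crit / abs(t)
```

For example, a bin with z̄ = 0.5, z̄² = 1.0, a quasi-count of 100, and t_crit = 1.98 has t ≈ 5.77 and receives about 66% credibility; the same bin with a raw factor below 1 would receive none, because the direction check fails.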

 

1 For the first and last partitions, z_{i−1} and z_i have not been defined above. For the first partition, z_0 = −∞, which will result in the first exponential term going to 0, and for the last partition, z_n = +∞, which will result in the second exponential term going to 0.

 

2 In cases where smoothing is being applied, the formula for the averages, weighted by both Exposure and the smoothing weights, is:

          z̄_bin = Σ_i ( Exposure_i · SmoothingWeight_i · z̄_i ) / Σ_i ( Exposure_i · SmoothingWeight_i )

          For z̄², simply replace z̄ with z̄².

 

In the example below, the average for bin 1 is being calculated, and the Variable Gradient Radius used to calculate the smoothing weights is 1.

 

Key ID    Bin    z̄      z̄²     Exposure    Smoothing Weight
1         2     -0.5    0.3        1             0.5
2         3      0.0    0.1        2             0.25
3         1      0.2    0.2        3             1
4         2      0.7    0.6        4             0.5
5         1      1.3    2.1        5             1
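Using the rows of the example table and the weighted average from footnote 2, the bin 1 averages can be reproduced in a short sketch (variable names are illustrative):

```python
# Rows from the example table: (key_id, bin, z_bar, z2_bar, exposure, weight).
# The smoothing weights are taken as given; they already reflect each
# record's distance from bin 1 under the Variable Gradient Radius of 1.
rows = [
    (1, 2, -0.5, 0.3, 1, 0.5),
    (2, 3,  0.0, 0.1, 2, 0.25),
    (3, 1,  0.2, 0.2, 3, 1.0),
    (4, 2,  0.7, 0.6, 4, 0.5),
    (5, 1,  1.3, 2.1, 5, 1.0),
]

# Bin 1 averages: weight each record by Exposure x Smoothing Weight.
denom = sum(e * w for _, _, _, _, e, w in rows)
z_bar_bin1 = sum(e * w * z for _, _, z, _, e, w in rows) / denom
z2_bar_bin1 = sum(e * w * z2 for _, _, _, z2, e, w in rows) / denom

print(z_bar_bin1)    # ~0.75  (8.25 / 11)
print(z2_bar_bin1)   # ~1.136 (12.5 / 11)
```

The total weight is 11, giving a bin 1 average z̄ of 8.25 / 11 = 0.75 and an average z̄² of 12.5 / 11 ≈ 1.136; these are the values that would feed the standard deviation, standard error, and t-statistic formulas above.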