
Calculate the Scorecard Points for a Class in an Independent Variable
points() calculates the number of scorecard points for a
unique class in an independent variable using Weights of Evidence, helping
to build a full "scorecard" that maps a number of points to each class
in each independent variable.
Arguments
- woe
(Numeric) The Weight-of-Evidence value for a given class of the independent variable
- estimate
(Numeric) The coefficient of the logistic regression model for the independent variable
- intercept
(Numeric) The intercept value of the logistic regression model (where the model is trained to predict the probability of "bad")
- num_vars
(Integer) The number of independent variables in the logistic regression model
- tgt_points
(Integer) The target number of points to be used in conjunction with the target odds; see the Details section of ?odds for more information
- tgt_odds
(Numeric) The odds that a score of tgt_points should correspond to; see the Details section of ?odds for more information
- pxo
(Integer) The number of points to 'double' the odds; see the Details section of ?odds for more information
- rate
(Numeric) The value to exponentially increase the odds by for the given number of points supplied in the pxo argument; see the Details section of ?odds for more information
- round
(Integer) The number of digits to round the output score to; default is to round to the nearest integer
Value
A numeric value representing the number of scorecard points for the given class in the independent variable
Details
The tgt_points and tgt_odds arguments work together to set the "baseline"
points/odds for your scorecard. For example, a tgt_points value of 300 and
a tgt_odds value of 30 mean that a score of 300 points corresponds to
30:1 odds of default. See the Details section of ?odds for more
information.
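The relationship between tgt_points, tgt_odds, pxo and rate can be sketched in a few lines of base R. This is a minimal illustration that assumes the log-odds scaling described in the Siddiqi reference (score = offset + factor * log(odds)); the names scale_factor, offset and score_at are hypothetical and are not part of the package, and the sketch does not settle which direction the odds are quoted in.

```r
# Assumed scaling (Siddiqi-style): score = offset + scale_factor * log(odds).
tgt_points <- 300
tgt_odds   <- 30
pxo        <- 20
rate       <- 2

# Points per unit of log-odds, chosen so that multiplying the odds
# by `rate` adds exactly `pxo` points.
scale_factor <- pxo / log(rate)

# Shift chosen so that `tgt_odds` maps exactly onto `tgt_points`.
offset <- tgt_points - scale_factor * log(tgt_odds)

# Hypothetical helper: score implied by a given odds value.
score_at <- function(odds) offset + scale_factor * log(odds)

score_at(30)  # the baseline: 300 points at odds of 30
score_at(60)  # doubling the odds adds pxo = 20 points, giving 320
```

With rate = 2, every additional pxo points corresponds to a doubling of the odds, which is why pxo is described as the number of points to 'double' the odds.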
The intercept value must be from a glm model trained to predict
probability of "bad". This means that the order of levels in the dependent
variable must be c("good", "bad"). See ?binomial for more details about
the order in which glm(family = "binomial") expects the levels of the
dependent variable.
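One way to confirm the level order is to fit an intercept-only model on a small factor and compare the fitted probability with the observed share of "bad". The sketch below uses hypothetical toy data and base R only; the names status, y and fit0 are illustrative.

```r
# Hypothetical toy outcome: 2 "bad" out of 6 observations.
status <- c("good", "bad", "good", "good", "bad", "good")

# Put "good" first so that glm() treats "bad" as the modelled event
# (for a factor response, the first level is the "failure" level).
y <- factor(status, levels = c("good", "bad"))
levels(y)

# Intercept-only logistic regression.
fit0 <- glm(y ~ 1, family = "binomial")

# The inverse logit of the intercept should equal the share of "bad".
p_bad_model    <- unname(plogis(coef(fit0)[1]))
p_bad_observed <- mean(y == "bad")
c(model = p_bad_model, observed = p_bad_observed)
```

If the levels were reversed to c("bad", "good"), the same model would estimate the probability of "good" instead, and the sign of the intercept would flip.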
References
Siddiqi, Naeem (2017). Intelligent Credit Scoring: Building and Implementing Better Credit Risk Scorecards. 2nd ed., Wiley. pp. 240-242.
Examples
# Pre-process the data to create WoE features
df <- woe(
  data = loans |>
    dplyr::mutate(
      default_status = factor(default_status, levels = c("good", "bad"))
    ),
  outcome = default_status,
  predictors = c(industry, housing_status),
  method = "replace",
  verbose = FALSE
)
# Fit the logistic regression model
fit <- glm(default_status ~ ., data = df, family = "binomial")
# Extract the model's parameter estimates & intercept
params <- fit$coefficients |>
  tibble::as_tibble(rownames = NA) |>
  tibble::rownames_to_column(var = "variable")
# Build the scorecard base
card <- woe(
  data = loans,
  outcome = default_status,
  predictors = c(industry, housing_status),
  method = "dict",
  verbose = FALSE
) |>
  dplyr::transmute(
    variable = paste0("woe_", variable),
    class = class,
    woe = woe
  ) |>
  dplyr::inner_join(params, by = "variable")
# Add the points
card |>
  dplyr::mutate(
    points = points(
      woe = woe,
      estimate = value,
      intercept = params$value[params$variable == "(Intercept)"],
      num_vars = length(params$variable[params$variable != "(Intercept)"]),
      tgt_points = 300L,
      tgt_odds = 30,
      pxo = 20L,
      rate = 2
    )
  )
#> # A tibble: 12 × 5
#>    variable           class            woe  value points
#>    <chr>              <chr>          <dbl>  <dbl>  <dbl>
#>  1 woe_industry       ""           -1.23   -0.991     78
#>  2 woe_industry       "beef"        0.231  -0.991    120
#>  3 woe_industry       "dairy"       0.0956 -0.991    116
#>  4 woe_industry       "fruit"       0.359  -0.991    123
#>  5 woe_industry       "grain"      -0.410  -0.991    101
#>  6 woe_industry       "greenhouse"  0.511  -0.991    128
#>  7 woe_industry       "nuts"        0.288  -0.991    121
#>  8 woe_industry       "pork"        0.606  -0.991    130
#>  9 woe_industry       "poultry"    -0.774  -0.991     91
#> 10 woe_industry       "sod"         0.154  -0.991    118
#> 11 woe_housing_status "own"        -0.194  -0.999    108
#> 12 woe_housing_status "rent"        0.430  -0.999    126
# Calculate points manually
points(
  woe = 1.23,
  estimate = -0.991,
  intercept = -0.846,
  num_vars = 2L,
  tgt_points = 300L,
  tgt_odds = 30,
  pxo = 20L,
  rate = 2
)
#> [1] 148
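The manual calculation above can be reproduced by hand with a short base R function. This is a sketch that assumes the per-class scaling formula from the Siddiqi reference; points_sketch() and its digits argument are hypothetical names used for illustration and are not the package's source code.

```r
# Hypothetical re-implementation for illustration only.
# Assumed formula (Siddiqi-style scaling):
#   points = -(woe * estimate + intercept / n) * factor + offset / n
points_sketch <- function(woe, estimate, intercept, num_vars,
                          tgt_points, tgt_odds, pxo, rate, digits = 0) {
  # Points per unit of log-odds.
  scale_factor <- pxo / log(rate)
  # Shift that maps tgt_odds onto tgt_points.
  offset <- tgt_points - scale_factor * log(tgt_odds)
  # The intercept and the offset are spread evenly across the variables.
  raw <- -(woe * estimate + intercept / num_vars) * scale_factor +
    offset / num_vars
  round(raw, digits)
}

points_sketch(
  woe = 1.23,
  estimate = -0.991,
  intercept = -0.846,
  num_vars = 2,
  tgt_points = 300,
  tgt_odds = 30,
  pxo = 20,
  rate = 2
)
#> [1] 148
```

Under this assumed formula, a class with a negative coefficient and a higher WoE receives more points, which matches the ordering of the classes in the scorecard table above.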