A dataset containing the loan attributes of 1,000 fake loans. This dataset is intended to be well-suited for building classification models, including logistic regression models that can be converted into scorecards.

Usage

loans

Format

A data frame with 1,000 rows and 11 variables:

loan_id

identifier representing a unique loan

amount_of_existing_debt

interval representing the amount of existing debt that the customer on the loan has outstanding, in dollars

term

original length of the loan term, in months

industry

primary industry farmed by the primary customer on the loan

loan_amount

original amount of the loan, in dollars

other_debtors_guarantors

status of the customer on the loan as a "co-applicant", "guarantor", or "none"

years_at_current_address

length of time customer has lived at their current address, in years

collateral_type

type of collateral used to back the loan

housing_status

whether the primary customer on the loan owns or rents their residential address

count_loan_facilities

number of loan facilities that the customer associated with the loan has with the institution

default_status

binary "good"/"bad" classification of the loan's default status (i.e., the dependent variable)
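
Examples

The description mentions logistic regression as a typical use of this data, so a minimal sketch of that workflow may be helpful. Because the code below is meant to run on its own, it builds a small simulated stand-in called loans_demo (a hypothetical name) that reuses a few of the documented column names. All values in it are invented, and the "own"/"rent" levels for housing_status are an assumption about how that column is coded. With the package attached, loans could be substituted for loans_demo.

```r
# Simulated stand-in for `loans`; every value here is made up for illustration.
set.seed(42)
n <- 200
loans_demo <- data.frame(
  loan_id = seq_len(n),
  term = sample(c(12, 24, 36, 48), n, replace = TRUE),
  loan_amount = round(runif(n, min = 1000, max = 20000)),
  # Assumed coding; the real levels of housing_status may differ.
  housing_status = factor(sample(c("own", "rent"), n, replace = TRUE)),
  # "good" is set as the first level so that glm() models P(bad).
  default_status = factor(
    sample(c("good", "bad"), n, replace = TRUE, prob = c(0.7, 0.3)),
    levels = c("good", "bad")
  )
)

# Logistic regression: for a factor response, binomial() treats the first
# level as failure and the remaining level(s) as success.
fit <- glm(
  default_status ~ term + loan_amount + housing_status,
  data = loans_demo,
  family = binomial()
)

# Predicted probability of a "bad" outcome for each loan.
p_bad <- predict(fit, type = "response")

# Coefficient table (estimates, standard errors, z values, p-values).
summary(fit)$coefficients
```

Since the stand-in outcome is drawn at random, the fitted coefficients carry no meaning; the sketch only shows the shape of the call. Setting the factor levels explicitly is a deliberate choice here, since it makes clear which outcome the predicted probabilities refer to.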