A spin on the famous GermanCredit machine learning dataset.
loans.Rd
A dataset containing the attributes of 1,000 fake loans. It is intended for building classification models, including logistic regression models that can be converted into scorecards.
Format
A data frame with 1,000 rows and 11 variables:
- loan_id
identifier representing a unique loan
- amount_of_existing_debt
interval representing the amount of existing debt the customer on the loan has outstanding, in dollars
- term
original length of the loan term, in months
- industry
primary industry farmed by the primary customer on the loan
- loan_amount
original amount of the loan, in dollars
- other_debtors_guarantors
status of the customer on the loan as a "co-applicant", "guarantor", or "none"
- years_at_current_address
length of time the customer has lived at their current address, in years
- collateral_type
type of collateral used to back the loan
- housing_status
whether the primary customer on the loan owns or rents their residential address
- count_loan_facilities
number of loan facilities the customer associated with the loan has with the institution
- default_status
binary "good"/"bad" classification of the loan's default status (i.e., the dependent variable)
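As a rough illustration of the intended use, the sketch below fits a logistic regression on default_status with base R's glm(). This page does not name the package that ships the data, so the code builds a small stand-in data frame, toy_loans (a hypothetical name), that reuses a few of the documented column names. Its values are randomly generated for illustration and are not taken from the real data. The sketch also assumes that term and loan_amount are stored as numeric columns.

```r
# Stand-in for the documented `loans` data frame. The values are randomly
# generated for illustration only and are NOT taken from the real dataset.
set.seed(42)
n <- 200
toy_loans <- data.frame(
  loan_id        = seq_len(n),
  term           = sample(c(12, 24, 36, 48, 60), n, replace = TRUE),
  loan_amount    = round(runif(n, min = 500, max = 15000)),
  default_status = sample(c("good", "bad"), n, replace = TRUE,
                          prob = c(0.7, 0.3))
)

# Encode the dependent variable as 1 = "bad", 0 = "good"
toy_loans$is_bad <- as.integer(toy_loans$default_status == "bad")

# Logistic regression of default on two numeric loan attributes
fit <- glm(is_bad ~ term + loan_amount, data = toy_loans,
           family = binomial())

# Predicted probability of a "bad" outcome for each loan
toy_loans$p_bad <- predict(fit, type = "response")

summary(fit)
```

With the real data, the same glm() call should apply after replacing toy_loans with the loaded data frame. Turning the fitted coefficients into scorecard points is a separate step that is not shown here.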