K-Nearest Neighbor Machine Learning algorithm
The German credit dataset can be downloaded from the UC Irvine Machine Learning Repository. Its response variable indicates whether the loan applicant defaulted or not. This walkthrough applies logistic regression with three variables (duration, amount, and installment), K-means classification, and the K-Nearest Neighbor machine learning algorithm.
# Logistic regression
# Load the file from disk after setting the working directory
germandata <- read.csv("Creditdata.csv")
# Print dataset to see the pattern of the data
germandata
# Convert the Response variable to a factor so it can be used as the default outcome of the credit loan
germandata$Response <- factor(germandata$Response)
# Keep only the variables duration, amount, installment, and Response
germandata <- germandata[, c("duration", "amount", "installment", "Response")]
# Print the dataset to see the data for these variables
germandata
# Run the summary function on the dataset to see summary statistics for each variable
summary(germandata)
# Sample output for the first 11 rows:
> germandata
   duration amount installment Response
1         6   1169        A143        1
2        48   5951        A143        2
3        12   2096        A143        1
4        42   7882        A143        1
5        24   4870        A143        2
6        36   9055        A143        1
7        24   2835        A143        1
8        36   6948        A143        1
9        12   3059        A143        1
10       30   5234        A143        2
11       12   1295        A143        2
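With the data prepared as above, a logistic regression on duration, amount, and installment could be fitted along the following lines. This is only a minimal sketch, not the original analysis: because Creditdata.csv is not reproduced here, it builds a small synthetic data frame (toydata, a hypothetical name) with the same four columns, and every value in it, including the simulated link between duration and Response, is made up for illustration.

```r
# Synthetic stand-in for the credit data; all values are made up for illustration.
set.seed(42)
n <- 200
toydata <- data.frame(
  duration    = sample(6:48, n, replace = TRUE),
  amount      = round(runif(n, 500, 10000)),
  installment = factor(sample(c("A141", "A142", "A143"), n, replace = TRUE))
)

# Simulate a 1/2 response whose odds of "2" grow with duration (an assumption).
p <- plogis(-2 + 0.05 * toydata$duration)
toydata$Response <- factor(ifelse(runif(n) < p, 2, 1))

# Fit the logistic regression. With a two-level factor response, glm() models
# the probability of the second level ("2").
fit <- glm(Response ~ duration + amount + installment,
           data = toydata, family = binomial)
summary(fit)

# Predicted probability of Response == "2" for each row
prob <- predict(fit, type = "response")
head(prob)
```

To run the same model on the real data, replace toydata with germandata once the file has been loaded and subset as shown earlier.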