Retrieved from https://studentshare.org/statistics/1400730-data-mining
To address classification problems and reduce the Type I errors typical of many credit scoring models, this paper proposes a credit scoring model built in two stages. The classification stage involves constructing a credit scoring model based on an artificial neural network (ANN), which classifies applicants into two categories: those with acceptable credit (good) and those with unacceptable credit (bad). The second stage, also referred to as the re-assigning stage, attempts to lower the Type I error by reassigning rejected applicants who have good credit to a conditionally accepted category, using a classification approach based on case-based reasoning (CBR).
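The two-stage flow described above can be illustrated with a minimal Python sketch. It is not the model built in this paper: the threshold rule in `stage1_classify` is only a stand-in for a trained ANN, the nearest-neighbour lookup in `stage2_reassign` is one simple way to approximate case-based reasoning, and the feature names, weights, cutoff, and case base are all hypothetical values chosen for illustration.

```python
from math import dist

# Hypothetical case base of past applicants:
# (income in thousands, years in current job) -> known credit outcome.
CASE_BASE = [
    ((52.0, 6.0), "good"),
    ((48.0, 5.0), "good"),
    ((45.0, 7.0), "good"),
    ((20.0, 1.0), "bad"),
    ((18.0, 0.5), "bad"),
    ((22.0, 1.5), "bad"),
]


def stage1_classify(applicant):
    """Stage 1 stand-in for the ANN: a weighted score with a fixed cutoff.

    The weights and the cutoff are assumptions, not fitted values.
    """
    income, years = applicant
    score = 0.02 * income + 0.1 * years
    return "good" if score >= 1.6 else "bad"


def stage2_reassign(applicant, k=3):
    """Stage 2, CBR-style: retrieve the k most similar past cases.

    If most of them turned out to be good credit, the rejected applicant
    is moved to the conditionally accepted category.
    """
    nearest = sorted(CASE_BASE, key=lambda case: dist(case[0], applicant))[:k]
    good_votes = sum(1 for _, label in nearest if label == "good")
    return "conditionally accepted" if good_votes > k // 2 else "rejected"


def two_stage(applicant):
    """Run stage 1, and send only the rejected applicants to stage 2."""
    if stage1_classify(applicant) == "good":
        return "accepted"
    return stage2_reassign(applicant)


if __name__ == "__main__":
    for applicant in [(60.0, 8.0), (46.0, 6.0), (19.0, 1.0)]:
        print(applicant, "->", two_stage(applicant))
```

In this sketch the applicant `(46.0, 6.0)` falls just below the stage 1 cutoff but closely resembles past good cases, so stage 2 moves it to the conditionally accepted category instead of rejecting it outright. A practical version would at least scale the features before computing distances, since unscaled income would otherwise dominate the similarity measure.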
To demonstrate the effectiveness of the proposed model, an analysis is run on a German dataset with the assistance of SAS Enterprise Miner. The results are expected to show not only that the model is a more effective credit scoring model, but also that it can enhance business revenues by lowering both Type I and Type II scoring errors.
Introduction
Data mining is a process that involves searching and analyzing data in order to find implicit yet highly valuable information.
It covers the selection, exploration, and modeling of large volumes of data with the aim of uncovering previously unrecognized patterns and, in the end, generating understandable information from huge databases. It employs an extensive range of computational techniques, including statistical analysis, decision trees, neural networks, rule induction and refinement, and graphic visualization. Among these, classification plays an important role in business decision making because of its extensive applications in financial forecasting, fraud detection, marketing strategy development, and credit scoring, to mention but a few.
The aim of developing credit scoring models is to help financial institutions identify good credit applicants, that is, those who are more likely to honor their debt obligations. Such systems are often based on multiple variables, including the applicant's age, credit limit, income level, and marital status, among others. Many distinct credit scoring models have been developed by financial institutions as well as researchers in an effort to solve this classification problem.
These include linear discriminant analysis (LDA), logistic regression (LR), multivariate adaptive regression splines, classification and regression trees, case-based reasoning, and artificial neural networks. Of these, linear discriminant analysis, logistic regression, and artificial neural networks are the ones most commonly used to construct credit scoring models. LDA is among the earliest forms of credit scoring model and enjoys widespread use across the globe. Nonetheless, it has often been criticized for assuming a linear relationship between the input variables and the output variables.
This assumption seldom holds, and LDA is also rather sensitive to deviations from the assumption of multivariate normality (West, 2000). Like LDA, LR is a common alternative for performing credit scoring assessments. In essence, the LR model has stood out as the best