The aim of the project is to identify risky loan applicants using exploratory data analysis (EDA), so that lending to such applicants can be reduced and the amount of credit loss cut down.
- You work for a consumer finance company which specialises in lending various types of loans to urban customers. When the company receives a loan application, it has to decide whether to approve the loan based on the applicant’s profile.
- This company is the largest online loan marketplace, facilitating personal loans, business loans, and financing of medical procedures. Borrowers can easily access lower interest rate loans through a fast online interface.
- As with most other lending companies, lending to ‘risky’ applicants is the largest source of financial loss (called credit loss). Credit loss is the amount of money lost by the lender when the borrower refuses to pay or runs away with the money owed. In other words, borrowers who default cause the largest amount of loss to the lenders. In this case, the customers labelled as 'charged-off' are the 'defaulters'.
- The dataset contains the complete loan data for all loans issued from 2007 to 2011.
- Bivariate analysis of grade against average default rate shows that consumers with grades A and B are the lowest-risk borrowers, and that risk keeps increasing as the grade goes from C to G.
- Consumers whose loan purpose is small_business are high-risk borrowers.
- Consumers charged a high rate of interest are more likely to default.
- Surprisingly, consumers with verification_status as Verified default the most.
- Consumers with home_ownership as OTHER are high-risk borrowers.
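
The bivariate comparisons above all follow the same pattern: flag each charged-off loan as a default, then average that flag within each category. The following is a minimal sketch of that pattern, not the project's actual notebook code. The column name `loan_status`, the label `"Charged Off"`, the helper `default_rate_by`, and the sample rows are all assumptions made for illustration; the rows are made up and are not results from the dataset.

```python
import pandas as pd

# Made-up sample rows for illustration only (not taken from the real dataset).
# The column names "grade" and "loan_status" are assumptions.
loans = pd.DataFrame({
    "grade": ["A", "A", "B", "B", "C", "C", "G", "G"],
    "loan_status": [
        "Fully Paid", "Fully Paid", "Fully Paid", "Charged Off",
        "Charged Off", "Fully Paid", "Charged Off", "Charged Off",
    ],
})


def default_rate_by(df, column, status_col="loan_status",
                    default_label="Charged Off"):
    """Return the share of defaulted loans for each value of `column`.

    Hypothetical helper: a loan counts as a default when its status
    equals `default_label` (assumed spelling of the charged-off label).
    """
    is_default = df[status_col].eq(default_label)  # boolean flag per loan
    # The mean of a boolean flag within a group is that group's default rate.
    return is_default.groupby(df[column]).mean().sort_index()


rates = default_rate_by(loans, "grade")
print(rates)
```

The same helper could be called with `purpose`, `verification_status`, or `home_ownership` in place of `grade`, assuming those columns exist under those names.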
- Python - version 3.11.7
- NumPy - version 1.26.4
- Pandas - version 2.1.4
- Matplotlib - version 3.8.0
- Seaborn - version 0.12.2
- This project was done in collaboration with Sandip Joshi.
- This project was based on learnings from the upGrad course on EDA.
Created by [@Saket89] - feel free to contact me!