import streamlit as st

from src.data_loader import load_data
from src.preprocessing import feature_engineering, encode_data
from src.model import train_model
from src.predict import make_prediction
from src.utils import calculate_emi
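
# -----------------------------------------------------------------------------
# SKETCH: AMORTIZED EMI (hypothetical helper, not called by the app)
# -----------------------------------------------------------------------------
# The Predict page estimates the EMI as principal / term, which ignores
# interest. For comparison only, this is a minimal sketch of the standard
# reducing-balance EMI formula. The name `amortized_emi` and the
# `annual_rate_pct` parameter are assumptions made for illustration; they are
# not part of `src.utils`, and the trained model does not use this value.
def amortized_emi(principal, annual_rate_pct, term_months):
    """Return the fixed monthly instalment for a reducing-balance loan."""
    if term_months <= 0:
        raise ValueError("term_months must be positive")
    monthly_rate = annual_rate_pct / 12 / 100
    if monthly_rate == 0:
        # With no interest the formula reduces to principal / term.
        return principal / term_months
    growth = (1 + monthly_rate) ** term_months
    return principal * monthly_rate * growth / (growth - 1)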

# -----------------------------------------------------------------------------
# PAGE CONFIG
# -----------------------------------------------------------------------------
st.set_page_config(page_title="Loan Prediction System", page_icon="🏦", layout="centered")

# -----------------------------------------------------------------------------
# NAVIGATION STATE
# -----------------------------------------------------------------------------
if 'page' not in st.session_state:
    st.session_state['page'] = 'Home'

c1, c2, c3, c4, c5 = st.columns([2,1,1,1,2])
with c2:
    if st.button("Home"):
        st.session_state['page'] = 'Home'
with c3:
    if st.button("Predict"):
        st.session_state['page'] = 'Predict'
with c4:
    if st.button("About"):
        st.session_state['page'] = 'About'

st.markdown("---")

# -----------------------------------------------------------------------------
# LOAD + TRAIN
# -----------------------------------------------------------------------------
DATA_PATH = "data/train.csv"

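@st.cache_resource
def load_and_train(path):
    """Load the data and fit the model once instead of on every Streamlit rerun."""
    # A missing dataset is cached too; restart the app after adding the file.
    data = load_data(path)
    if data is None:
        return None

    data = feature_engineering(data)
    data, encoders, target_encoder = encode_data(data)

    X = data.drop("Loan_Status", axis=1)
    y = data["Loan_Status"]
    return data, encoders, target_encoder, X.columns.tolist(), train_model(X, y)


artifacts = load_and_train(DATA_PATH)

if artifacts is not None:
    df, encoders, target_encoder, feature_columns, model = artifacts
else:
    df = None
    st.error("Dataset not found")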
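
# -----------------------------------------------------------------------------
# SKETCH: FORM VALUES -> DERIVED FEATURES (hypothetical helper, not called by the app)
# -----------------------------------------------------------------------------
# The Predict page derives several features inline before calling
# `make_prediction`. This is a minimal sketch of the same arithmetic as a pure
# function, which is easier to unit-test than code inside a Streamlit form.
# The name `derive_loan_features` and the `cibil_cutoff` parameter are
# assumptions; the default of 600 mirrors the threshold used on the Predict page.
def derive_loan_features(applicant_income, coapplicant_income, loan_amount,
                         loan_term_years, cibil_score, cibil_cutoff=600):
    """Return the derived numeric features as a dict keyed by column name."""
    if loan_term_years <= 0:
        raise ValueError("loan_term_years must be positive")
    loan_term_months = loan_term_years * 12
    total_income = applicant_income + coapplicant_income
    emi = loan_amount / loan_term_months  # flat instalment, no interest
    return {
        "LoanAmount": loan_amount / 1000,  # expressed in thousands
        "Loan_Amount_Term": loan_term_months,
        "Credit_History": 1.0 if cibil_score >= cibil_cutoff else 0.0,
        "Total_Income": total_income,
        "EMI": emi,
        "Balance_Income": total_income - emi,
    }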

# -----------------------------------------------------------------------------
# HOME PAGE
# -----------------------------------------------------------------------------
if st.session_state['page'] == "Home":
    st.title("Loan Approval Prediction System")
    st.subheader("Using Machine Learning")

    st.markdown("### About This Application")
    st.info("""
    This application uses a **Machine Learning** model to automate loan eligibility assessment.
    It analyzes key financial indicators, such as **Income, Credit History, and Loan Term**,
    and returns an instant, data-driven prediction.
    """)

    st.markdown("### Why Use This System?")
    col1, col2 = st.columns(2)

    with col1:
        st.success("⚡ **Real-Time Analysis**\n\nGet instant results without manual verification.")
        st.success("🛡️ **Financial Guardrails**\n\nDetects risky applications automatically.")

    with col2:
        st.success("🎯 **High Accuracy**\n\nUses a Random Forest model for reliable predictions.")
        st.success("💡 **Smart Suggestions**\n\nGives tips to improve approval chances.")

    st.markdown("---")

    st.markdown("""
    ### How It Works
    1. Open the **Predict** page
    2. Fill in the applicant details
    3. Click **Predict Status**
    4. View the result instantly
    """)

# -----------------------------------------------------------------------------
# PREDICT PAGE
# -----------------------------------------------------------------------------
elif st.session_state['page'] == "Predict":
    st.title("📋 Loan Application Form")

    if df is None:
        st.error("Dataset not found")
    else:
        with st.form("prediction_form"):
            st.subheader("Applicant Details")

            c1, c2 = st.columns(2)

            with c1:
                gender = st.selectbox("Gender", ["Male", "Female"])
                married = st.selectbox("Married", ["No", "Yes"])
                dependents = st.selectbox("Dependents", ["0", "1", "2", "3+"])
                education = st.selectbox("Education", ["Graduate", "Not Graduate"])
                self_employed = st.selectbox("Self Employed", ["No", "Yes"])

            with c2:
                applicant_income = st.number_input("Applicant Income (Monthly ₹)", min_value=0, value=5000)
                coapplicant_income = st.number_input("Co-Applicant Income (Monthly ₹)", min_value=0, value=0)
                loan_amount = st.number_input("Loan Amount (₹)", min_value=0, value=100000)
                # A term of at least one year keeps the EMI division well-defined.
                loan_term_years = st.number_input("Loan Term (Years)", min_value=1, value=15)
                property_area = st.selectbox("Property Area", ["Urban", "Semiurban", "Rural"])
                cibil_score = st.number_input("CIBIL Score", 300, 900, 750)

            submit = st.form_submit_button("Predict Status")

        if submit:
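            # Convert the form values to the units passed to the model:
            # loan amount in thousands, loan term in months.
            loan_amt_k = loan_amount / 1000
            loan_term_m = loan_term_years * 12
            total_income = applicant_income + coapplicant_income

            # Flat instalment in rupees (principal / months); interest is not included.
            model_emi = loan_amount / loan_term_m
            balance_income = total_income - model_emi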

            credit_history = 1.0 if cibil_score >= 600 else 0.0

            input_data = {
                "Gender": gender,
                "Married": married,
                "Dependents": dependents,
                "Education": education,
                "Self_Employed": self_employed,
                "ApplicantIncome": applicant_income,
                "CoapplicantIncome": coapplicant_income,
                "LoanAmount": loan_amt_k,
                "Loan_Amount_Term": loan_term_m,
                "Credit_History": credit_history,
                "Property_Area": property_area,
                "Total_Income": total_income,
                "EMI": model_emi,
                "Balance_Income": balance_income
            }

            result, confidence = make_prediction(
                input_data, model, encoders, target_encoder, feature_columns
            )

            st.markdown("### Result")

            if result == "Y":
                st.success(f"✅ Approved ({confidence:.2f}%)")
            else:
                st.error(f"❌ Rejected ({confidence:.2f}%)")

# -----------------------------------------------------------------------------
# ABOUT PAGE
# -----------------------------------------------------------------------------
elif st.session_state['page'] == "About":
    st.title("About the Project")

    # 1. PROBLEM
    st.error("""
    **The Problem: Manual Underwriting**
    
    Traditionally, banks relied on manual verification processes which had major disadvantages:
    - **High Turnaround Time:** It took days or weeks to process a single application.
    - **Human Bias:** Decisions often varied from officer to officer.
    - **Static Rules:** Simple rules failed to see the bigger picture.
    """)

    st.write("")

    # 2. SOLUTION
    st.success("""
    **The Solution: Intelligent Automation**
    
    This project replaces the manual process with a **Hybrid Machine Learning Architecture**. 
    It combines strict financial logic (Guardrails) with AI pattern recognition (Random Forest) 
    to make safer, faster decisions.
    """)

    st.write("")

    # 3. WORKFLOW
    st.markdown("### 🔄 Project Workflow")
    st.info("""
    This system was built in **4 key stages**:
    
    1. **Data Analysis (Jupyter Notebook):** Data cleaning and preprocessing  
    2. **Model Training:** Random Forest model (~81% accuracy)  
    3. **Backend Logic:** Financial guardrails implementation  
    4. **Frontend:** Streamlit UI for user interaction  
    """)

    st.divider()

    # 4. TECH SPECS
    st.subheader("🛠️ Technical Architecture")

    c1, c2 = st.columns(2)

    with c1:
        st.markdown("**Machine Learning:**")
        st.caption("""
        - Algorithm: Random Forest Classifier  
        - Trees: 200 Estimators  
        - Accuracy: ~81%  
        """)

    with c2:
        st.markdown("**Tech Stack:**")
        st.caption("""
        - Python  
        - Streamlit  
        - Pandas  
        - Scikit-learn  
        """)