Scoring Risk of Default Using Banking Transaction Data (DSC Capstone Project)
- Developed a cash score model for assessing the credit risk of first-time loan and credit card applicants.
- Led data analysis, income estimation, and feature derivation processes, ensuring robust risk assessment.
- Achieved 84% accuracy and 0.87 AUC with an XGBoost model, identifying the top 40% of risky borrowers with a < 8% default rate.
- Provided actionable insights on top three default risk factors, contributing to enhanced lending decisions and inclusive practices.
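
For readers unfamiliar with the AUC figure quoted above, the metric can be read as the probability that a randomly chosen defaulter receives a higher risk score than a randomly chosen non-defaulter. The sketch below computes it by direct pair counting; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the project.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC by pair counting: share of (positive, negative) pairs ranked correctly.

    y_true holds 1 for a default and 0 otherwise; ties count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 3 of the 4 positive/negative pairs are ordered correctly.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

This quadratic version is only meant to show the definition; on a real dataset a library routine would be the practical choice.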
Sudoku Solver
- Used a backtracking algorithm to solve Sudoku puzzles.
- Skills: JavaScript, HTML, CSS
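
The backtracking approach named above can be sketched as follows. This is a minimal illustration written in Python rather than the project's JavaScript, and the helper names (`find_empty`, `is_valid`, `solve`) and the 0-for-empty grid encoding are assumptions, not the project's actual code.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # 9x9 grid; 0 marks an empty cell (assumed encoding)


def find_empty(grid: Grid) -> Optional[Tuple[int, int]]:
    """Return the first empty cell in row-major order, or None if the grid is full."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                return r, c
    return None


def is_valid(grid: Grid, r: int, c: int, d: int) -> bool:
    """Check that digit d does not already appear in row r, column c, or the 3x3 box."""
    if any(grid[r][j] == d for j in range(9)):
        return False
    if any(grid[i][c] == d for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(grid[i][j] != d for i in range(br, br + 3) for j in range(bc, bc + 3))


def solve(grid: Grid) -> bool:
    """Fill the grid in place; return True if a solution was found."""
    cell = find_empty(grid)
    if cell is None:
        return True  # no empty cells left: solved
    r, c = cell
    for d in range(1, 10):
        if is_valid(grid, r, c, d):
            grid[r][c] = d       # tentatively place the digit
            if solve(grid):
                return True
            grid[r][c] = 0       # undo the placement and try the next digit
    return False                 # no digit fits here: backtrack to the caller
```

Calling `solve(grid)` on a list of nine lists of nine integers modifies the grid in place and returns `False` when the puzzle has no solution.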
Status and Prospects of Data Science Careers
- Data visualization project focusing on data science job trends and salary growth.
- Target audience: individuals interested in data science careers, including data science students.
- Uses standard visualization techniques and follows a drill-down narrative structure, moving from an overview to specific aspects.
- Visualizations cover remote work trends, salary growth with experience, geographic factors, and job categories.
Predictive Analysis on Clothing Fit
- Developed a predictive model for clothing fit based on user measurements using Python.
- Conducted exploratory analysis to uncover insights on size distribution.
- Implemented baseline and enhanced models using scikit-learn and pandas, achieving high accuracy.
- Balanced the dataset to address class imbalance and improve model performance.
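
The balancing step above could take several forms; the project description does not say which was used. The sketch below shows one common option, random oversampling of minority classes, using only the standard library. The function name `oversample` and the example labels are hypothetical.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def oversample(
    rows: Sequence[object], labels: Sequence[Hashable], seed: int = 0
) -> Tuple[List[object], List[Hashable]]:
    """Duplicate randomly chosen minority-class rows until all classes match the largest."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, y in zip(rows, labels):
        by_class[y].append(row)
    target = max(len(members) for members in by_class.values())

    out_rows: List[object] = []
    out_labels: List[Hashable] = []
    for y, members in by_class.items():
        # Sample with replacement to fill the gap up to the majority-class size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(y)
    return out_rows, out_labels
```

Oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.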
Rank Prediction of New York City Police Officers based on Civilian Complaints
- Built a DecisionTreeClassifier model to predict the rank of New York City police officers from civilian complaints.
- Improved accuracy significantly, from 0.13 to 0.34, through feature engineering and model optimization.
- Conducted fairness analysis and performed a permutation test to assess potential biases in the model's performance.
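
A permutation test of the kind mentioned above can be sketched as follows. The test statistic used here, the absolute gap in accuracy between two groups, is an assumption chosen for illustration; the project description does not state which statistic or grouping was used, and the function name `permutation_test` is hypothetical.

```python
import random
from typing import Sequence, Tuple


def permutation_test(
    correct: Sequence[int], group: Sequence[int], n_perm: int = 2000, seed: int = 0
) -> Tuple[float, float]:
    """Return (observed accuracy gap, p-value) for a two-group permutation test.

    correct[i] is 1 if prediction i was right, else 0.
    group[i] is 0 or 1; both groups must be non-empty.
    """
    def accuracy_gap(assignment: Sequence[int]) -> float:
        a = [c for c, g in zip(correct, assignment) if g == 1]
        b = [c for c, g in zip(correct, assignment) if g == 0]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = accuracy_gap(group)
    rng = random.Random(seed)
    shuffled = list(group)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between group membership and correctness
        if accuracy_gap(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

A small p-value suggests the accuracy gap between the groups is unlikely to arise from chance assignment alone; a large one means the data give no evidence of a performance difference under this statistic.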