Machine Learning: M-Pesa | Shahzad Panthaki

Predicting the Factors and Likelihood of Mobile Money (M-PESA) Adoption in Kenya

Executive Summary

This machine learning project predicts the likelihood of mobile money adoption in Kenya using real-world survey data from 2,282 households. The work covered Linear Discriminant Analysis, a decision tree, and several other classifiers: Logistic Regression, Random Forest, and K-Nearest Neighbors (KNN). Key findings include the significance of economic indicators such as per capita consumption and total wealth in predicting mobile money adoption. The Logistic Regression model emerged as the best classifier, combining high interpretability with robust classification performance. Based on the analysis, the recommendations are to expand mobile infrastructure and financial education to boost mobile money adoption.

Business Problem

In Kenya, mobile money services like M-PESA are vital for financial inclusion, especially among populations without access to traditional banking. The government seeks to increase mobile money adoption to enhance financial inclusion. The objective is to identify key factors influencing mobile money adoption and provide actionable recommendations to the government for targeted policy design.

Methodologies

Literature Review: Analyzed existing research on mobile money adoption.
Data Collection: Used survey data from 2,282 Kenyan households.
Descriptive Statistics: Presented summary statistics for key variables.
Model Building: Trained Logistic Regression, Random Forest, Linear Discriminant Analysis, and K-Nearest Neighbors (KNN) classifiers.
Model Evaluation: Compared models using accuracy and Area Under the Curve (AUC) metrics.
Feature Importance: Assessed feature importance using the Random Forest model.
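
The model building, evaluation, and feature importance steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is a synthetic stand-in generated with make_classification, and the train/test split, hyperparameters, and random seeds are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the household survey (assumption: NOT the real data).
X, y = make_classification(
    n_samples=2282, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with StandardScaler.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Linear Discriminant Analysis": LinearDiscriminantAnalysis(),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=7)),
}

# Fit each model and record test-set accuracy and AUC.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    positive_proba = model.predict_proba(X_test)[:, 1]
    results[name] = (
        accuracy_score(y_test, predictions),
        roc_auc_score(y_test, positive_proba),
    )
    print(f"{name}: accuracy={results[name][0]:.4f}, AUC={results[name][1]:.4f}")

# Impurity-based feature importance from the fitted Random Forest,
# sorted from most to least important (feature indices, not survey names).
importances = models["Random Forest"].feature_importances_
ranked = sorted(enumerate(importances), key=lambda pair: pair[1], reverse=True)
for index, score in ranked:
    print(f"feature_{index}: {score:.3f}")
```

With the real survey data, the feature indices would be replaced by the survey's column names, and the printed metrics would differ from the figures reported on this page.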

Skills

Programming Languages: Python
Libraries and Tools: pandas, scikit-learn, matplotlib, seaborn
Statistical Techniques: Logistic Regression, Random Forest, Linear Discriminant Analysis, K-Nearest Neighbors
Data Processing: StandardScaler for feature standardization (zero mean, unit variance)
Visualization: Feature importance plots, decision tree diagrams
Model Evaluation: Accuracy, AUC, cross-validation
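
As a rough sketch of how StandardScaler and cross-validation might fit together, the example below places the scaler inside a scikit-learn pipeline so that it is re-fitted on each training fold only. The data is synthetic, and the column scales, the 5-fold setup, and the choice of Logistic Regression are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption), with columns stretched to very different
# scales, the way wealth and years of education would differ in a survey.
X, y = make_classification(
    n_samples=2282, n_features=6, n_informative=4, random_state=0
)
X = X * np.array([1.0, 10.0, 100.0, 1000.0, 1.0, 1.0])

# StandardScaler rescales every column to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)
print("column means:", X_scaled.mean(axis=0).round(6))
print("column stds: ", X_scaled.std(axis=0).round(6))

# Inside a pipeline, the scaler is fitted on the training part of each
# fold only, so no information leaks from the held-out part.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

accuracy_scores = cross_val_score(pipeline, X, y, cv=folds, scoring="accuracy")
auc_scores = cross_val_score(pipeline, X, y, cv=folds, scoring="roc_auc")

print(f"mean accuracy: {accuracy_scores.mean():.4f}")
print(f"mean AUC:      {auc_scores.mean():.4f}")
```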

Results & Business Recommendation

Key Findings & Recommendations

Mobile money adoption in Kenya is strongly influenced by per capita consumption, total wealth, and the household head's years of education.
Households owning cell phones are more likely to use mobile money, highlighting the importance of mobile phone accessibility.
The Logistic Regression model, whose coefficients show how each feature relates to adoption, is the best classifier for predicting mobile money adoption: it reaches an accuracy of approximately 81.46% and the highest reported AUC (0.8587) while remaining easy to interpret.
Regularization in the logistic model highlighted the significance of certain features, ensuring a balance between model complexity and performance.
Further expanding mobile infrastructure and financial education could potentially boost M-PESA adoption rates in underrepresented areas and demographics.
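
The regularization finding above can be illustrated with a small sketch. The source does not say which penalty was used, so the L1 (lasso) penalty here is an assumption, chosen because it drives the coefficients of weak features to exactly zero. The data is synthetic and the values of C are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 3 informative features and 7 pure-noise features.
X, y = make_classification(
    n_samples=2282,
    n_features=10,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)
X = StandardScaler().fit_transform(X)

# In scikit-learn, a smaller C means stronger regularization.
nonzero_counts = {}
for C in (0.005, 0.05, 1.0):
    model = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    model.fit(X, y)
    nonzero_counts[C] = int(np.count_nonzero(model.coef_))
    print(f"C={C}: {nonzero_counts[C]} non-zero coefficients")
```

Features whose coefficients survive strong regularization are the ones the model relies on most, which is one way a regularized model can balance complexity against performance.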

Model Performance

Logistic Regression: Accuracy = 81.46%, AUC = 0.8587
Random Forest: Accuracy = 81.89%, AUC = 0.8409
Linear Discriminant Analysis: Accuracy = 80.13%, AUC = 0.8527
KNN: Best accuracy with K=7, average accuracy = 82.85%
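
One common way to choose K for KNN is to compare mean cross-validated accuracy across candidate values. The sketch below assumes that approach; the source does not state how K=7 was selected. It uses synthetic data, odd values of K from 1 to 21, and 5 folds, all of which are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the survey data (assumption: NOT the real data).
X, y = make_classification(
    n_samples=2282, n_features=8, n_informative=5, random_state=0
)

# Mean 5-fold accuracy for each odd K; odd values avoid tied votes
# in binary classification.
mean_accuracy = {}
for k in range(1, 22, 2):
    knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    scores = cross_val_score(knn, X, y, cv=5, scoring="accuracy")
    mean_accuracy[k] = scores.mean()
    print(f"K={k}: mean accuracy={mean_accuracy[k]:.4f}")

best_k = max(mean_accuracy, key=mean_accuracy.get)
print(f"best K: {best_k}")
```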

Next Steps

Data Enrichment: Collect more detailed data on user demographics and behavioral patterns.
Model Enhancement: Explore more advanced machine learning models and ensemble techniques.
Policy Simulation: Use predictive models to simulate the impact of various policy interventions on mobile money adoption rates.
Longitudinal Study: Conduct a longitudinal study to track changes in mobile money adoption over time.
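
The policy simulation idea could look roughly like the sketch below: fit a model, change one feature for every household, and compare the average predicted adoption rate. Everything here is hypothetical, including the column names owns_phone, education_years, and uses_mpesa and the process used to generate the data. The result reflects associations learned by the model, not a causal estimate.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical survey-like data (assumption: invented for illustration).
rng = np.random.default_rng(0)
n_households = 2282
data = pd.DataFrame(
    {
        "owns_phone": rng.integers(0, 2, n_households),
        "education_years": rng.integers(0, 17, n_households),
    }
)
# Invented data-generating process: adoption is more likely with a phone
# and with more years of education.
log_odds = -2.0 + 1.5 * data["owns_phone"] + 0.15 * data["education_years"]
adoption_probability = 1.0 / (1.0 + np.exp(-log_odds))
data["uses_mpesa"] = (rng.random(n_households) < adoption_probability).astype(int)

features = ["owns_phone", "education_years"]
model = LogisticRegression(max_iter=1000).fit(data[features], data["uses_mpesa"])

# Baseline: average predicted adoption under the observed data.
baseline_rate = model.predict_proba(data[features])[:, 1].mean()

# Scenario: every household owns a phone, everything else unchanged.
scenario = data[features].copy()
scenario["owns_phone"] = 1
scenario_rate = model.predict_proba(scenario)[:, 1].mean()

print(f"baseline predicted adoption: {baseline_rate:.3f}")
print(f"scenario predicted adoption: {scenario_rate:.3f}")
```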