Employee Retention Analysis

Project Title

Predictive Modeling of Employee Retention using HR Data

Project Overview

This project analyzes employee retention within a company using a dataset of employee-related features. The primary objective is to build predictive models that identify employees at risk of leaving. The analysis includes exploratory data analysis (EDA), statistical hypothesis testing, and the implementation of machine learning models. The resulting insights and models provide guidance for improving employee retention strategies.

Skills and Libraries

Skills:

  • Exploratory Data Analysis (EDA)
  • Statistical Hypothesis Testing
  • Machine Learning Modeling
  • Hyperparameter Tuning
  • Model Evaluation and Interpretation
  • Business Problem Solving

Libraries:

  • Pandas
  • NumPy
  • Seaborn
  • Matplotlib
  • Scikit-learn
  • SciPy
  • XGBoost

Business Understanding

Retaining valuable employees is crucial for organizational success. High turnover can lead to increased costs, loss of institutional knowledge, and decreased overall productivity. This project addresses the business problem of identifying the factors that influence employee retention, providing actionable insights for HR and management.

Data Understanding

The analysis uses the HR_comma_sep.csv dataset from Kaggle, which includes information about employee satisfaction, project involvement, working hours, tenure, salary, and department. The timeframe of the data and any data limitations are considered during the analysis. Exploratory data analysis (EDA) visualizations provide insights into the relationships between different variables and their impact on employee retention.
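As a rough illustration of the kind of EDA summary described above, the sketch below computes headcount and attrition rate per category. The column names `left` (1 if the employee left) and `salary` are assumptions, since this README does not list the dataset's columns, and `attrition_summary` is a hypothetical helper rather than a function from the notebook.

```python
import pandas as pd


def attrition_summary(df: pd.DataFrame, target: str = "left", by: str = "salary") -> pd.DataFrame:
    """Return headcount and attrition rate for each category of `by`.

    `target` is assumed to be a 0/1 column where 1 means the employee left.
    Both default column names are assumptions about the CSV layout.
    """
    summary = df.groupby(by)[target].agg(employees="count", attrition_rate="mean")
    # Highest-risk groups first.
    return summary.sort_values("attrition_rate", ascending=False)
```

With the CSV in the working directory, something like `attrition_summary(pd.read_csv("HR_comma_sep.csv"), by="salary")` would give a first view of how attrition varies by salary band, provided the assumed column names match the file.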

Modeling and Evaluation

The project employs machine learning models, including Logistic Regression, Random Forest, and XGBoost, to predict employee retention. Hyperparameters are tuned with GridSearchCV to improve model performance. Precision, recall, F1-score, accuracy, and AUC are computed to assess model effectiveness.
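The tuning and evaluation workflow described above could look roughly like the following. This is a minimal sketch using a Random Forest only. The parameter grid, the 75/25 split, the 3-fold cross-validation, and F1 as the tuning metric are illustrative assumptions, not the notebook's actual settings, and `tune_and_evaluate` is a hypothetical helper name.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import GridSearchCV, train_test_split


def tune_and_evaluate(X, y, param_grid=None, seed=0):
    """Tune a Random Forest with GridSearchCV and score it on a held-out split."""
    if param_grid is None:
        # Small illustrative grid (an assumption, not the notebook's grid).
        param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

    # Stratify so the held-out set keeps the same class balance as the full data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid,
        scoring="f1",
        cv=3,
    )
    search.fit(X_train, y_train)

    # GridSearchCV refits the best estimator by default, so it can predict directly.
    pred = search.predict(X_test)
    proba = search.predict_proba(X_test)[:, 1]

    return {
        "best_params": search.best_params_,
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
        "accuracy": accuracy_score(y_test, pred),
        "auc": roc_auc_score(y_test, proba),
    }
```

Here `X` is expected to be a numeric feature matrix, so categorical columns such as salary or department would need encoding first, and `y` a binary target. The same pattern should carry over to Logistic Regression or XGBoost by swapping the estimator and grid.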

Conclusion

The analysis identifies key factors influencing employee retention and supports targeted recommendations: addressing workload concerns, improving job satisfaction, and investigating salary disparities. Future steps may involve continuous monitoring, refining models, and implementing targeted interventions based on evolving workforce dynamics.

Feel free to explore the Jupyter Notebook for a detailed walkthrough of the analysis, visualizations, and model implementations.