Welcome to the My Awesome EDA (Exploratory Data Analysis) Module! This Python module provides a set of tools for exploring and analyzing your dataset. Whether you're a data scientist, analyst, or enthusiast, this module will help you gain insights into your data quickly and efficiently.
- Welcome Gif: A fun welcome gif to kick off your exploration.
- Basic Dataset Information: Quickly get an overview of the number of observations (rows) and parameters (columns) in your dataset.
- Data Type Summary: Understand the data types of each column in your dataset.
- Categorization of Features: Categorize features as numerical, string, or categorical based on a unique-value threshold.
- Summary Statistics: Get descriptive statistics for numerical features, including mean, standard deviation, minimum, 25th percentile, median, 75th percentile, and maximum values.
- Outlier Detection: Identify outliers in numerical features using the interquartile range (IQR) method.
- Missing Values Analysis: Investigate missing values in your dataset, including total missing values, rows with missing values, and columns with missing values.
- Duplicate Rows Detection: Identify duplicate rows in your dataset.
- Visualizations: Generate informative visualizations including bar plots of missing values by variable, correlation heatmap for numerical features, and histograms with boxplots for numerical features.
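To give a feel for what several of these checks involve, here is a minimal pandas sketch of feature categorization, IQR-based outlier detection, and missing-value and duplicate counts. It is an illustration only, not the module's actual implementation: the function names, the `unique_threshold` and `k` parameters, and the sample data are all hypothetical, and the rule that a column is categorical when it has at most `unique_threshold` distinct values is an assumption about how the threshold is applied.

```python
import pandas as pd


def categorize_features(df, unique_threshold=10):
    """Split columns into numerical, string, and categorical groups.

    Assumption: a column is treated as categorical when it has at most
    `unique_threshold` distinct non-null values (hypothetical rule).
    """
    groups = {"numerical": [], "string": [], "categorical": []}
    for col in df.columns:
        if df[col].nunique() <= unique_threshold:
            groups["categorical"].append(col)
        elif pd.api.types.is_numeric_dtype(df[col]):
            groups["numerical"].append(col)
        else:
            groups["string"].append(col)
    return groups


def iqr_outliers(series, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    mask = (series < q1 - k * iqr) | (series > q3 + k * iqr)
    return series[mask]


def missing_and_duplicates(df):
    """Count missing cells, affected rows and columns, and duplicate rows."""
    na = df.isna()
    return {
        "total_missing": int(na.sum().sum()),
        "rows_with_missing": int(na.any(axis=1).sum()),
        "columns_with_missing": [c for c in df.columns if na[c].any()],
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Small made-up dataset: one extreme age, one missing city, one repeated row.
df = pd.DataFrame({
    "age": [23, 25, 24, 26, 25, 120, 25],
    "city": ["Oslo", "Rome", None, "Rome", "Lima", "Oslo", "Lima"],
    "name": ["a", "b", "c", "d", "e", "f", "e"],
})

print(categorize_features(df, unique_threshold=3))
print(iqr_outliers(df["age"]).tolist())
print(missing_and_duplicates(df))
```

The multiplier `k=1.5` is the conventional choice for the IQR rule; a larger value flags only more extreme points.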
pip install myawesomeeda
from my_awesome_eda import run_eda
- A demonstration Python notebook is available in the demo.ipynb file.
🔗 Visit the MyAwesomeEDA wiki page.
Contributions are welcome! If you have any ideas, bug fixes, or enhancements, feel free to open an issue or submit a pull request.
For inquiries or support, feel free to contact me via email.
Happy data exploring! 💻🧐