Metadata-Version: 2.4
Name: shahEDA
Version: 0.1.1
Summary: Lightweight EDA tool for automated data analysis with insights and visualizations
Author: Syed Haseeb Shah
Project-URL: Homepage, https://github.com/codingsheep17/shahEDA.git
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: pandas>=1.5
Requires-Dist: openpyxl>=3.1

# shahEDA 🚀
Lightweight Automated Exploratory Data Analysis Tool

---
## 📌 Overview
**shahEDA** is a lightweight Python package that automates the process of Exploratory Data Analysis (EDA).
It helps you quickly understand your dataset by generating structured insights, statistics, and suggestions in a single run.

---
## ⚡ Features
* 📂 Load datasets (CSV supported)
* 📊 Basic dataset overview (shape, columns, data types)
* ❌ Missing values detection
* 🔁 Duplicate detection
* 📈 Statistical summaries (mean, std, quartiles, etc.)
* 🧠 Smart suggestions for preprocessing & ML readiness
* 🎛️ Optional verbose and display modes
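
For orientation, the checks listed above correspond roughly to a handful of plain pandas calls. The snippet below is a minimal sketch of those calls, not shahEDA's actual implementation. The sample data and the variable names (`csv_text`, `overview`, `missing`, `n_duplicates`, `stats`) are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for "your_dataset.csv".
csv_text = """name,age,city
Ana,34,Lisbon
Ben,,Oslo
Ana,34,Lisbon
Cy,29,Oslo
"""

# Load the dataset (CSV).
df = pd.read_csv(io.StringIO(csv_text))

# Basic overview: shape, column names, data types.
overview = {
    "shape": df.shape,
    "columns": list(df.columns),
    "dtypes": df.dtypes.astype(str).to_dict(),
}

# Missing values per column.
missing = df.isna().sum().to_dict()

# Number of fully duplicated rows (first occurrence is not counted).
n_duplicates = int(df.duplicated().sum())

# Statistical summary (count, mean, std, min, quartiles, max) of numeric columns.
stats = df.describe()

print(overview)
print(missing)
print(n_duplicates)
print(stats)
```

Running the same calls on your own file before using the package can be a quick way to cross-check the report it prints.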

---
## 🛠️ Installation
Clone the repository:

    git clone https://github.com/codingsheep17/shahEDA.git
    cd shahEDA

Install dependencies:

    pip install -r requirements.txt

---
## 🚀 Usage
### 🔹 Simple Usage (Recommended)
    from shahEDA import analyze
    results = analyze("your_dataset.csv", verbose=True, display=True)

---
### 🔹 Advanced Usage
    from shahEDA import AnalyzerEDA
    eda = AnalyzerEDA("your_dataset.csv", verbose=True, display=True)
    results = eda.run()

---
## 📊 Example Output
    Loaded file in csv format
    === BASIC INFO ===
    Shape: (100, 14)
    Columns: [...]

    === MISSING VALUES ===
    No missing values found

    === DUPLICATES ===
    0 duplicates

    === STATISTICS ===
    (mean, std, quartiles...)

    === SUGGESTIONS ===
    - Encode categorical columns for ML
    - Check feature distributions

---
## 🧠 Suggestions System
The tool automatically identifies:
* Categorical columns → suggests encoding
* Data readiness for ML
* Potential preprocessing steps
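
As a rough illustration of how rule-based suggestions like these could be derived, here is a minimal sketch using pandas. It is an assumption about the general approach, not the package's real logic. The function name `suggest_preprocessing` and the exact rules and messages are hypothetical.

```python
from typing import List

import pandas as pd
from pandas.api.types import is_numeric_dtype


def suggest_preprocessing(df: pd.DataFrame) -> List[str]:
    """Return simple, rule-based preprocessing hints (hypothetical helper)."""
    suggestions = []

    # Simplification: every non-numeric column is treated as categorical,
    # which would also flag datetime or free-text columns.
    non_numeric = [col for col in df.columns if not is_numeric_dtype(df[col])]
    if non_numeric:
        suggestions.append(f"Encode categorical columns for ML: {non_numeric}")

    # Columns that contain at least one missing value.
    cols_with_missing = [col for col in df.columns if df[col].isna().any()]
    if cols_with_missing:
        suggestions.append(f"Handle missing values in: {cols_with_missing}")

    # Fully duplicated rows.
    if df.duplicated().any():
        suggestions.append("Drop duplicate rows")

    if not suggestions:
        suggestions.append("No obvious preprocessing needed")
    return suggestions


# Made-up sample data: one text column, one missing value, one duplicate row.
sample = pd.DataFrame({"city": ["Oslo", "Lisbon", "Oslo"], "age": [29, None, 29]})
suggestions = suggest_preprocessing(sample)
for line in suggestions:
    print("-", line)
```

Keeping each rule as an independent check makes it easy to add or remove hints without touching the others.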

---
## 📁 Project Structure
    shahEDA/
    │
    ├── __init__.py        # Public API
    ├── pipeline.py        # Main Analyzer class
    ├── _loader.py         # Data loading
    ├── _eda_engine.py     # Core EDA logic
    ├── _stats.py          # Statistics

---
## 🎯 Goal
To simplify and automate EDA so developers and data enthusiasts can:
* Save time
* Focus on insights
* Prepare data faster for ML models

---
## 🚧 Future Improvements
* 📊 Advanced visualizations
* 📉 Outlier detection
* 🔍 Correlation analysis
* 📦 PyPI package release

---
## 👨‍💻 Author
**Syed Haseeb Shah**
* GitHub: https://github.com/codingsheep17

---
## ⭐ Support
If you find this useful, consider giving it a ⭐ on GitHub!
