Metadata-Version: 2.1
Name: tab-synthgen
Version: 0.1.0
Summary: A private package bundling RCTGAN for tabular synthetic data generation.
Home-page: https://github.com/PriyeshDave/SynthGen
Author: Priyesh Dave
Author-email: genaiwork6@gmail.com
License: MIT
Description: # 🚀 SyntheGen – A Framework for Synthetic Data Generation  
        
        **SyntheGen** is an **ML/DL-based framework** for generating **synthetic tabular datasets** that aim to preserve the statistical properties of the real data they are modeled on. Its **Streamlit** UI lets you explore a dataset interactively and generate synthetic data from it.  
        
        ---
        
        ## 🎯 Features  
        
        ✅ **Upload Real Tabular Data** – Supports numerical & categorical features  
        ✅ **Visualize Data Distributions** – Gaussian plots, box plots, violin plots, categorical distributions  
        ✅ **Generate Synthetic Data** – Uses ML/DL models such as **CTGAN, TVAE, and Gaussian Copula**  
        ✅ **Compare Real vs. Synthetic Data** – Side-by-side visualization of distributions  
        ✅ **Download Synthetic Datasets** – Export the generated data for ML training & analysis  
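        
        As a rough illustration of the generate-and-compare idea (not SyntheGen's actual implementation), the sketch below uses only the Python standard library. It swaps in a naive independent-Gaussian sampler for the CTGAN, TVAE, and Gaussian Copula models, so it handles numeric columns only and ignores correlations between them. The function names `fit_columns`, `sample_rows`, and `mean_gap` are hypothetical, and the toy dataset is made up.
        
        ```python
        import random
        import statistics
        
        
        def fit_columns(rows):
            """Estimate (mean, standard deviation) for every numeric column."""
            params = {}
            for col in rows[0]:
                values = [row[col] for row in rows]
                params[col] = (statistics.mean(values), statistics.stdev(values))
            return params
        
        
        def sample_rows(params, n, seed=0):
            """Draw n synthetic rows, sampling each column independently."""
            rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
            return [
                {col: rng.gauss(mu, sigma) for col, (mu, sigma) in params.items()}
                for _ in range(n)
            ]
        
        
        def mean_gap(real, synthetic):
            """Absolute difference between real and synthetic column means."""
            gaps = {}
            for col in real[0]:
                real_mean = statistics.mean([row[col] for row in real])
                synth_mean = statistics.mean([row[col] for row in synthetic])
                gaps[col] = abs(real_mean - synth_mean)
            return gaps
        
        
        # Toy "real" dataset (made-up values, for illustration only).
        real = [{"age": 20 + i, "income": 30000 + 1500 * i} for i in range(40)]
        synthetic = sample_rows(fit_columns(real), n=2000)
        print(mean_gap(real, synthetic))
        ```
        
        With real data, one of the models listed above would take the place of `fit_columns` and `sample_rows`; a per-column comparison like `mean_gap` is only the simplest possible check of how closely the synthetic data tracks the original.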
        
        ---
        
        ## 🛠️ Tech Stack  
        
        - **Python 3.9**  
        - **Streamlit** (for interactive UI)  
        - **SDV (Synthetic Data Vault)** – CTGAN, TVAE, Gaussian Copula  
        - **Pandas, Seaborn, Matplotlib** (for statistical analysis & visualization)  
        
        ---
        
        ## 📦 Installation  
        
        1️⃣ Clone the repository:  
        ```bash
        git clone https://github.com/your-repo/synthegen.git
        cd synthegen
        ```
        
        2️⃣ Install dependencies:  
        ```bash
        pip install -r requirements.txt
        ```
        
        3️⃣ Run the Streamlit app:  
        ```bash
        streamlit run app.py
        ```
        
        ---
        
        ## 📌 Usage  
        
        1. Upload your tabular dataset (CSV format)
        2. View statistical distributions of your data
        3. Generate synthetic data using advanced ML models
        4. Compare real vs. synthetic data distributions
        5. Download the generated dataset
        
        ---
        
        ## 🔮 Future Enhancements  
        
        ✅ **Text Data Generation Support** – Placeholder already added for easy expansion  
        ✅ **Customizable Model Selection** – Choose from different synthetic data models  
        ✅ **Advanced Outlier Handling & Feature Engineering** – More robust pre-processing methods  
        
        ---
        
        ## 🤝 Contributing  
        
        We welcome contributions! Feel free to:
        
        - Report issues by opening a GitHub issue
        - Submit PRs with improvements & feature additions
        - Suggest ideas for enhancements
        
        ---
        
        ## 📜 License  
        
        This project is licensed under the MIT License – see the LICENSE file for details.
        
        ---
        
        ## 📧 Contact  
        
        For any questions or suggestions, reach out via:
        
        - 📩 Email: genaiwork6@gmail.com
        - 🌐 GitHub: https://github.com/PriyeshDave
        
        🚀 Let’s redefine synthetic data generation!
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
