Metadata-Version: 2.4
Name: metacluster
Version: 1.3.0
Summary: MetaCluster: An Open-Source Python Library for Metaheuristic-based Clustering Problems
Home-page: https://github.com/thieu1995/metacluster
Author: Thieu
Author-email: nguyenthieu2102@gmail.com
License: GPLv3
Project-URL: Documentation, https://metacluster.readthedocs.io/
Project-URL: Source Code, https://github.com/thieu1995/metacluster
Project-URL: Bug Tracker, https://github.com/thieu1995/metacluster/issues
Project-URL: Change Log, https://github.com/thieu1995/metacluster/blob/master/ChangeLog.md
Project-URL: Forum, https://t.me/+fRVCJGuGJg1mNDg1
Keywords: clustering,optimization,k-center clustering,data points,centers,euclidean distance,maximum distance,NP-hard,greedy algorithm,approximation algorithm,covering problem,computational complexity,geometric algorithms,machine learning,pattern recognition,spatial analysis,graph theory,mathematical optimization,dimensionality reduction,mutual information,correlation-based feature selection,Genetic algorithm (GA),Particle swarm optimization (PSO),Ant colony optimization (ACO),Differential evolution (DE),Simulated annealing,Grey wolf optimizer (GWO),Whale Optimization Algorithm (WOA),confusion matrix,recall,precision,accuracy,K-Nearest Neighbors,pearson correlation coefficient (PCC),spearman correlation coefficient (SCC),multi-objectives optimization problems,Stochastic optimization,Global optimization,Convergence analysis,Search space exploration,Local search,Computational intelligence,Robust optimization,Performance analysis,Intelligent optimization,Simulations
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: System :: Benchmark
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Scientific/Engineering :: Visualization
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Software Development :: Build Tools
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy<=1.26.0,>=1.17.1
Requires-Dist: scipy>=1.7.1
Requires-Dist: scikit-learn>=1.0.2
Requires-Dist: pandas>=1.3.5
Requires-Dist: mealpy>=3.0.1
Requires-Dist: permetrics>=1.5.0
Requires-Dist: plotly>=5.10.0
Requires-Dist: kaleido>=0.2.1
Provides-Extra: dev
Requires-Dist: pytest>=7.1.2; extra == "dev"
Requires-Dist: twine>=4.0.1; extra == "dev"
Requires-Dist: pytest-cov==4.0.0; extra == "dev"
Requires-Dist: flake8>=4.0.1; extra == "dev"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: project-url
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary


<p align="center">
<img style="max-width:100%;" src="https://thieu1995.github.io/post/2023-08/MetaCluster-01.png" alt="MetaCluster"/>
</p>

---

[![GitHub release](https://img.shields.io/badge/release-1.3.0-yellow.svg)](https://github.com/thieu1995/metacluster/releases)
[![Wheel](https://img.shields.io/pypi/wheel/metacluster.svg)](https://pypi.python.org/pypi/metacluster)
[![PyPI version](https://badge.fury.io/py/metacluster.svg)](https://badge.fury.io/py/metacluster)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/metacluster.svg)
![PyPI - Status](https://img.shields.io/pypi/status/metacluster.svg)
[![Downloads](https://static.pepy.tech/badge/MetaCluster)](https://pepy.tech/project/MetaCluster)
[![Tests & Publishes to PyPI](https://github.com/thieu1995/metacluster/actions/workflows/publish-package.yaml/badge.svg)](https://github.com/thieu1995/metacluster/actions/workflows/publish-package.yaml)
![GitHub Release Date](https://img.shields.io/github/release-date/thieu1995/metacluster.svg)
[![Documentation Status](https://readthedocs.org/projects/metacluster/badge/?version=latest)](https://metacluster.readthedocs.io/en/latest/?badge=latest)
[![Chat](https://img.shields.io/badge/Chat-on%20Telegram-blue)](https://t.me/+fRVCJGuGJg1mNDg1)
![GitHub contributors](https://img.shields.io/github/contributors/thieu1995/metacluster.svg)
[![GitTutorial](https://img.shields.io/badge/PR-Welcome-%23FF8300.svg?)](https://git-scm.com/book/en/v2/GitHub-Contributing-to-a-Project)
[![DOI](https://zenodo.org/badge/670197315.svg)](https://zenodo.org/badge/latestdoi/670197315)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)


MetaCluster is the largest open-source library of nature-inspired optimization (metaheuristic) algorithms for
clustering problems in Python.

* **Free software:** GNU General Public License (GPL) V3 license
* **Provides 3 classes:** `MetaCluster`, `MhaKCentersClustering`, and `MhaKMeansTuner`
* **Total nature-inspired metaheuristic optimizers (Metaheuristic Algorithms)**: > 200 optimizers
* **Total objective functions (as fitness)**: > 40 objectives
* **Total supported datasets**: 48 datasets from scikit-learn, UCI, ELKI, KEEL...
* **Total performance metrics**: > 40 metrics
* **Total methods for detecting the K value**: >= 10 methods
* **Documentation:** https://metacluster.readthedocs.io/en/latest/
* **Python versions:** >= 3.7.x
* **Dependencies:** numpy, scipy, scikit-learn, pandas, mealpy, permetrics, plotly, kaleido



# Citation Request

Please include these citations if you plan to use this library:

```code
@article{VanThieu2023,
  author = {Van Thieu, Nguyen and Oliva, Diego and Pérez-Cisneros, Marco},
  title = {MetaCluster: An open-source Python library for metaheuristic-based clustering problems},
  journal = {SoftwareX},
  year = {2023},
  pages = {101597},
  volume = {24},
  DOI = {10.1016/j.softx.2023.101597},
}

@article{van2023mealpy,
  title={MEALPY: An open-source library for latest meta-heuristic algorithms in Python},
  author={Van Thieu, Nguyen and Mirjalili, Seyedali},
  journal={Journal of Systems Architecture},
  year={2023},
  publisher={Elsevier},
  doi={10.1016/j.sysarc.2023.102871}
}
```


# Installation

* Install the [current PyPI release](https://pypi.python.org/pypi/metacluster):
```bash
$ pip install metacluster
```

After installation, check the version:
```bash
$ python
>>> import metacluster
>>> metacluster.__version__
```

### Examples

We maintain a dedicated GitHub repository of examples at [MetaCluster_examples](https://github.com/thieu1995/MetaCluster_examples).

Let's go through some basic examples from here:

#### 1. First, load a dataset. You can use the datasets that ship with MetaCluster:

```python
# Load available dataset from MetaCluster
from metacluster import get_dataset

# Try unknown data
get_dataset("unknown")
# Enter: 1      -> This will list all available datasets

data = get_dataset("Arrhythmia")
```

* Or you can load your own dataset:

```python
import pandas as pd
from metacluster import Data

# Load X and y
# NOTE: MetaCluster accepts NumPy arrays only, hence the .values attribute
dataset = pd.read_csv('examples/dataset.csv', index_col=0).values
X, y = dataset[:, 0:-1], dataset[:, -1]
data = Data(X, y, name="my-dataset")
```

#### 2. Next, scale your features

**You should confirm that your dataset is scaled and normalized**

```python
# Pick ONE of the scalers below; running them all in sequence would rescale data.X repeatedly.
# MinMaxScaler
data.X, scaler = data.scale(data.X, method="MinMaxScaler", feature_range=(0, 1))

# StandardScaler 
data.X, scaler = data.scale(data.X, method="StandardScaler")

# MaxAbsScaler 
data.X, scaler = data.scale(data.X, method="MaxAbsScaler")

# RobustScaler 
data.X, scaler = data.scale(data.X, method="RobustScaler")

# Normalizer 
data.X, scaler = data.scale(data.X, method="Normalizer", norm="l2")   # "l1" or "l2" or "max"
```
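To make the effect of the first two scalers concrete, here is a minimal pure-Python sketch of min-max scaling and standardization applied to a single feature column. It only illustrates the standard formulas and is not MetaCluster's implementation. The function names `min_max_scale` and `standard_scale` are hypothetical, and the handling of constant columns is a choice made for this sketch.

```python
from statistics import fmean, pstdev


def min_max_scale(column, feature_range=(0.0, 1.0)):
    """Linearly map a feature column onto feature_range."""
    low, high = feature_range
    col_min, col_max = min(column), max(column)
    span = col_max - col_min
    if span == 0:
        # Constant feature: nothing to stretch, so pin it to the lower bound
        # (an assumption made for this sketch).
        return [low for _ in column]
    return [low + (x - col_min) * (high - low) / span for x in column]


def standard_scale(column):
    """Center a feature column on 0 and divide by its population standard deviation."""
    mean = fmean(column)
    std = pstdev(column)
    if std == 0:
        # Constant feature: centering leaves only zeros (assumption for this sketch).
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


feature = [2.0, 4.0, 6.0]
print(min_max_scale(feature))   # [0.0, 0.5, 1.0]
print(standard_scale(feature))  # values centered on 0 with unit variance
```

Min-max scaling keeps the relative spacing of values inside a fixed range, while standardization removes the mean and the scale, which matters for distance-based objectives where a feature with a large range would otherwise dominate.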


#### 3. Next, select the metaheuristic algorithms, their parameters, the list of objectives, and the list of performance metrics

```python
list_optimizer = ["BaseFBIO", "OriginalGWO", "OriginalSMA"]
list_paras = [
    {"name": "FBIO", "epoch": 10, "pop_size": 30},
    {"name": "GWO", "epoch": 10, "pop_size": 30},
    {"name": "SMA", "epoch": 10, "pop_size": 30}
]
list_obj = ["SI", "RSI"]
list_metric = ["BHI", "DBI", "DI", "CHI", "SSEI", "NMIS", "HS", "CS", "VMS", "HGS"]
```
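The example above suggests that `list_optimizer` and `list_paras` are matched by position, one parameter dictionary per optimizer. Assuming that is the case (it is inferred from the example, not a documented guarantee), a small check before a long run can catch a mismatch early. The helper `pair_optimizers` below is hypothetical and not part of MetaCluster.

```python
def pair_optimizers(list_optimizer, list_paras):
    """Pair each optimizer name with its parameter dict, failing early on mismatches."""
    if len(list_optimizer) != len(list_paras):
        raise ValueError(
            f"{len(list_optimizer)} optimizers but {len(list_paras)} parameter sets"
        )
    pairs = []
    for optimizer, paras in zip(list_optimizer, list_paras):
        if not isinstance(paras, dict):
            raise TypeError(f"parameters for {optimizer!r} must be a dict")
        pairs.append((optimizer, paras))
    return pairs


list_optimizer = ["BaseFBIO", "OriginalGWO", "OriginalSMA"]
list_paras = [
    {"name": "FBIO", "epoch": 10, "pop_size": 30},
    {"name": "GWO", "epoch": 10, "pop_size": 30},
    {"name": "SMA", "epoch": 10, "pop_size": 30},
]

# Print each optimizer next to the settings it would be run with.
for optimizer, paras in pair_optimizers(list_optimizer, list_paras):
    print(optimizer, paras)
```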

You can find all supported metaheuristic algorithms at https://github.com/thieu1995/mealpy,
and all supported clustering objectives and metrics at https://github.com/thieu1995/permetrics.

If you don't want to read the documentation, you can print all supported options with:

```python
from metacluster import MetaCluster 

# Get all supported methods and print them out
MetaCluster.get_support(name="all")
```


#### 4. Next, create an instance of the MetaCluster class and run it.

```python
model = MetaCluster(list_optimizer=list_optimizer, list_paras=list_paras, list_obj=list_obj, n_trials=3, seed=10)

model.execute(data=data, cluster_finder="elbow", list_metric=list_metric, save_path="history", verbose=False)

model.save_boxplots()
model.save_convergences()
```

As you can see, you can define different datasets and run them with the same model.
Remember to give each dataset a name, because the folder that holds its results is named after the dataset.
More examples can be found [here](/examples).
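
The `cluster_finder="elbow"` argument in step 4 refers to the elbow heuristic for choosing K. As a rough illustration of the idea, and not MetaCluster's actual implementation, the sketch below picks the K whose (K, SSE) point lies farthest below the straight line joining the first and last points of the curve. The function name `elbow_k` and the sample SSE values are made up for this example.

```python
def elbow_k(ks, sse):
    """Return the k whose (k, SSE) point is farthest below the chord of the curve."""
    if len(ks) != len(sse) or len(ks) < 3:
        raise ValueError("need at least three (k, SSE) pairs of equal length")
    k_span = ks[-1] - ks[0]
    sse_span = sse[0] - sse[-1]
    if k_span <= 0 or sse_span <= 0:
        raise ValueError("ks must increase and SSE must decrease overall")

    best_k, best_gap = ks[0], 0.0
    for k, value in zip(ks, sse):
        x = (k - ks[0]) / k_span           # 0 at the first k, 1 at the last
        y = (value - sse[-1]) / sse_span   # 1 at the first SSE, 0 at the last
        # After normalization the chord is the line x + y = 1, so this gap is
        # proportional to the point's distance below the chord.
        gap = 1.0 - x - y
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k


# Illustrative values only: SSE drops sharply until k = 3, then flattens.
ks = [1, 2, 3, 4, 5, 6]
sse = [100.0, 40.0, 15.0, 12.0, 10.0, 9.0]
print(elbow_k(ks, sse))  # 3
```

Normalizing both axes first keeps the result independent of the units of the SSE values, which is why the sketch rescales before measuring the gap.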


# Support 

### Official links (questions, problems)

* Official source code repo: https://github.com/thieu1995/metacluster
* Official document: https://metacluster.readthedocs.io/
* Download releases: https://pypi.org/project/metacluster/
* Issue tracker: https://github.com/thieu1995/metacluster/issues
* Notable changes log: https://github.com/thieu1995/metacluster/blob/master/ChangeLog.md
* Official chat group: https://t.me/+fRVCJGuGJg1mNDg1

* This project is also related to our other optimization and machine learning projects. Check them out here:
    * https://github.com/thieu1995/metaheuristics
    * https://github.com/thieu1995/mealpy
    * https://github.com/thieu1995/mafese
    * https://github.com/thieu1995/pfevaluator
    * https://github.com/thieu1995/opfunu
    * https://github.com/thieu1995/enoppy
    * https://github.com/thieu1995/permetrics
    * https://github.com/thieu1995/IntelELM
    * https://github.com/thieu1995/MetaPerceptron
    * https://github.com/thieu1995/GrafoRVFL
    * https://github.com/aiir-team


### Supported links 

```code
1. https://jtemporal.com/kmeans-and-elbow-method/
2. https://medium.com/@masarudheena/4-best-ways-to-find-optimal-number-of-clusters-for-clustering-with-python-code-706199fa957c
3. https://github.com/minddrummer/gap/blob/master/gap/gap.py
4. https://www.tandfonline.com/doi/pdf/10.1080/03610927408827101
5. https://doi.org/10.1016/j.engappai.2018.03.013
6. https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/Clustering_metrics.ipynb
7. https://elki-project.github.io/
8. https://sci2s.ugr.es/keel/index.php
9. https://archive.ics.uci.edu/datasets
10. https://python-charts.com/distribution/box-plot-plotly/
11. https://plotly.com/python/box-plots/#box-plot-styling-mean--standard-deviation
```
