

.. _sphx_glr_auto_examples_evaluation_plot_metrics.py:


=======================================
Metrics specific to imbalanced learning
=======================================

Specific metrics have been developed to evaluate classifiers that have
been trained on imbalanced data. `imblearn` mainly provides two
additional metrics that are not implemented in `sklearn`: (i) the
geometric mean and (ii) the index balanced accuracy.



.. code-block:: python


    # Authors: Guillaume Lemaitre <g.lemaitre58@gmail.com>
    # License: MIT

    from sklearn import datasets
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import train_test_split

    from imblearn import over_sampling as os
    from imblearn import pipeline as pl
    from imblearn.metrics import (geometric_mean_score,
                                  make_index_balanced_accuracy)

    print(__doc__)

    RANDOM_STATE = 42

    # Generate a dataset
    X, y = datasets.make_classification(n_classes=3, class_sep=2,
                                        weights=[0.1, 0.9], n_informative=10,
                                        n_redundant=1, flip_y=0, n_features=20,
                                        n_clusters_per_class=4, n_samples=5000,
                                        random_state=RANDOM_STATE)

    pipeline = pl.make_pipeline(os.SMOTE(random_state=RANDOM_STATE),
                                LinearSVC(random_state=RANDOM_STATE))

    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                        random_state=RANDOM_STATE)

    # Train the classifier with balancing
    pipeline.fit(X_train, y_train)

    # Test the classifier and get the prediction
    y_pred_bal = pipeline.predict(X_test)







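For a binary problem, the geometric mean is the square root of the product
of the sensitivity and the specificity. For multi-class problems, it is a
higher root of the product of the per-class sensitivities. Because it
combines these terms, a classifier scores well only if it performs well on
every class, whatever the class proportions in the dataset.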
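
As a minimal sketch of the binary case, the geometric mean can be computed by
hand from the confusion counts. The helper ``binary_geometric_mean`` and the
toy labels below are hypothetical and only illustrate the definition; they
are not the ``imblearn`` implementation, and the sketch assumes that both
classes occur in ``y_true``.

```python
import math


def binary_geometric_mean(y_true, y_pred, positive=1):
    """Square root of sensitivity * specificity (hypothetical helper).

    Assumes both classes occur in y_true, so neither ratio divides by zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return math.sqrt(sensitivity * specificity)


# Toy labels: 2 positive samples and 8 negative samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10                  # 80% accurate, no positive found
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]      # 1 of 2 positives, 6 of 8 negatives

print(binary_geometric_mean(y_true, always_negative))
print(binary_geometric_mean(y_true, mixed))
```

A classifier that always predicts the majority class reaches 80% accuracy on
these toy labels, yet its geometric mean is zero because its sensitivity is
zero.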



.. code-block:: python


    print('The geometric mean is {}'.format(geometric_mean_score(
        y_test,
        y_pred_bal)))





.. rst-class:: sphx-glr-script-out

 Out::

    The geometric mean is 0.9262633940760341


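The index balanced accuracy wraps an existing metric and weights it by the
dominance, i.e. the difference between the sensitivity and the specificity,
so that the metric can be used in imbalanced learning problems. The
parameter `alpha` sets the strength of this weighting, and `squared=True`
squares the metric before it is weighted.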
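
The weighting can be sketched for a binary problem with the commonly stated
formula ``IBA = (1 + alpha * (sensitivity - specificity)) * M``, where ``M``
is the wrapped metric. The function ``index_balanced_accuracy`` and the
numbers below are hypothetical and only illustrate that formula; this is an
assumption-laden sketch, not the ``imblearn`` implementation.

```python
import math


def index_balanced_accuracy(metric_value, sensitivity, specificity,
                            alpha=0.1, squared=True):
    """Weight a metric value by the dominance (hypothetical helper)."""
    dominance = sensitivity - specificity
    if squared:
        # Square the metric before weighting, mirroring squared=True.
        metric_value = metric_value ** 2
    return (1 + alpha * dominance) * metric_value


# Assumed toy values: the classifier favours the negative class.
sensitivity, specificity = 0.5, 0.75
g_mean = math.sqrt(sensitivity * specificity)

iba_small_alpha = index_balanced_accuracy(g_mean, sensitivity, specificity,
                                          alpha=0.1)
iba_large_alpha = index_balanced_accuracy(g_mean, sensitivity, specificity,
                                          alpha=0.5)
print(iba_small_alpha, iba_large_alpha)
```

In this sketch, a larger ``alpha`` penalises the score more strongly when
the specificity exceeds the sensitivity. When the two are equal, the
dominance is zero and the result no longer depends on ``alpha``.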



.. code-block:: python


    # Wrap the geometric mean with the index balanced accuracy for two
    # values of alpha; squared=True squares the metric before weighting.
    for alpha in (0.1, 0.5):
        geo_mean = make_index_balanced_accuracy(alpha=alpha, squared=True)(
            geometric_mean_score)

        print('The IBA using alpha = {} and the geometric mean: {}'.format(
            alpha, geo_mean(y_test, y_pred_bal)))




.. rst-class:: sphx-glr-script-out

 Out::

    The IBA using alpha = 0.1 and the geometric mean: 0.8579638752052544
    The IBA using alpha = 0.5 and the geometric mean: 0.8579638752052544


**Total running time of the script:** ( 0 minutes  0.352 seconds)



.. container:: sphx-glr-footer


  .. container:: sphx-glr-download

     :download:`Download Python source code: plot_metrics.py <plot_metrics.py>`



  .. container:: sphx-glr-download

     :download:`Download Jupyter notebook: plot_metrics.ipynb <plot_metrics.ipynb>`

.. rst-class:: sphx-glr-signature

    `Generated by Sphinx-Gallery <https://sphinx-gallery.readthedocs.io>`_
