Evaluation
To use the functions described below, import the evaluation module:
from maddlib import evaluation
Compute MADD
The evaluation.MADD() function, which computes MADD, depends on two other functions described below: evaluation.separate_pred_proba() and evaluation.normalized_density_vector().
- evaluation.MADD(h, X_test=None, pred_proba=None, sf=None, pred_proba_sf0=None, pred_proba_sf1=None, min_nb_points=50)
Compute MADD.
- Parameters:
h (float or str) – Bandwidth parameter (either a float \(\in \left]0, 1\right[\) or 'auto' for an automatic computation of the optimal bandwidth).
X_test (pandas.DataFrame or None) – Optional test set (without labels) on which to evaluate MADD. If X_test is given, pred_proba and sf are also expected to be given.
pred_proba (numpy.ndarray of shape (n, 1) or None) – Optional predicted probabilities (associated with the test set X_test) on which to evaluate MADD. If pred_proba is given, X_test and sf are also expected to be given.
sf (str or None) – Optional sensitive feature name (from the test set X_test) with which to evaluate MADD. If sf is given, X_test and pred_proba are also expected to be given.
pred_proba_sf0 (numpy.ndarray of shape (n, 1) or None) – Optional predicted probabilities of group 0 with which to evaluate MADD. If pred_proba_sf0 is given, pred_proba_sf1 is also expected to be given.
pred_proba_sf1 (numpy.ndarray of shape (n, 1) or None) – Optional predicted probabilities of group 1 with which to evaluate MADD. If pred_proba_sf1 is given, pred_proba_sf0 is also expected to be given.
min_nb_points (int or None) – Optional minimum number of points to consider in the bandwidth interval.
- Returns:
MADD result
- Return type:
float
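To make the computation more concrete, here is a minimal sketch that does not call maddlib. It assumes that MADD is the sum of absolute differences between the two groups' normalized density vectors, and that each density vector is a histogram of predicted probabilities over [0, 1] with bins of width h. The function name madd_sketch and the binning rule are illustrative assumptions, not the library's implementation, and the 'auto' bandwidth search is not covered.

```python
import numpy as np


def madd_sketch(pred_proba_sf0, pred_proba_sf1, h):
    """Illustrative MADD-like distance between two groups (not maddlib's code)."""
    # Assumption: bins of width h over [0, 1]; the library may bin differently.
    n_bins = int(round(1 / h))

    # Histogram of each group's predicted probabilities.
    counts0, _ = np.histogram(np.ravel(pred_proba_sf0), bins=n_bins, range=(0.0, 1.0))
    counts1, _ = np.histogram(np.ravel(pred_proba_sf1), bins=n_bins, range=(0.0, 1.0))

    # Normalize so that each density vector sums to 1.
    d0 = counts0 / counts0.sum()
    d1 = counts1 / counts1.sum()

    # Sum of absolute bin-wise differences: 0 for identical distributions,
    # 2 when the two groups share no bin.
    return float(np.abs(d0 - d1).sum())
```

Judging from the signature above, the corresponding library call would take either X_test, pred_proba and sf together, or pred_proba_sf0 and pred_proba_sf1 together, in addition to h.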
- evaluation.separate_pred_proba(X, pred_proba, sf)
Return the predicted probabilities separated according to the sensitive feature.
- Parameters:
X (pandas.DataFrame) – The feature set.
pred_proba (numpy.ndarray of shape (n, 1)) – The predicted probabilities (of positive predictions).
sf (str) – Sensitive feature name included in the feature set X.
- Returns:
The pair of separated predicted probabilities (pred_proba_sf0, pred_proba_sf1)
- Return type:
pair of numpy.ndarray
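The separation step can be pictured with plain NumPy boolean masks. This sketch assumes the sensitive feature is binary and encoded as 0 and 1, and it takes the feature values directly as an array instead of reading a column from a DataFrame. The name separate_sketch is hypothetical and the code is not the library's implementation.

```python
import numpy as np


def separate_sketch(sf_values, pred_proba):
    """Split predicted probabilities by a 0/1 sensitive feature (illustrative only)."""
    sf_values = np.asarray(sf_values)
    pred_proba = np.asarray(pred_proba)

    # Boolean masks select the rows belonging to each group;
    # an (n, 1) input yields (k, 1) outputs.
    pred_proba_sf0 = pred_proba[sf_values == 0]
    pred_proba_sf1 = pred_proba[sf_values == 1]
    return pred_proba_sf0, pred_proba_sf1
```

With a DataFrame X and a column name sf, the array passed as sf_values would presumably be the values of X[sf].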
- evaluation.normalized_density_vector(pred_proba_sfi, e)
Compute the density vector for a group (\(D_{G_0}\) or \(D_{G_1}\)).
- Parameters:
pred_proba_sfi (numpy.ndarray of shape (n, 1)) – The predicted probabilities (of positive predictions) for one group.
e (float) – Bandwidth parameter.
- Returns:
The density vector
- Return type:
numpy.ndarray
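As an illustration of what a normalized density vector can look like, the sketch below builds a histogram of one group's predicted probabilities over [0, 1] and divides it by the number of points. The bin count round(1 / e) and the name density_vector_sketch are assumptions made for this example; the library's exact binning may differ.

```python
import numpy as np


def density_vector_sketch(pred_proba_sfi, e):
    """Normalized histogram of one group's probabilities (illustrative only)."""
    values = np.ravel(pred_proba_sfi)
    if values.size == 0:
        raise ValueError("at least one predicted probability is required")

    # Assumption: equal-width bins of width e covering [0, 1].
    n_bins = int(round(1 / e))
    counts, _ = np.histogram(values, bins=n_bins, range=(0.0, 1.0))

    # Dividing by the total makes the vector sum to 1, so groups of
    # different sizes become comparable.
    return counts / counts.sum()
```

Because each vector sums to 1, two groups of different sizes can be compared bin by bin.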
Display MADD results
To plot the MADD results, use the evaluation.madd_plot() function:
- evaluation.madd_plot(h, pred_proba_sf0, pred_proba_sf1, legend_groups, title, figsize=(12, 4))
Return a plot of a visual approximation of the resulting MADD for graphical analysis.
- Parameters:
h (float or str) – Bandwidth parameter (either a float \(\in \left]0, 1\right[\) or 'auto' for an automatic computation of the optimal bandwidth).
pred_proba_sf0 (numpy.ndarray of shape (n, 1)) – The predicted probabilities of group 0 with which to evaluate MADD.
pred_proba_sf1 (numpy.ndarray of shape (n, 1)) – The predicted probabilities of group 1 with which to evaluate MADD.
legend_groups (str or 2-tuple) – The name of the sensitive feature or the names of the two groups in a 2-tuple.
title (str) – The title of the graph (it could be the name of the model that outputs the predicted probabilities).
- Returns:
Plot
- Return type:
matplotlib.figure.Figure