fairscoring.metrics.integral

Integral bias metrics that measure the differences between cumulative distribution functions (cdfs).

Classes#

`IntegralMetric`	Base Class for Integral Metrics that compare cdfs.
`IntegralBiasResult`	An extended bias result that also stores groupwise cumulative distribution functions (cdfs).
`WassersteinMetric`	A metric that measures the differences between distributions using the Wasserstein Distance [BeDB24].

Module Contents#

class fairscoring.metrics.integral.IntegralMetric(fairness_type, name, score_transform=None)#

Bases: fairscoring.metrics.base.TwoGroupMetric

Base Class for Integral Metrics that compare cdfs.

Parameters:

fairness_type ({"IND", "EO", "PE"}) – Specifies the type of fairness that is measured. Accepted values are “IND” (Independence), “EO” (Equal Opportunity), and “PE” (Predictive Equality).
name (str) – Name of the metric.
score_transform ({"rescale","quantile",None}) –
A transformation applied to the scores before the bias computation. Two methods are supported:
- "rescale": rescales the scores to the interval [0,1]. In this case, the bias() method can take min and max scores.
- "quantile": applies a quantile transformation. This leads to standardized bias measures.
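
To make the two transformations concrete, here is a minimal pure-Python sketch. The helper names `rescale` and `quantile_transform` are hypothetical and the code is not taken from fairscoring; in particular, the assumption that the quantile transformation maps each score to its empirical cdf value in the pooled sample is ours.

```python
from bisect import bisect_right


def rescale(scores, min_score=None, max_score=None):
    """Linearly map scores to [0, 1]; bounds default to the observed range."""
    lo = min(scores) if min_score is None else min_score
    hi = max(scores) if max_score is None else max_score
    if hi <= lo:
        raise ValueError("max_score must be greater than min_score")
    return [(s - lo) / (hi - lo) for s in scores]


def quantile_transform(scores):
    """Replace each score by its empirical cdf value (assumed definition)."""
    ordered = sorted(scores)
    n = len(ordered)
    # bisect_right counts how many scores are <= s.
    return [bisect_right(ordered, s) / n for s in scores]


scores = [300, 450, 600, 750]
print(rescale(scores, min_score=0, max_score=1000))  # [0.3, 0.45, 0.6, 0.75]
print(quantile_transform(scores))                     # [0.25, 0.5, 0.75, 1.0]
```

After rescaling, the bias is expressed on the fixed interval [0,1]; after the quantile transformation it no longer depends on the original score scale at all, which is one way to read "standardized" here.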

bias(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, n_permute=None, seed=None, prefer_high_scores=True)#

Bias computation

Parameters:

scores (ArrayLike) – A list of scores
target (ArrayLike) – The binary target values. Must have the same length as scores.
attribute (ndarray) – The protected attribute. Must have the same length as scores.
groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.
favorable_target (str or int) – The favorable outcome
min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
n_permute (int, optional) – Number of iterations for the permutation test. Permutation tests are only performed if this value is >0.
prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.
seed (int, optional) – Random seed for the permutation test. Only required if the result needs to be exactly reproducible.

Returns:

bias – The computed bias (including intermediate results)

Return type:

BiasResult
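
The roles of `n_permute` and `seed` can be illustrated with a generic permutation test. This is a sketch under stated assumptions, not the library's implementation: `mean_gap` and `permutation_p_value` are hypothetical helpers, the toy statistic is a difference of group means rather than an integral metric, and the add-one correction is our choice.

```python
import random


def mean_gap(scores, in_group):
    """Toy bias statistic: absolute difference between the two group means."""
    a = [s for s, g in zip(scores, in_group) if g]
    b = [s for s, g in zip(scores, in_group) if not g]
    return abs(sum(a) / len(a) - sum(b) / len(b))


def permutation_p_value(scores, in_group, statistic, n_permute=None, seed=None):
    """Share of label shufflings whose statistic reaches the observed one."""
    if not n_permute or n_permute <= 0:
        return None  # no test is run unless n_permute > 0
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    observed = statistic(scores, in_group)
    labels = list(in_group)
    hits = 0
    for _ in range(n_permute):
        rng.shuffle(labels)  # shuffling keeps the group sizes unchanged
        if statistic(scores, labels) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permute + 1)


scores = [0.9, 0.8, 0.85, 0.7, 0.2, 0.3, 0.25, 0.1]
in_group = [True] * 4 + [False] * 4
print(permutation_p_value(scores, in_group, mean_gap, n_permute=200, seed=0))
```

Calling the function twice with the same `seed` yields the same p-value; without a seed, repeated calls may differ slightly.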

__call__(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, prefer_high_scores=True)#

Bias computation.

This method allows the bias metric to be used as a function.

Parameters:

scores (ArrayLike) – A list of scores
target (ArrayLike) – The binary target values. Must have the same length as scores.
attribute (ndarray) – The protected attribute. Must have the same length as scores.
groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.
favorable_target (str or int) – The favorable outcome
min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.
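
The meaning of `None` in `groups` can be pictured with a small helper that builds one boolean mask per group. `group_masks` is a hypothetical name used only for illustration; how fairscoring resolves groups internally is not shown in this reference.

```python
def group_masks(attribute, groups):
    """Return one boolean mask per group.

    A group given as None collects every element whose attribute value
    is not named by any other group.
    """
    named = {g for g in groups if g is not None}
    return [
        [a not in named for a in attribute] if g is None
        else [a == g for a in attribute]
        for g in groups
    ]


attribute = ["f", "m", "x", "f", "m"]
masks = group_masks(attribute, ["f", None])
print(masks[0])  # [True, False, False, True, False]
print(masks[1])  # [False, True, True, False, True]
```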

Returns:

bias – The computed bias.

Return type:

float

Notes

This method takes fewer parameters than bias() because not all of them affect the bias value itself.

class fairscoring.metrics.integral.IntegralBiasResult(bias, pos, neg, cdf_x, cdfs)#

Bases: fairscoring.metrics.base.TwoGroupBiasResult

An extended bias result that also stores groupwise cumulative distribution functions (cdfs).

Parameters:

bias (float) – The bias value
pos (float) – The positive component of the bias
neg (float) – The negative component of the bias
cdf_x (ArrayLike) – The x-values at which the cdfs are stored. This array is one-dimensional.
cdfs (List of ArrayLike) – A list of groupwise cdfs, stored at the x-values in cdf_x.

Variables:

bias (float) – The bias value
pos (float) – The positive component of the bias
neg (float) – The negative component of the bias
cdf_x (ArrayLike) – The x-values at which the cdfs are stored. This array is one-dimensional.
cdfs (List of ArrayLike) – A list of groupwise cdfs, stored at the x-values in cdf_x.

property pos_component#

Proportion of the positive component in the total bias

Returns:

Proportion of the positive component in the total bias

property neg_component#

Proportion of the negative component in the total bias

Returns:

Proportion of the negative component in the total bias
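
As a rough picture of how the two proportions relate to the stored values, consider the stand-in container below. It is hypothetical and not the library's class; it assumes that the total bias equals `pos + neg`, which holds for a p=1 integral metric but may not hold in general.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class IntegralBiasResultSketch:
    """Hypothetical stand-in; not fairscoring's IntegralBiasResult."""

    bias: float
    pos: float
    neg: float
    cdf_x: Sequence[float]
    cdfs: List[Sequence[float]]

    @property
    def pos_component(self) -> float:
        # Assumption: the total bias is pos + neg.
        total = self.pos + self.neg
        return self.pos / total if total else 0.0

    @property
    def neg_component(self) -> float:
        total = self.pos + self.neg
        return self.neg / total if total else 0.0


result = IntegralBiasResultSketch(
    bias=0.5, pos=0.375, neg=0.125,
    cdf_x=[0.0, 0.5, 1.0],
    cdfs=[[0.2, 0.7, 1.0], [0.1, 0.4, 1.0]],
)
print(result.pos_component, result.neg_component)  # 0.75 0.25
```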

class fairscoring.metrics.integral.WassersteinMetric(fairness_type, name, score_transform=None, p=1)#

Bases: IntegralMetric

A metric that measures the differences between distributions using the Wasserstein Distance [BeDB24].

This metric can be used to measure independence and separation bias. The fairness_type parameter specifies which bias to measure and hence which distributions are compared.

This metric returns an IntegralBiasResult object, which allows the bias to be split into positive and negative parts. Furthermore, it stores the cumulative distribution functions of the groups.
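
For orientation, the standard one-dimensional form of the Wasserstein distance is written out below. The symbols $F_a$ and $F_b$ (the cdfs of the two groups) are chosen here for illustration. Which of the two one-sided integrals the library reports as the positive part and which as the negative part is an assumption of this sketch.

```latex
\begin{align}
W_p(F_a, F_b) &= \left( \int_0^1 \left| F_a^{-1}(u) - F_b^{-1}(u) \right|^p \, du \right)^{1/p} \\
W_1(F_a, F_b) &= \int_{-\infty}^{\infty} \left| F_a(x) - F_b(x) \right| \, dx \\
              &= \underbrace{\int \max\{F_a(x) - F_b(x),\, 0\} \, dx}_{\text{positive part (assumed)}}
               + \underbrace{\int \max\{F_b(x) - F_a(x),\, 0\} \, dx}_{\text{negative part (assumed)}}
\end{align}
```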

Parameters:

fairness_type ({"IND", "EO", "PE"}) – Specifies the type of fairness that is measured. Accepted values are “IND” (Independence), “EO” (Equal Opportunity), and “PE” (Predictive Equality).
name (str) – Name of the metric.
score_transform ({"rescale","quantile",None}) –
A transformation applied to the scores before the bias computation. Two methods are supported:
- "rescale": rescales the scores to the interval [0,1]. In this case, the bias() method can take min and max scores.
- "quantile": applies a quantile transformation. This leads to standardized bias measures.
p (float, default=1) – Exponent for the Wasserstein Distance. The default of 1 yields the Earth Mover's Distance.
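
The effect of the exponent `p` can be explored with a small pure-Python sketch that computes the distance between two empirical score samples. Both functions are hypothetical illustrations and not fairscoring code; the sign convention in `signed_cdf_gap` (the first sample's cdf lying above the second counts as positive) is an assumption.

```python
def wasserstein_p(a, b, p=1):
    """Wasserstein-p distance between two 1-D empirical distributions."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # Quantile levels at which either empirical quantile function jumps.
    levels = sorted({i / n for i in range(n + 1)} | {j / m for j in range(m + 1)})
    total = 0.0
    for lo, hi in zip(levels, levels[1:]):
        mid = (lo + hi) / 2  # both quantile functions are constant on (lo, hi)
        qa = a[min(int(mid * n), n - 1)]
        qb = b[min(int(mid * m), m - 1)]
        total += (hi - lo) * abs(qa - qb) ** p
    return total ** (1 / p)


def signed_cdf_gap(a, b):
    """Split the p=1 distance into the areas where each cdf lies above the other."""
    xs = sorted(set(a) | set(b))

    def cdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    pos = neg = 0.0
    for left, right in zip(xs, xs[1:]):
        gap = cdf(a, left) - cdf(b, left)  # cdfs are constant on [left, right)
        if gap > 0:
            pos += gap * (right - left)
        else:
            neg += -gap * (right - left)
    return pos, neg


group_a, group_b = [0, 3], [1, 2]
print(round(wasserstein_p(group_a, group_b, p=1), 6))  # 1.0
print(signed_cdf_gap(group_a, group_b))                # (0.5, 0.5)
```

In this example the two cdfs cross, so the total distance of 1.0 splits evenly into both parts. For p=1 the two functions agree in the sense that `pos + neg` equals the distance.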

bias(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, n_permute=None, seed=None, prefer_high_scores=True)#

Bias computation

Parameters:

scores (ArrayLike) – A list of scores
target (ArrayLike) – The binary target values. Must have the same length as scores.
attribute (ndarray) – The protected attribute. Must have the same length as scores.
groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.
favorable_target (str or int) – The favorable outcome
min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
n_permute (int, optional) – Number of iterations for the permutation test. Permutation tests are only performed if this value is >0.
prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.
seed (int, optional) – Random seed for the permutation test. Only required if the result needs to be exactly reproducible.

Returns:

bias – The computed bias (including intermediate results)

Return type:

BiasResult

__call__(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, prefer_high_scores=True)#

Bias computation.

This method allows the bias metric to be used as a function.

Parameters:

scores (ArrayLike) – A list of scores
target (ArrayLike) – The binary target values. Must have the same length as scores.
attribute (ndarray) – The protected attribute. Must have the same length as scores.
groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.
favorable_target (str or int) – The favorable outcome
min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.

Returns:

bias – The computed bias.

Return type:

float

Notes

This method takes fewer parameters than bias() because not all of them affect the bias value itself.