fairscoring.metrics.integral#

Integral bias metrics that measure the differences between cumulative distribution functions (cdfs).

Classes#

IntegralMetric

Base Class for Integral Metrics that compare cdfs.

IntegralBiasResult

An extended bias result that also stores groupwise cumulative distribution functions (cdfs).

WassersteinMetric

A metric that measures the differences between distributions using the Wasserstein Distance [BeDB24].

Module Contents#

class fairscoring.metrics.integral.IntegralMetric(fairness_type, name, score_transform=None)#

Bases: fairscoring.metrics.base.TwoGroupMetric

Base Class for Integral Metrics that compare cdfs.

Parameters:
  • fairness_type ({"IND", "EO", "PE"}) – Specifies the type of fairness that is measured. Accepted values are “IND” (Independence), “EO” (Equal Opportunity), and “PE” (Predictive Equality).

  • name (str) – Name of the Metric

  • score_transform ({"rescale","quantile",None}) –

    A transformation of the scores prior to the bias computation. There are two supported methods:

    • rescaling to the interval [0,1]. In this case, the bias() method accepts min_score and max_score.

    • quantile transformation. This leads to standardized bias measures.
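The two transformations can be illustrated with a minimal pure-Python sketch. The helper names `rescale` and `quantile_transform` are hypothetical and not part of fairscoring, and the quantile variant shown here (replacing each score by its empirical cdf value) is one common reading of the term, not necessarily the library's exact implementation.

```python
from bisect import bisect_right


def rescale(scores, min_score=None, max_score=None):
    """Map scores linearly onto [0, 1] (hypothetical helper)."""
    # Fall back to the observed range when no bounds are given.
    lo = min(scores) if min_score is None else min_score
    hi = max(scores) if max_score is None else max_score
    if hi == lo:
        raise ValueError("min_score and max_score must differ")
    return [(s - lo) / (hi - lo) for s in scores]


def quantile_transform(scores):
    """Replace each score by the fraction of scores <= it (hypothetical helper)."""
    ordered = sorted(scores)
    n = len(ordered)
    return [bisect_right(ordered, s) / n for s in scores]


scores = [300, 450, 450, 600, 900]
rescaled = rescale(scores, min_score=300, max_score=900)
quantiles = quantile_transform(scores)
```

After rescaling, distances are expressed as a share of the score range. After the quantile transformation, they are expressed in terms of population shares, which is what makes the resulting measures comparable across differently scaled scores.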

bias(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, n_permute=None, seed=None, prefer_high_scores=True)#

Bias computation

Parameters:
  • scores (ArrayLike) – A list of scores

  • target (ArrayLike) – The binary target values. Must have the same length as scores.

  • attribute (ndarray) – The protected attribute. Must have the same length as scores.

  • groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.

  • favorable_target (str or int) – The favorable outcome

  • min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.

  • max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.

  • n_permute (int, optional) – Number of iterations for the permutation test. Permutation tests are only performed if this value is >0.

  • prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.

  • seed (int, optional) – Random seed for the permutation test. Only required if the results need to be exactly reproducible.

Returns:

bias – The computed bias (including intermediate results)

Return type:

BiasResult
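The role of None in the groups parameter can be sketched as follows. This is an illustration of the documented semantics only; `group_masks` is a hypothetical helper and does not reflect how fairscoring selects groups internally.

```python
def group_masks(attribute, groups):
    """Return one boolean mask per group (hypothetical helper).

    A group given as None collects every element that does not belong
    to one of the explicitly named groups.
    """
    named = [g for g in groups if g is not None]
    masks = []
    for g in groups:
        if g is None:
            masks.append([a not in named for a in attribute])
        else:
            masks.append([a == g for a in attribute])
    return masks


attribute = ["f", "m", "d", "f", "m"]
masks = group_masks(attribute, groups=["f", None])
```

Here the first mask marks the elements with attribute value "f", and the second mask marks everything else ("m" and "d").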

__call__(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, prefer_high_scores=True)#

Bias computation.

This method allows the bias metric to be used as a function.

Parameters:
  • scores (ArrayLike) – A list of scores

  • target (ArrayLike) – The binary target values. Must have the same length as scores.

  • attribute (ndarray) – The protected attribute. Must have the same length as scores.

  • groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.

  • favorable_target (str or int) – The favorable outcome

  • min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.

  • max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.

  • prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.

Returns:

bias – The computed bias.

Return type:

float

Notes

This method offers fewer parameters than bias(), because not all will affect the pure bias value.

class fairscoring.metrics.integral.IntegralBiasResult(bias, pos, neg, cdf_x, cdfs)#

Bases: fairscoring.metrics.base.TwoGroupBiasResult

An extended bias result that also stores groupwise cumulative distribution functions (cdfs).

Parameters:
  • bias (float) – The bias value

  • pos (float) – The positive component of the bias

  • neg (float) – The negative component of the bias

  • cdf_x (ArrayLike) – x-values at which the cdfs are stored. This array is 1-dimensional

  • cdfs (List of ArrayLike) – A list of cdfs.

Variables:
  • bias (float) – The bias value

  • pos (float) – The positive component of the bias

  • neg (float) – The negative component of the bias

  • cdf_x (ArrayLike) – x-values at which the cdfs are stored. This array is 1-dimensional

  • cdfs (List of ArrayLike) – A list of cdfs.

property pos_component#

Proportion of the positive component in the total bias.

property neg_component#

Proportion of the negative component in the total bias.
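A minimal sketch of how such proportions could be derived from the stored components is given below. It assumes that the two components add up to the total, which holds for an absolute-difference integral but is an assumption about this library; `SplitBias` is a hypothetical stand-in, not the IntegralBiasResult class.

```python
from dataclasses import dataclass


@dataclass
class SplitBias:
    """Hypothetical stand-in holding only the two bias components."""

    pos: float
    neg: float

    @property
    def pos_component(self):
        # Share of the positive component; 0.0 if there is no bias at all.
        total = self.pos + self.neg
        return self.pos / total if total > 0 else 0.0

    @property
    def neg_component(self):
        # Share of the negative component; 0.0 if there is no bias at all.
        total = self.pos + self.neg
        return self.neg / total if total > 0 else 0.0


result = SplitBias(pos=0.375, neg=0.125)
```

A positive share close to 1 or 0 indicates that one group's cdf lies almost entirely on one side of the other's, while a share near 0.5 indicates that the cdfs cross.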

class fairscoring.metrics.integral.WassersteinMetric(fairness_type, name, score_transform=None, p=1)#

Bases: IntegralMetric

A metric that measures the differences between distributions using the Wasserstein Distance [BeDB24].

This metric can be used to measure independence and separation bias. The fairness_type parameter specifies which bias to measure and hence which distributions are compared.

This metric returns an IntegralBiasResult object, which allows the bias to be split into positive and negative parts. Furthermore, it stores the cumulative distribution functions of the groups.
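How the fairness type determines which samples enter the comparison can be sketched as follows. The mapping follows the usual definitions (independence uses all samples, equal opportunity only those with the favorable target, predictive equality only those with the unfavorable target) and is an assumption about fairscoring's behaviour; `select_indices` is a hypothetical helper.

```python
def select_indices(target, favorable_target, fairness_type):
    """Return the sample indices whose scores are compared (hypothetical helper)."""
    if fairness_type == "IND":
        # Independence: compare the score distributions of all samples.
        return list(range(len(target)))
    if fairness_type == "EO":
        # Equal opportunity: only samples with the favorable outcome.
        return [i for i, t in enumerate(target) if t == favorable_target]
    if fairness_type == "PE":
        # Predictive equality: only samples with the unfavorable outcome.
        return [i for i, t in enumerate(target) if t != favorable_target]
    raise ValueError(f"unknown fairness_type: {fairness_type!r}")


target = [1, 0, 1, 1, 0]
ind = select_indices(target, favorable_target=1, fairness_type="IND")
eo = select_indices(target, favorable_target=1, fairness_type="EO")
pe = select_indices(target, favorable_target=1, fairness_type="PE")
```

The groupwise cdfs would then be computed on the selected samples only.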

Parameters:
  • fairness_type ({"IND", "EO", "PE"}) – Specifies the type of fairness that is measured. Accepted values are “IND” (Independence), “EO” (Equal Opportunity), and “PE” (Predictive Equality).

  • name (str) – Name of the Metric

  • score_transform ({"rescale","quantile",None}) –

    A transformation of the scores prior to the bias computation. There are two supported methods:

    • rescaling to the interval [0,1]. In this case, the bias() method accepts min_score and max_score.

    • quantile transformation. This leads to standardized bias measures.

  • p (float, default=1) – Exponent of the Wasserstein distance. The default of 1 yields the Earth Mover's Distance.
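For p = 1, the Wasserstein distance between two one-dimensional samples equals the area between their empirical cdfs. The following pure-Python sketch computes that area and splits it by the sign of the cdf difference. The function names are hypothetical, the sign convention (which direction counts as positive) is an assumption, and the sketch ignores score transformations and the min_score/max_score bounds.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical cdf of a sample as a function."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def wasserstein_1_split(sample_a, sample_b):
    """Integrate F_b - F_a over the pooled support; return (total, pos, neg)."""
    cdf_a, cdf_b = ecdf(sample_a), ecdf(sample_b)
    grid = sorted(set(sample_a) | set(sample_b))
    pos = neg = 0.0
    for left, right in zip(grid, grid[1:]):
        # Both empirical cdfs are constant on [left, right).
        diff = cdf_b(left) - cdf_a(left)
        if diff > 0:
            pos += diff * (right - left)
        else:
            neg += -diff * (right - left)
    return pos + neg, pos, neg


shifted = wasserstein_1_split([0.0, 1.0], [1.0, 2.0])
crossing = wasserstein_1_split([0.0, 2.0], [1.0, 1.0])
```

In the first call, one sample is the other shifted by 1, so the whole distance falls into a single component. In the second call, the cdfs cross and the distance is shared equally between the two components.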

bias(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, n_permute=None, seed=None, prefer_high_scores=True)#

Bias computation

Parameters:
  • scores (ArrayLike) – A list of scores

  • target (ArrayLike) – The binary target values. Must have the same length as scores.

  • attribute (ndarray) – The protected attribute. Must have the same length as scores.

  • groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.

  • favorable_target (str or int) – The favorable outcome

  • min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.

  • max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.

  • n_permute (int, optional) – Number of iterations for the permutation test. Permutation tests are only performed if this value is >0.

  • prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.

  • seed (int, optional) – Random seed for the permutation test. Only required if the results need to be exactly reproducible.

Returns:

bias – The computed bias (including intermediate results)

Return type:

BiasResult
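The idea behind the n_permute and seed parameters can be sketched with a generic permutation test. This is not the library's implementation: `permutation_p_value` and `mean_gap` are hypothetical names, the statistic is a simple stand-in for the bias metric, and the add-one correction in the p-value is a common convention that fairscoring may or may not use.

```python
import random


def mean_gap(scores, in_group):
    """Stand-in statistic: absolute difference of the two group means."""
    a = [s for s, g in zip(scores, in_group) if g]
    b = [s for s, g in zip(scores, in_group) if not g]
    return abs(sum(a) / len(a) - sum(b) / len(b))


def permutation_p_value(scores, in_group, statistic, n_permute, seed=None):
    """Estimate how often a random group assignment yields a statistic
    at least as large as the observed one."""
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    observed = statistic(scores, in_group)
    labels = list(in_group)
    hits = 0
    for _ in range(n_permute):
        rng.shuffle(labels)  # break any link between scores and groups
        if statistic(scores, labels) >= observed:
            hits += 1
    return (hits + 1) / (n_permute + 1)


scores = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
in_group = [True, True, True, False, False, False]
p_first = permutation_p_value(scores, in_group, mean_gap, n_permute=200, seed=7)
p_second = permutation_p_value(scores, in_group, mean_gap, n_permute=200, seed=7)
```

Because both calls use the same seed, they shuffle the labels identically and return the same p-value. Without a seed, repeated calls would give slightly different estimates.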

__call__(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, prefer_high_scores=True)#

Bias computation.

This method allows the bias metric to be used as a function.

Parameters:
  • scores (ArrayLike) – A list of scores

  • target (ArrayLike) – The binary target values. Must have the same length as scores.

  • attribute (ndarray) – The protected attribute. Must have the same length as scores.

  • groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.

  • favorable_target (str or int) – The favorable outcome

  • min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.

  • max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.

  • prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.

Returns:

bias – The computed bias.

Return type:

float

Notes

This method offers fewer parameters than bias(), because not all will affect the pure bias value.