fairscoring.metrics.integral#
Integral bias metrics that measure the differences between cumulative distribution functions.
Classes#
IntegralMetric | Base class for integral metrics that compare cdfs. |
IntegralBiasResult | An extended bias result that also stores groupwise cumulative distribution functions (cdfs). |
WassersteinMetric | A metric that measures the differences between distributions using the Wasserstein Distance [BeDB24]. |
Module Contents#
- class fairscoring.metrics.integral.IntegralMetric(fairness_type, name, score_transform=None)#
Bases: fairscoring.metrics.base.TwoGroupMetric
Base class for integral metrics that compare cdfs.
- Parameters:
fairness_type ({"IND", "EO", "PE"}) – Specifies the type of fairness that is measured. Accepted values are: 1. “IND” (Independence), 2. “EO” (Equal Opportunity), 3. “PE” (Predictive Equality).
name (str) – Name of the Metric
score_transform ({"rescale","quantile",None}) –
A transformation of the scores prior to the bias computation. There are two supported methods:
rescaling (to the interval [0,1]). In this case, the bias() method can take min and max scores.
quantile transformation. This leads to standardized bias measures.
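The two score transformations named above can be illustrated with a minimal standard-library sketch. The helper names `rescale` and `quantile_transform` are hypothetical and are not part of fairscoring; how the library implements these transformations internally is not documented here, so this only shows the general idea.

```python
from bisect import bisect_right


def rescale(scores, min_score=None, max_score=None):
    """Map scores linearly onto [0, 1].

    Assumption: bounds that are not given default to the observed range.
    """
    lo = min(scores) if min_score is None else min_score
    hi = max(scores) if max_score is None else max_score
    if hi == lo:
        raise ValueError("min_score and max_score must differ")
    return [(s - lo) / (hi - lo) for s in scores]


def quantile_transform(scores):
    """Replace each score by its empirical cdf value.

    The value is the fraction of all scores that are <= the given score.
    """
    ordered = sorted(scores)
    n = len(ordered)
    return [bisect_right(ordered, s) / n for s in scores]


scores = [300, 450, 450, 600, 900]
rescaled = rescale(scores, min_score=0, max_score=1000)
quantiles = quantile_transform(scores)
```

After a quantile transformation the scores no longer depend on the original score scale, which is one way to read the statement that it leads to standardized bias measures.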
- bias(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, n_permute=None, seed=None, prefer_high_scores=True)#
Bias computation
- Parameters:
scores (ArrayLike) – A list of scores
target (ArrayLike) – The binary target values. Must have the same length as scores.
attribute (ndarray) – The protected attribute. Must have the same length as scores.
groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.
favorable_target (str or int) – The favorable outcome
min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
n_permute (int, optional) – Number of iterations for the permutation test. Permutation tests are only performed if this value is >0.
prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.
seed (int, optional) – Random seed for the permutation test. Only required if the result needs to be 100% reproducible.
- Returns:
bias – The computed bias (including intermediate results)
- Return type:
- __call__(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, prefer_high_scores=True)#
Bias computation.
This method allows the bias metric to be used as a function.
- Parameters:
scores (ArrayLike) – A list of scores
target (ArrayLike) – The binary target values. Must have the same length as scores.
attribute (ndarray) – The protected attribute. Must have the same length as scores.
groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.
favorable_target (str or int) – The favorable outcome
min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.
- Returns:
bias – The computed bias.
- Return type:
float
Notes
This method offers fewer parameters than bias(), because not all of them affect the pure bias value.
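The n_permute and seed parameters of bias() refer to a permutation test. The following is a generic sketch of such a test and not the library's implementation: the function names, the add-one correction, and the use of a mean gap as the test statistic are all assumptions made for illustration.

```python
import random


def mean_gap(a, b):
    """Illustrative statistic: absolute difference of the group means."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


def permutation_p_value(group_a, group_b, statistic, n_permute=1000, seed=None):
    """Estimate how often random group labels give a statistic at least as large."""
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    observed = statistic(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permute):
        rng.shuffle(pooled)  # reassign group membership at random
        if statistic(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permute + 1)


group_a = [1, 2, 3, 4, 5, 6, 7, 8]
group_b = [x + 100 for x in group_a]
p_value = permutation_p_value(group_a, group_b, mean_gap, n_permute=200, seed=7)
```

Calling the function twice with the same seed returns the same estimate, which mirrors the documented purpose of the seed parameter.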
- class fairscoring.metrics.integral.IntegralBiasResult(bias, pos, neg, cdf_x, cdfs)#
Bases: fairscoring.metrics.base.TwoGroupBiasResult
An extended bias result that also stores groupwise cumulative distribution functions (cdfs).
- Parameters:
bias (float) – The bias value
pos (float) – The positive component of the bias
neg (float) – The negative component of the bias
cdf_x (ArrayLike) – x-values at which the cdfs are stored. This array is 1-dimensional.
cdfs (List of ArrayLike) – A list of cdfs.
- Variables:
bias (float) – The bias value
pos (float) – The positive component of the bias
neg (float) – The negative component of the bias
cdf_x (ArrayLike) – x-values at which the cdfs are stored. This array is 1-dimensional.
cdfs (List of ArrayLike) – A list of cdfs.
- property pos_component#
Proportion of the positive component in the total bias.
- property neg_component#
Proportion of the negative component in the total bias.
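To make the fields of this result concrete, here is a small sketch of how two groupwise cdfs on a shared grid can yield a bias with a positive and a negative part. It is written under stated assumptions and is not the library's code: the function `split_cdf_bias` is hypothetical, the total bias is taken to be pos + neg, and which sign counts as "positive" is chosen arbitrarily.

```python
from bisect import bisect_right


def ecdf(sample, grid):
    """Empirical cdf of `sample`, evaluated at each point of `grid`."""
    ordered = sorted(sample)
    return [bisect_right(ordered, x) / len(ordered) for x in grid]


def split_cdf_bias(scores_a, scores_b):
    """Integrate the signed gap between two cdfs, split by sign."""
    cdf_x = sorted(set(scores_a) | set(scores_b))  # shared 1-dimensional grid
    cdfs = [ecdf(scores_a, cdf_x), ecdf(scores_b, cdf_x)]
    pos = neg = 0.0
    # Both cdfs are step functions, constant on [cdf_x[i], cdf_x[i + 1]).
    for i in range(len(cdf_x) - 1):
        width = cdf_x[i + 1] - cdf_x[i]
        gap = cdfs[0][i] - cdfs[1][i]
        if gap > 0:
            pos += gap * width
        else:
            neg -= gap * width
    return {"bias": pos + neg, "pos": pos, "neg": neg, "cdf_x": cdf_x, "cdfs": cdfs}


result = split_cdf_bias([0.1, 0.4, 0.8], [0.2, 0.5, 0.6])
# Shares of the total bias, analogous to pos_component and neg_component.
pos_share = result["pos"] / result["bias"]
neg_share = result["neg"] / result["bias"]
```

In this toy example the first group's cdf lies above the second group's for low scores and below it for high scores, so the two parts contribute equally to the total.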
- class fairscoring.metrics.integral.WassersteinMetric(fairness_type, name, score_transform=None, p=1)#
Bases:
IntegralMetric
A metric that measures the differences between distributions using the Wasserstein Distance [BeDB24].
This metric can be used to measure independence and separation bias. The fairness_type parameter specifies which bias to measure and hence which distributions are compared.
This metric returns an IntegralBiasResult object, which allows the bias to be split into positive and negative parts. Furthermore, it stores the cumulative distribution functions of the groups.
- Parameters:
fairness_type ({"IND", "EO", "PE"}) – Specifies the type of fairness that is measured. Accepted values are: 1. “IND” (Independence), 2. “EO” (Equal Opportunity), 3. “PE” (Predictive Equality).
name (str) – Name of the Metric
score_transform ({"rescale","quantile",None}) –
A transformation of the scores prior to the bias computation. There are two supported methods:
rescaling (to the interval [0,1]). In this case, the bias() method can take min and max scores.
quantile transformation. This leads to standardized bias measures.
p (float, default=1) – Exponent for the Wasserstein Distance. Use the default of 1 to get the Earth Mover's Distance.
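The role of the exponent p can be illustrated for the special case of two one-dimensional samples of equal size, where the Wasserstein-p distance reduces to matching the sorted values pairwise. This is a sketch of the mathematical definition only: the function `wasserstein_p` is hypothetical, and it makes no claim about how WassersteinMetric computes its value or handles groups of different sizes.

```python
def wasserstein_p(sample_a, sample_b, p=1):
    """Wasserstein-p distance between two equal-size 1-d samples."""
    if len(sample_a) != len(sample_b):
        raise ValueError("this sketch only handles samples of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    # In one dimension the optimal transport plan pairs the sorted values.
    pairs = zip(sorted(sample_a), sorted(sample_b))
    mean_cost = sum(abs(a - b) ** p for a, b in pairs) / len(sample_a)
    return mean_cost ** (1 / p)


w1 = wasserstein_p([0.1, 0.4, 0.8], [0.2, 0.5, 0.6], p=1)  # Earth Mover's Distance
w2 = wasserstein_p([0.1, 0.4, 0.8], [0.2, 0.5, 0.6], p=2)
```

Larger values of p weight large score differences more heavily, so the distance for p=2 is never smaller than the distance for p=1 on the same samples.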
- bias(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, n_permute=None, seed=None, prefer_high_scores=True)#
Bias computation
- Parameters:
scores (ArrayLike) – A list of scores
target (ArrayLike) – The binary target values. Must have the same length as scores.
attribute (ndarray) – The protected attribute. Must have the same length as scores.
groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.
favorable_target (str or int) – The favorable outcome
min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
n_permute (int, optional) – Number of iterations for the permutation test. Permutation tests are only performed if this value is >0.
prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.
seed (int, optional) – Random seed for the permutation test. Only required if the result needs to be 100% reproducible.
- Returns:
bias – The computed bias (including intermediate results)
- Return type:
IntegralBiasResult
- __call__(scores, target, attribute, groups, favorable_target, *, min_score=None, max_score=None, prefer_high_scores=True)#
Bias computation.
This method allows the bias metric to be used as a function.
- Parameters:
scores (ArrayLike) – A list of scores
target (ArrayLike) – The binary target values. Must have the same length as scores.
attribute (ndarray) – The protected attribute. Must have the same length as scores.
groups (list) – A list of groups. Each group is given by a value of the protected attribute. A value of None is used to define a group with all elements that are not in another group.
favorable_target (str or int) – The favorable outcome
min_score (float) – The minimal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
max_score (float) – The maximal score. This might influence the bias computation, e.g. by defining the integral bounds. This is also used for rescaling.
prefer_high_scores (bool, optional) – Specify whether high scores or low scores are favorable.
- Returns:
bias – The computed bias.
- Return type:
float
Notes
This method offers fewer parameters than bias(), because not all of them affect the pure bias value.