
Module stats


Created on Tue Jul 1 16:30:38 2014

Copyright (c) 2013-2014, CEA/DSV/I2BM/Neurospin. All rights reserved.


Author: Tommy Löfstedt

License: BSD 3-clause.

Functions

multivariate_normal(mu, Sigma, n=1)
    Generates n random vectors from the multivariate normal distribution with mean mu and covariance matrix Sigma.

sensitivity(cond, test)
    A test's ability to identify a condition correctly.

specificity(cond, test)
    A test's ability to exclude a condition correctly.

ppv(cond, test)
    A test's ability to correctly identify positive outcomes.

precision(*args, **kwargs)
    Deprecated; use ppv instead.

npv(cond, test)
    A test's ability to correctly identify negative outcomes.

accuracy(cond, test)
    The degree of correctly estimated outcomes.

F_score(cond, test)
    A measure of a test's accuracy: the harmonic mean of the precision and sensitivity.

alpha(cond, test)
    False positive rate or type I error.

beta(cond, test)
    False negative rate or type II error.

power(cond, test)
    Statistical power for a test.

likelihood_ratio_positive(cond, test)
    Assesses the value of performing a diagnostic test for the positive outcome.

likelihood_ratio_negative(cond, test)
    Assesses the value of performing a diagnostic test for the negative outcome.

fleiss_kappa(W, k)
    Computes Fleiss' kappa for a set of variables classified into k categories by a number of different raters.
Variables
  __package__ = 'parsimony.utils'
Function Details

multivariate_normal(mu, Sigma, n=1)

Generates n random vectors from the multivariate normal distribution
with mean mu and covariance matrix Sigma.

This function is faster (roughly 11 times faster for a 600-by-4000 matrix
on my computer) than numpy.random.multivariate_normal. This method differs
from numpy's function in that it uses the Cholesky factorisation. Note
that this requires the covariance matrix to be positive definite, as
opposed to positive semi-definite in the numpy case.

See details at: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Drawing_values_from_the_distribution
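The Cholesky approach described above can be sketched in a few lines of
plain NumPy. This is an illustrative sketch, not the library's exact
implementation; the helper name mvn_cholesky is made up here.

```python
import numpy as np

def mvn_cholesky(mu, Sigma, n=1):
    """Draw n samples from N(mu, Sigma) via the Cholesky factorisation.

    Illustrative sketch only; requires Sigma to be positive definite.
    """
    p = Sigma.shape[0]
    L = np.linalg.cholesky(Sigma)      # Sigma = L L^T; fails if not positive definite
    Z = np.random.randn(n, p)          # n rows of independent standard normals
    return mu.reshape(1, p) + Z @ L.T  # each row is mu + L z, a sample from N(mu, Sigma)
```

Each row of the result is one sample, matching the (n, p) layout used in
the example below.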

Parameters
----------
mu : Numpy array, shape (p, 1). The mean vector.

Sigma : Numpy array, shape (p, p). The covariance matrix.

n : Integer. The number of multivariate normal vectors to generate.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> n, p = 50000, 100
>>> mu = np.random.rand(p, 1)
>>> alpha = 0.01
>>> Sigma = alpha * np.random.rand(p, p) + (1 - alpha) * np.eye(p, p)
>>> M = stats.multivariate_normal(mu, Sigma, n)
>>> mean = np.mean(M, axis=0)
>>> S = np.dot((M - mean).T, (M - mean)) * (1.0 / float(n - 1))
>>> round(np.linalg.norm(Sigma - S), 14)
0.51886218849785

sensitivity(cond, test)

A test's ability to identify a condition correctly.

Also called true positive rate or recall.

Parameters
----------
cond : Numpy array, boolean or 0/1 integer. The "true", known condition.

test : Numpy array, boolean or 0/1 integer. The estimated outcome.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> p = 2030
>>> cond = np.zeros((p, 1))
>>> test = np.zeros((p, 1))
>>> stats.sensitivity(cond, test)
1.0
>>> stats.sensitivity(cond, np.logical_not(test))
1.0
>>> cond[:30] = 1.0
>>> test[:30] = 1.0
>>> test[:10] = 0.0
>>> test[-180:] = 1.0
>>> round(stats.sensitivity(cond, test), 2)
0.67
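In confusion-matrix terms, sensitivity is TP / (TP + FN). A plain-NumPy
sketch follows; the helper name and the convention of returning 1.0 when
cond contains no positives are assumptions, chosen to match the doctest
above.

```python
import numpy as np

def sensitivity_sketch(cond, test):
    """True positive rate: TP / (TP + FN)."""
    cond = np.asarray(cond, dtype=bool)
    test = np.asarray(test, dtype=bool)
    TP = np.sum(cond & test)   # condition present and detected
    FN = np.sum(cond & ~test)  # condition present but missed
    if TP + FN == 0:
        return 1.0  # no positives in cond: recall is vacuously perfect
    return float(TP) / float(TP + FN)
```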

specificity(cond, test)

A test's ability to exclude a condition correctly.

Also called true negative rate.

Parameters
----------
cond : Numpy array, boolean or 0/1 integer. The "true", known condition.

test : Numpy array, boolean or 0/1 integer. The estimated outcome.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> p = 2030
>>> cond = np.zeros((p, 1))
>>> test = np.zeros((p, 1))
>>> stats.specificity(cond, test)
1.0
>>> stats.specificity(cond, np.logical_not(test))
0.0
>>> cond[:30] = 1.0
>>> test[:30] = 1.0
>>> test[:10] = 0.0
>>> test[-180:] = 1.0
>>> round(stats.specificity(cond, test), 2)
0.91

ppv(cond, test)

A test's ability to correctly identify positive outcomes.

Short for positive predictive value. Also called precision.

Parameters
----------
cond : Numpy array, boolean or 0/1 integer. The "true", known condition.

test : Numpy array, boolean or 0/1 integer. The estimated outcome.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> p = 2030
>>> cond = np.zeros((p, 1))
>>> test = np.zeros((p, 1))
>>> stats.ppv(cond, test)
1.0
>>> stats.ppv(cond, np.logical_not(test))
0.0
>>> cond[:30] = 1.0
>>> test[:30] = 1.0
>>> test[:10] = 0.0
>>> test[-180:] = 1.0
>>> round(stats.ppv(cond, test), 2)
0.1

precision(*args, **kwargs)

Decorators:
  • @deprecated("ppv")
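The @deprecated decorator itself is not documented on this page. A
minimal sketch of how such a decorator typically works is shown below;
this is a hypothetical implementation, not the library's own, and the
precision body here is a stand-in.

```python
import functools
import warnings

def deprecated(replacement):
    """Mark a function as deprecated in favour of `replacement`."""
    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            warnings.warn("%s is deprecated; use %s instead."
                          % (f.__name__, replacement),
                          DeprecationWarning, stacklevel=2)
            return f(*args, **kwargs)
        return wrapper
    return decorator

@deprecated("ppv")
def precision(*args, **kwargs):
    # In the real module this delegates to ppv.
    return "delegated"
```

Calling the decorated function emits a DeprecationWarning and then runs
the wrapped function unchanged.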

npv(cond, test)

A test's ability to correctly identify negative outcomes.

The negative predictive value, NPV.

Parameters
----------
cond : Numpy array, boolean or 0/1 integer. The "true", known condition.

test : Numpy array, boolean or 0/1 integer. The estimated outcome.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> p = 2030
>>> cond = np.zeros((p, 1))
>>> test = np.zeros((p, 1))
>>> stats.npv(cond, test)
1.0
>>> stats.npv(cond, np.logical_not(test))
1.0
>>> cond[:30] = 1.0
>>> test[:30] = 1.0
>>> test[:10] = 0.0
>>> test[-180:] = 1.0
>>> round(stats.npv(cond, test), 3)
0.995

accuracy(cond, test)

The degree of correctly estimated outcomes.

Parameters
----------
cond : Numpy array, boolean or 0/1 integer. The "true", known condition.

test : Numpy array, boolean or 0/1 integer. The estimated outcome.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> p = 2030
>>> cond = np.zeros((p, 1))
>>> test = np.zeros((p, 1))
>>> stats.accuracy(cond, test)
1.0
>>> stats.accuracy(cond, np.logical_not(test))
0.0
>>> cond[:30] = 1.0
>>> test[:30] = 1.0
>>> test[:10] = 0.0
>>> test[-180:] = 1.0
>>> round(stats.accuracy(cond, test), 2)
0.91

F_score(cond, test)

A measure of a test's accuracy: the harmonic mean of the precision and
sensitivity.

Also known as the F1 score.

Parameters
----------
cond : Numpy array, boolean or 0/1 integer. The "true", known condition.

test : Numpy array, boolean or 0/1 integer. The estimated outcome.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> p = 2030
>>> cond = np.zeros((p, 1))
>>> test = np.zeros((p, 1))
>>> stats.F_score(cond, test)
1.0
>>> stats.F_score(cond, np.logical_not(test))
0.0
>>> cond[:30] = 1.0
>>> test[:30] = 1.0
>>> test[:10] = 0.0
>>> test[-180:] = 1.0
>>> round(stats.F_score(cond, test), 2)
0.17

alpha(cond, test)

False positive rate or type I error.

Parameters
----------
cond : Numpy array, boolean or 0/1 integer. The "true", known condition.

test : Numpy array, boolean or 0/1 integer. The estimated outcome.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> p = 2030
>>> cond = np.zeros((p, 1))
>>> test = np.zeros((p, 1))
>>> stats.alpha(cond, test)
0.0
>>> stats.alpha(cond, np.logical_not(test))
1.0
>>> cond[:30] = 1.0
>>> test[:30] = 1.0
>>> test[:10] = 0.0
>>> test[-180:] = 1.0
>>> round(stats.alpha(cond, test), 2)
0.09

beta(cond, test)

False negative rate or type II error.

Parameters
----------
cond : Numpy array, boolean or 0/1 integer. The "true", known condition.

test : Numpy array, boolean or 0/1 integer. The estimated outcome.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> p = 2030
>>> cond = np.zeros((p, 1))
>>> test = np.zeros((p, 1))
>>> stats.beta(cond, test)
0.0
>>> stats.beta(cond, np.logical_not(test))
0.0
>>> cond[:30] = 1.0
>>> test[:30] = 1.0
>>> test[:10] = 0.0
>>> test[-180:] = 1.0
>>> round(stats.beta(cond, test), 2)
0.33

power(cond, test)

Statistical power for a test. The probability that it correctly rejects
the null hypothesis when the null hypothesis is false.

Parameters
----------
cond : Numpy array, boolean or 0/1 integer. The "true", known condition.

test : Numpy array, boolean or 0/1 integer. The estimated outcome.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> p = 2030
>>> cond = np.zeros((p, 1))
>>> test = np.zeros((p, 1))
>>> stats.power(cond, test)
1.0
>>> stats.power(cond, np.logical_not(test))
1.0
>>> cond[:30] = 1.0
>>> test[:30] = 1.0
>>> test[:10] = 0.0
>>> test[-180:] = 1.0
>>> round(stats.power(cond, test), 2)
0.67

likelihood_ratio_positive(cond, test)

Assesses the value of performing a diagnostic test for the positive
outcome.

Parameters
----------
cond : Numpy array, boolean or 0/1 integer. The "true", known condition.

test : Numpy array, boolean or 0/1 integer. The estimated outcome.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> p = 2030
>>> cond = np.zeros((p, 1))
>>> test = np.zeros((p, 1))
>>> stats.likelihood_ratio_positive(cond, test)
inf
>>> stats.likelihood_ratio_positive(cond, np.logical_not(test))
1.0
>>> cond[:30] = 1.0
>>> test[:30] = 1.0
>>> test[:10] = 0.0
>>> test[-180:] = 1.0
>>> round(stats.likelihood_ratio_positive(cond, test), 1)
7.4
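The positive likelihood ratio is sensitivity / (1 - specificity), i.e.
the true positive rate divided by the false positive rate. The sketch
below reproduces the values in the example above; the helper name and
the edge-case conventions are assumptions.

```python
import numpy as np

def lr_positive_sketch(cond, test):
    """Positive likelihood ratio: sensitivity / (1 - specificity)."""
    cond = np.asarray(cond, dtype=bool)
    test = np.asarray(test, dtype=bool)
    TP = np.sum(cond & test)
    FN = np.sum(cond & ~test)
    FP = np.sum(~cond & test)
    TN = np.sum(~cond & ~test)
    sens = float(TP) / float(TP + FN) if TP + FN > 0 else 1.0
    fpr = float(FP) / float(FP + TN) if FP + TN > 0 else 0.0
    return sens / fpr if fpr > 0.0 else float("inf")
```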

likelihood_ratio_negative(cond, test)

Assesses the value of performing a diagnostic test for the negative
outcome.

Parameters
----------
cond : Numpy array, boolean or 0/1 integer. The "true", known condition.

test : Numpy array, boolean or 0/1 integer. The estimated outcome.

Example
-------
>>> import parsimony.utils.stats as stats
>>> import numpy as np
>>> np.random.seed(42)
>>> p = 2030
>>> cond = np.zeros((p, 1))
>>> test = np.zeros((p, 1))
>>> stats.likelihood_ratio_negative(cond, test)
0.0
>>> stats.likelihood_ratio_negative(cond, np.logical_not(test))
inf
>>> cond[:30] = 1.0
>>> test[:30] = 1.0
>>> test[:10] = 0.0
>>> test[-180:] = 1.0
>>> round(stats.likelihood_ratio_negative(cond, test), 2)
0.37

fleiss_kappa(W, k)

Computes Fleiss' kappa for a set of variables classified into k
categories by a number of different raters.

Parameters
----------
W : Numpy array, shape (variables, raters). Each entry is a category
    label in 0, ..., k - 1.

k : Integer. The number of categories.
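For illustration, Fleiss' kappa can be computed from the standard
formula as follows. This is an independent sketch built from the
textbook definition, not necessarily identical to this module's
implementation.

```python
import numpy as np

def fleiss_kappa_sketch(W, k):
    """Fleiss' kappa from the standard formula.

    W : (N, R) integer array; W[i, j] is the category (0, ..., k - 1)
        that rater j assigned to variable i.
    """
    W = np.asarray(W, dtype=int)
    N, R = W.shape
    # counts[i, c]: number of raters assigning variable i to category c
    counts = np.zeros((N, k))
    for c in range(k):
        counts[:, c] = np.sum(W == c, axis=1)
    p_c = np.sum(counts, axis=0) / float(N * R)   # overall category proportions
    P_i = (np.sum(counts * (counts - 1), axis=1)
           / float(R * (R - 1)))                  # per-variable agreement
    P_bar = np.mean(P_i)                          # mean observed agreement
    P_e = np.sum(p_c ** 2)                        # agreement expected by chance
    return (P_bar - P_e) / (1.0 - P_e)
```

Perfect agreement among raters yields kappa = 1; agreement at chance
level yields 0.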