Question:
How does one convert a Z-score from the Z-distribution (standard normal distribution, Gaussian distribution) to a p-value? I have yet to find the magical function in SciPy's stats module to do this, but one must be there.
Answer 1:
I like the survival function (upper tail probability) of the normal distribution a bit better, because the function name is more informative:
p_values = scipy.stats.norm.sf(abs(z_scores))  # one-sided
p_values = scipy.stats.norm.sf(abs(z_scores)) * 2  # two-sided
The normal distribution "norm" is one of around 90 distributions in scipy.stats.
norm.sf also calls the corresponding function in scipy.special, as in gotgenes' example.
A small advantage of the survival function, sf: numerical precision should be better for quantiles close to 1 than when using the cdf.
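To make the precision point concrete, here is a minimal sketch. The helper name z_to_p and the sample z-scores are made up for illustration; only scipy.stats.norm.sf and scipy.stats.norm.cdf come from SciPy itself.

```python
import numpy as np
from scipy import stats


def z_to_p(z, two_sided=True):
    """Convert z-scores to p-values via the normal survival function.

    Hypothetical helper, not part of SciPy.
    """
    z = np.asarray(z, dtype=float)
    p = stats.norm.sf(np.abs(z))  # upper-tail probability P(Z > |z|)
    return 2 * p if two_sided else p


z_scores = np.array([0.0, 1.96, -2.58, 10.0])  # illustrative values
p_two = z_to_p(z_scores)
p_one = z_to_p(z_scores, two_sided=False)

# Far in the tail, cdf(z) rounds to exactly 1.0 in double precision,
# so 1 - cdf(z) collapses to 0.0 while sf(z) stays a tiny positive number.
naive_tail = 1 - stats.norm.cdf(10.0)
sf_tail = stats.norm.sf(10.0)
print(naive_tail, sf_tail)
```

For moderate z the two approaches agree to many digits; the difference only matters for very small p-values.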
Answer 2:
I think the cumulative distribution function (cdf) is preferable to the survivor function. The survivor function is defined as 1 - cdf, and using it can obscure which direction (tail) a percentile refers to. Also, the percent point function (ppf) is the inverse of the cdf, which is very convenient.
>>> import scipy.stats as st
>>> st.norm.ppf(.95)
1.6448536269514722
>>> st.norm.cdf(1.64)
0.94949741652589625
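The inverse relationship between ppf and cdf can be sketched as follows. The significance level alpha and the variable names are assumptions chosen for illustration; norm.isf is the analogous inverse of the survival function.

```python
from scipy import stats

alpha = 0.05  # illustrative significance level

# Critical z-values: ppf maps a cumulative probability back to a quantile.
z_crit_one_sided = stats.norm.ppf(1 - alpha)
z_crit_two_sided = stats.norm.ppf(1 - alpha / 2)

# Applying cdf to the result of ppf recovers the original probability.
round_trip = stats.norm.cdf(z_crit_one_sided)

# isf inverts sf, so isf(alpha) matches ppf(1 - alpha).
z_from_isf = stats.norm.isf(alpha)

print(z_crit_one_sided, z_crit_two_sided, round_trip, z_from_isf)
```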
Answer 3:
Aha! I found it: scipy.special.ndtr! This also appears to be available as scipy.stats.stats.zprob (which is just a pointer to ndtr).
Specifically, given a one-dimensional numpy.array instance z_scores, one can obtain the p-values as
p_values = 1 - scipy.special.ndtr(z_scores)
or alternatively
p_values = scipy.special.ndtr(-z_scores)
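As a quick check that the two forms agree, here is a small sketch with made-up z-scores. The two-sided line is an addition that follows from the symmetry of the normal distribution, not something stated in this answer.

```python
import numpy as np
from scipy import special, stats

z_scores = np.array([-1.0, 0.0, 1.645, 2.5])  # illustrative values

upper_a = 1 - special.ndtr(z_scores)   # upper tail via 1 - CDF
upper_b = special.ndtr(-z_scores)      # upper tail via symmetry

# Two-sided p-value: twice the tail area beyond |z|.
two_sided = 2 * special.ndtr(-np.abs(z_scores))

# Both forms should match scipy.stats.norm.sf for moderate z.
reference = stats.norm.sf(z_scores)
print(upper_a, upper_b, two_sided, reference)
```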
Answer 4:
From formula:
import numpy as np
import scipy.special as scsp
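def z2p(z):
    """Return the lower-tail probability P(Z <= z) for a z-score (standard normal CDF)."""
    # For an upper-tail (one-sided) p-value, use 1 - z2p(z) or z2p(-z).
    return 0.5 * (1 + scsp.erf(z / np.sqrt(2)))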
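The same formula can be written with only the standard library. This is a sketch: the function names lower_tail, upper_tail and two_sided are hypothetical, and the use of math.erfc for the upper tail is a swapped-in variant that avoids subtracting from 1.

```python
import math


def lower_tail(z):
    """P(Z <= z) for a standard normal Z, from the erf formula."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def upper_tail(z):
    """P(Z > z), using erfc so small tail areas are not lost to rounding."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_sided(z):
    """P(|Z| > |z|), i.e. twice the upper tail beyond |z|."""
    return math.erfc(abs(z) / math.sqrt(2))


print(lower_tail(1.96), upper_tail(1.96), two_sided(1.96))
```

These scalar versions work on single floats only; for arrays, the scipy.special functions are the natural choice.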
Answer 5:
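p_value = scipy.stats.norm.sf(abs(z_score_max))  # one-sided test
p_value = scipy.stats.norm.sf(abs(z_score_max)) * 2  # two-sided test
The survival function (sf) yields the tail probabilities that match the p-values in a z-score table from an intro/AP stats book. Note that the probability density function (pdf) is not suitable here: it returns the height of the density curve at z, not a tail area, so it does not give a p-value.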
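As a quick check on the difference between a density value and a tail probability, the sketch below evaluates both at the same points. The z-values are illustrative only.

```python
from scipy import stats

z = 1.96  # illustrative z-score

density = stats.norm.pdf(z)    # height of the bell curve at z
tail_area = stats.norm.sf(z)   # area under the curve to the right of z

# At z = 0 the contrast is easy to see: the density is 1/sqrt(2*pi),
# while the upper-tail area is exactly one half.
density_at_zero = stats.norm.pdf(0.0)
tail_at_zero = stats.norm.sf(0.0)

print(density, tail_area, density_at_zero, tail_at_zero)
```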