Calculates the power discrepancy, a family of goodness-of-fit statistics that measure the discrepancy between observed and expected data.
This family contains several goodness-of-fit tests as special cases; see the description of lambd, the exponent of the power discrepancy. The p-value is based on the asymptotic chi-square distribution of the test statistic.
freeman_tukey: $D(x|\theta) = \sum_j (\sqrt{x_j} - \sqrt{e_j})^2$
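The special cases can be made concrete with a minimal pure-Python sketch of the Cressie-Read power divergence for a single series of counts. This is not the library's implementation. The helper name `power_divergence_stat` is hypothetical, and rescaling `expected` to the observed total is an assumption inferred from the examples on this page, where `expected` and `10*expected` give identical results.

```python
import math


def power_divergence_stat(observed, expected, lambd):
    """Cressie-Read power divergence for one series (illustrative sketch).

    Assumption: `expected` is rescaled so that it sums to the observed total.
    """
    total = sum(observed)
    scale = total / sum(expected)
    exp_counts = [e * scale for e in expected]
    pairs = list(zip(observed, exp_counts))

    if lambd == 0:
        # Limit lambd -> 0: likelihood-ratio (deviance) statistic.
        # Cells with a zero observed count contribute nothing.
        return 2.0 * sum(o * math.log(o / e) for o, e in pairs if o > 0)
    if lambd == -1:
        # Limit lambd -> -1: modified likelihood ratio.
        # Requires strictly positive observed counts.
        return 2.0 * sum(e * math.log(e / o) for o, e in pairs)
    # General case: 2 / (lambd * (lambd + 1)) * sum o * ((o / e)**lambd - 1)
    factor = 2.0 / (lambd * (lambd + 1.0))
    return factor * sum(o * ((o / e) ** lambd - 1.0) for o, e in pairs)


observed = [2.0, 4.0, 2.0, 1.0, 1.0]
expected = [0.2, 0.2, 0.2, 0.2, 0.2]

pearson = power_divergence_stat(observed, expected, 1)          # Pearson chi-square
deviance = power_divergence_stat(observed, expected, 0)         # likelihood ratio
freeman_tukey = power_divergence_stat(observed, expected, -0.5)
cressie_read = power_divergence_stat(observed, expected, 2 / 3.0)
```

With these inputs the sketch reproduces the statistics in the examples on this page: 3.0 for `lambd=1`, 2.77258872 for `lambd=0`, and 2.89714546 for `lambd=2/3.0`. Under this convention `lambd=-0.5` equals four times the Freeman-Tukey sum of squared root differences, which matches the 2.745166 reported for `lambd='freeman_tukey'`.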
| Parameters: | o : Iterable<br>e : Iterable<br>lambd : float or string<br>axis : int<br>ddof : int |
|---|---|
| Returns: | D_obs : discrepancy of observed values<br>pvalue : p-value |
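How `pvalue` relates to `D_obs` can be sketched as follows. The degrees of freedom are assumed to be the number of cells minus one minus `ddof`, which is consistent with the five-cell examples on this page but is not stated in the text. The helper names `chi2_sf_even_df` and `pvalue_from_stat` are hypothetical, and the closed-form survival function used here only covers even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Chi-square survival function P(X > x), closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only supports positive even df")
    half = x / 2.0
    # sf(x; 2m) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)**k / k!
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def pvalue_from_stat(stat, n_cells, ddof=0):
    """Asymptotic p-value, assuming df = n_cells - 1 - ddof."""
    return chi2_sf_even_df(stat, n_cells - 1 - ddof)


p_pearson = pvalue_from_stat(3.0, n_cells=5)          # statistic for lambd=1
p_deviance = pvalue_from_stat(2.77258872, n_cells=5)  # statistic for lambd=0
```

For arbitrary degrees of freedom, `scipy.stats.chi2.sf(stat, df)` is the usual way to get the same tail probability.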
Examples
>>> import numpy as np
>>> observed = np.array([ 2., 4., 2., 1., 1.])
>>> expected = np.array([ 0.2, 0.2, 0.2, 0.2, 0.2])
Checking that dimensions are handled correctly with multiple series:
>>> powerdiscrepancy(np.column_stack((observed,observed)).T, 10*expected, lambd='freeman_tukey',axis=1)
(array([[ 2.745166, 2.745166]]), array([[ 0.6013346, 0.6013346]]))
>>> powerdiscrepancy(np.column_stack((observed,observed)).T, 10*expected,axis=1)
(array([[ 2.77258872, 2.77258872]]), array([[ 0.59657359, 0.59657359]]))
>>> powerdiscrepancy(np.column_stack((observed,observed)).T, 10*expected, lambd=0,axis=1)
(array([[ 2.77258872, 2.77258872]]), array([[ 0.59657359, 0.59657359]]))
>>> powerdiscrepancy(np.column_stack((observed,observed)).T, 10*expected, lambd=1,axis=1)
(array([[ 3., 3.]]), array([[ 0.5578254, 0.5578254]]))
>>> powerdiscrepancy(np.column_stack((observed,observed)).T, 10*expected, lambd=2/3.0,axis=1)
(array([[ 2.89714546, 2.89714546]]), array([[ 0.57518277, 0.57518277]]))
>>> powerdiscrepancy(np.column_stack((observed,observed)).T, expected, lambd=2/3.0,axis=1)
(array([[ 2.89714546, 2.89714546]]), array([[ 0.57518277, 0.57518277]]))
>>> powerdiscrepancy(np.column_stack((observed,observed)), expected, lambd=2/3.0, axis=0)
(array([[ 2.89714546, 2.89714546]]), array([[ 0.57518277, 0.57518277]]))
Each random variable can have a different total count (sum):
>>> powerdiscrepancy(np.column_stack((observed,2*observed)), expected, lambd=2/3.0, axis=0)
(array([[ 2.89714546, 5.79429093]]), array([[ 0.57518277, 0.21504648]]))
>>> powerdiscrepancy(np.column_stack((2*observed,2*observed)), expected, lambd=2/3.0, axis=0)
(array([[ 5.79429093, 5.79429093]]), array([[ 0.21504648, 0.21504648]]))
>>> powerdiscrepancy(np.column_stack((2*observed,2*observed)), 20*expected, lambd=2/3.0, axis=0)
(array([[ 5.79429093, 5.79429093]]), array([[ 0.21504648, 0.21504648]]))
>>> powerdiscrepancy(np.column_stack((observed,2*observed)), np.column_stack((10*expected,20*expected)), lambd=2/3.0, axis=0)
(array([[ 2.89714546, 5.79429093]]), array([[ 0.57518277, 0.21504648]]))
>>> powerdiscrepancy(np.column_stack((observed,2*observed)), np.column_stack((10*expected,20*expected)), lambd=-1, axis=0)
(array([[ 2.77258872, 5.54517744]]), array([[ 0.59657359, 0.2357868 ]]))