Monday, June 23, 2014

Basics of Statistics and Machine Learning Used to Analyze Data (2)

Non-Parametric Test:

Non-parametric statistics covers techniques that do not rely on the data belonging to any particular distribution.

It also covers techniques that do not assume that the structure of a model is fixed.

Mann-Whitney U Test:


The Mann-Whitney U test is a non-parametric test of the null hypothesis that two populations are the same, against the alternative hypothesis that one population tends to have larger values than the other.
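To make the idea concrete, here is a minimal sketch of how the U statistic can be computed by hand from ranks. The helper name mann_whitney_u is hypothetical (it is not part of SciPy), and the sketch only computes the statistic, not a p-value.

```python
import numpy as np
from scipy.stats import rankdata


def mann_whitney_u(x, y):
    """Return (U1, U2) for samples x and y (hypothetical helper, not a SciPy API)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = len(x), len(y)
    # Rank the pooled sample; tied values receive the average of their ranks.
    ranks = rankdata(np.concatenate([x, y]))
    # Sum of the ranks that belong to the first sample.
    r1 = ranks[:n1].sum()
    # U1 counts how often a value from x exceeds a value from y (ties count 1/2).
    u1 = r1 - n1 * (n1 + 1) / 2.0
    # The two statistics always add up to n1 * n2.
    u2 = n1 * n2 - u1
    return u1, u2


print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
```

When every value in x is smaller than every value in y, U1 is 0 and U2 equals n1 * n2, which is the most extreme result possible in that direction.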




scipy.stats.mannwhitneyu

scipy.stats.mannwhitneyu(x, y, use_continuity=True)
Computes the Mann-Whitney rank test on samples x and y.
Parameters:
x, y : array_like
Array of samples, should be one-dimensional.
use_continuity : bool, optional
Whether a continuity correction (1/2.) should be taken into account. Default is True.
Returns:
u : float
The Mann-Whitney U statistic.
prob : float
One-sided p-value assuming an asymptotic normal distribution.
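A short usage sketch, assuming two made-up samples (the data below are illustrative, not real measurements). Note that the signature quoted above is from the SciPy release current when this post was written; newer releases also accept an alternative argument and report a two-sided p-value by default, so check the documentation for your installed version before interpreting prob.

```python
import numpy as np
from scipy import stats

# Two illustrative samples; the second is shifted upward by one unit.
rng = np.random.RandomState(0)
x = rng.normal(loc=0.0, scale=1.0, size=30)
y = rng.normal(loc=1.0, scale=1.0, size=30)

# Positional arguments only, so the call works across SciPy versions.
u, prob = stats.mannwhitneyu(x, y)
print("U = %.1f, p-value = %.4f" % (u, prob))

# Compare the p-value with a chosen significance level (0.05 is an assumption).
alpha = 0.05
if prob < alpha:
    print("Reject the null hypothesis that the two populations are the same.")
else:
    print("Cannot reject the null hypothesis.")
```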

If you need to test whether the data follow a normal distribution, you can use:

scipy.stats.shapiro

scipy.stats.shapiro(x, a=None, reta=False)
Perform the Shapiro-Wilk test for normality.
The Shapiro-Wilk test tests the null hypothesis that the data was drawn from a normal distribution.
Parameters:
x : array_like
Array of sample data.
a : array_like, optional
Array of internal parameters used in the calculation. If these are not given, they will be computed internally. If x has length n, then a must have length n/2.
reta : bool, optional
Whether or not to return the internally computed a values. The default is False.
Returns:
W : float
The test statistic.
p-value : float
The p-value for the hypothesis test.
a : array_like, optional
If reta is True, then these are the internally computed “a” values that may be passed into this function on future calls.
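A minimal sketch of the normality check, assuming two made-up samples: one drawn from a normal distribution and one from an exponential distribution. Only x is passed, because newer SciPy releases no longer accept the a and reta arguments shown in the signature above.

```python
import numpy as np
from scipy import stats

# Illustrative data: one normal sample and one clearly skewed sample.
rng = np.random.RandomState(42)
normal_sample = rng.normal(size=100)
skewed_sample = rng.exponential(size=100)

for name, sample in [("normal", normal_sample), ("skewed", skewed_sample)]:
    # W close to 1 is consistent with normality; a small p-value argues against it.
    w, p_value = stats.shapiro(sample)
    print("%s sample: W = %.4f, p-value = %.4g" % (name, w, p_value))
```

A small p-value means the null hypothesis of normality is rejected, in which case a non-parametric test such as Mann-Whitney U is a reasonable choice for comparing two groups.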
