RV
class hyppo.independence.RV
Rank Value (RV) test statistic and p-value.
RV is the multivariate generalization of the squared Pearson correlation coefficient [1]. The RV coefficient is closely related to principal component analysis (PCA), canonical correlation analysis (CCA), multivariate regression, and statistical classification [1]. The statistic can be derived as follows [1] [2]:
Let $x$ and $y$ be $(n, p)$ samples of random variables $X$ and $Y$. We can center $x$ and $y$ and then calculate the sample covariance matrix $\hat{\Sigma}_{xy} = x^T y$; the variance matrices $\hat{\Sigma}_{xx}$ and $\hat{\Sigma}_{yy}$ for $x$ and $y$ are defined similarly. Then, the RV test statistic is found by calculating
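$$\mathrm{RV}_n(x, y) = \frac{\mathrm{tr}\left(\hat{\Sigma}_{xy} \hat{\Sigma}_{yx}\right)}{\sqrt{\mathrm{tr}\left(\hat{\Sigma}_{xx}^2\right) \mathrm{tr}\left(\hat{\Sigma}_{yy}^2\right)}}$$

where $\mathrm{tr}(\cdot)$ is the trace operator and $\hat{\Sigma}_{yx} = \hat{\Sigma}_{xy}^T$. The square root in the denominator normalizes the statistic so that it lies in $[0, 1]$ and equals 1 when $y = x$.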
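The derivation above can be sketched directly in NumPy. This is an illustrative sketch, not hyppo's implementation: the function name `rv_statistic` is hypothetical, and the denominator uses the square-root normalization of the standard RV coefficient, under which identical inputs score exactly 1.

```python
import numpy as np


def rv_statistic(x, y):
    """Illustrative RV coefficient for arrays of shape (n, p) and (n, q).

    Hypothetical helper for this example; not part of hyppo.
    """
    # Center each column so that x.T @ y is a (scaled) covariance matrix.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)

    # Cross-covariance and variance matrices. Any 1/n scaling is omitted
    # because it cancels between numerator and denominator.
    sxy = xc.T @ yc
    sxx = xc.T @ xc
    syy = yc.T @ yc

    # tr(S_xy S_yx) over sqrt(tr(S_xx^2) tr(S_yy^2)).
    numerator = np.trace(sxy @ sxy.T)
    denominator = np.sqrt(np.trace(sxx @ sxx) * np.trace(syy @ syy))
    return numerator / denominator


rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))
noise = rng.normal(size=(100, 3))

print(rv_statistic(x, x))      # identical inputs: 1 up to rounding
print(rv_statistic(x, noise))  # unrelated inputs: close to 0
```

Because the scaling constant cancels, the same value results whether the covariance matrices are divided by n, by n - 1, or not at all.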
The returned p-value is calculated with a permutation test via hyppo.tools.perm_test.
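To make the permutation idea concrete, here is a minimal sketch of how such a p-value can be estimated. It is an assumption-laden illustration rather than the code behind hyppo.tools.perm_test: the names `rv_statistic` and `perm_pvalue` are hypothetical, the rows of y are shuffled to break any dependence on x, and an add-one correction keeps the estimate strictly positive.

```python
import numpy as np


def rv_statistic(x, y):
    """Illustrative RV coefficient (hypothetical helper, not hyppo's code)."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    sxy = xc.T @ yc
    sxx = xc.T @ xc
    syy = yc.T @ yc
    return np.trace(sxy @ sxy.T) / np.sqrt(
        np.trace(sxx @ sxx) * np.trace(syy @ syy)
    )


def perm_pvalue(x, y, reps=1000, seed=0):
    """Estimate a permutation p-value for the RV statistic.

    Hypothetical sketch: shuffling the rows of y simulates the null
    hypothesis that x and y are independent.
    """
    rng = np.random.default_rng(seed)
    observed = rv_statistic(x, y)

    # Null distribution: the statistic recomputed on shuffled pairings.
    null_dist = np.empty(reps)
    for i in range(reps):
        shuffled = y[rng.permutation(len(y))]
        null_dist[i] = rv_statistic(x, shuffled)

    # Fraction of null statistics at least as large as the observed one,
    # with an add-one correction (an assumption of this sketch).
    pvalue = (1 + np.sum(null_dist >= observed)) / (1 + reps)
    return observed, pvalue


rng = np.random.default_rng(42)
x = rng.normal(size=(50, 2))
y = x + 0.1 * rng.normal(size=(50, 2))  # strongly dependent on x

stat, pvalue = perm_pvalue(x, y, reps=200)
print(stat, pvalue)
```

With the add-one correction, the smallest attainable p-value is 1 / (reps + 1), so a larger `reps` gives finer resolution at the cost of more computation.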
Methods Summary
| RV.statistic(x, y) | Helper function that calculates the RV test statistic. |
| RV.test(x, y, reps=1000, workers=1) | Calculates the RV test statistic and p-value. |
RV.statistic(x, y)
Helper function that calculates the RV test statistic.
RV.test(x, y, reps=1000, workers=1)
Calculates the RV test statistic and p-value.
- Parameters
  x, y (ndarray) -- Input data matrices. x and y must have the same number of samples and dimensions. That is, the shapes must be (n, p), where n is the number of samples and p is the number of dimensions.
  reps (int, default: 1000) -- The number of replications used to estimate the null distribution in the permutation test that computes the p-value.
  workers (int, default: 1) -- The number of cores to parallelize the p-value computation over. Supply -1 to use all cores available to the process.
- Returns
  stat, pvalue -- The computed RV test statistic and its permutation-test p-value.
Examples
>>> import numpy as np
>>> from hyppo.independence import RV
>>> x = np.arange(7)
>>> y = x
>>> stat, pvalue = RV().test(x, y)
>>> '%.1f, %.2f' % (stat, pvalue)
'1.0, 0.00'