RV

class hyppo.independence.RV

Rank Value (RV) test statistic and p-value.

RV is the multivariate generalization of the squared Pearson correlation coefficient [1]. The RV coefficient is closely related to principal component analysis (PCA), canonical correlation analysis (CCA), multivariate regression, and statistical classification [1]. The statistic is derived as follows [1] [2]:

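Let x and y be (n, p) samples of random variables X and Y. After centering x and y, compute the sample covariance matrix $\hat{\Sigma}_{xy} = x^T y$; the variance matrices $\hat{\Sigma}_{xx}$ and $\hat{\Sigma}_{yy}$ are defined analogously. The RV test statistic is then

$$\mathrm{RV}_n(x, y) = \frac{\mathrm{tr}\left(\hat{\Sigma}_{xy} \hat{\Sigma}_{yx}\right)}{\sqrt{\mathrm{tr}\left(\hat{\Sigma}_{xx}^2\right) \mathrm{tr}\left(\hat{\Sigma}_{yy}^2\right)}}$$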

where tr() is the trace operator.
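
As a quick check on the claim that RV generalizes the squared Pearson correlation, consider the case p = 1. This is a sketch that assumes the normalized form of the statistic, with a square root in the denominator; the symbols $s_{xy}$, $s_{xx}$, $s_{yy}$ and $r_{xy}$ are shorthand introduced here, not notation from the references. With a single column, every matrix reduces to a scalar:

```latex
\hat{\Sigma}_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = s_{xy},
\qquad
\hat{\Sigma}_{xx} = s_{xx},
\qquad
\hat{\Sigma}_{yy} = s_{yy}

\mathrm{RV}_n(x, y)
  = \frac{s_{xy}^2}{\sqrt{s_{xx}^2 \, s_{yy}^2}}
  = \frac{s_{xy}^2}{s_{xx} \, s_{yy}}
  = r_{xy}^2
```

Here $r_{xy}$ is the sample Pearson correlation, so identical inputs (y = x) give a statistic of 1.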

The p-value returned is calculated with a permutation test via hyppo.tools.perm_test.
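
The sketch below illustrates the general idea of a permutation test for this statistic. It is not the implementation in hyppo.tools.perm_test: the helper names rv_stat and permutation_pvalue, the seed argument, and the (1 + count) / (1 + reps) p-value convention are assumptions made for illustration.

```python
import numpy as np


def rv_stat(x, y):
    """RV coefficient for 2-D arrays of shape (n, p) (hypothetical helper)."""
    xc = x - x.mean(axis=0)  # center each column
    yc = y - y.mean(axis=0)
    cov_xy = xc.T @ yc
    var_xx = xc.T @ xc
    var_yy = yc.T @ yc
    num = np.trace(cov_xy @ cov_xy.T)
    den = np.sqrt(np.trace(var_xx @ var_xx) * np.trace(var_yy @ var_yy))
    return num / den


def permutation_pvalue(stat_fn, x, y, reps=1000, seed=0):
    """Estimate a p-value by shuffling the rows of y (assumed convention)."""
    rng = np.random.default_rng(seed)
    observed = stat_fn(x, y)
    null = np.empty(reps)
    for i in range(reps):
        # Permuting the rows of y breaks any pairing between x and y,
        # which simulates the null hypothesis of independence.
        null[i] = stat_fn(x, y[rng.permutation(len(y))])
    # Add-one correction keeps the estimate strictly positive.
    pvalue = (1 + np.sum(null >= observed)) / (1 + reps)
    return observed, pvalue


x = np.arange(7, dtype=float).reshape(-1, 1)
stat, pvalue = permutation_pvalue(rv_stat, x, x, reps=200)
print("%.1f, %.3f" % (stat, pvalue))
```

A larger reps gives a finer-grained null distribution at proportionally higher cost, which is the trade-off the reps parameter of RV.test controls.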

Methods Summary

RV.statistic(x, y)

Helper function that calculates the RV test statistic.

RV.test(x, y[, reps, workers])

Calculates the RV test statistic and p-value.


RV.statistic(x, y)

Helper function that calculates the RV test statistic.

Parameters

x,y (ndarray) -- Input data matrices. x and y must have the same number of samples and dimensions. That is, the shapes must be (n, p) where n is the number of samples and p is the number of dimensions.

Returns

stat (float) -- The computed RV statistic.
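
To make the computation concrete, here is a minimal NumPy sketch of the statistic as defined by the trace formula above. The function name rv_statistic is hypothetical, and promoting 1-D inputs to a single column is an assumption made for convenience; this is not hyppo's internal code.

```python
import numpy as np


def rv_statistic(x, y):
    """Sketch of the RV coefficient (hypothetical helper, not hyppo's code)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Assumption: treat a 1-D input as n samples of one dimension.
    if x.ndim == 1:
        x = x[:, None]
    if y.ndim == 1:
        y = y[:, None]
    # Center each column so the cross-products act as covariances.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    cov_xy = xc.T @ yc
    var_xx = xc.T @ xc
    var_yy = yc.T @ yc
    num = np.trace(cov_xy @ cov_xy.T)
    den = np.sqrt(np.trace(var_xx @ var_xx) * np.trace(var_yy @ var_yy))
    return num / den


x = np.arange(7)
print(rv_statistic(x, x))  # 1.0 for identical inputs
```

Because the numerator and denominator scale identically, multiplying either input by a nonzero constant leaves the statistic unchanged.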

RV.test(x, y, reps=1000, workers=1)

Calculates the RV test statistic and p-value.

Parameters
  • x,y (ndarray) -- Input data matrices. x and y must have the same number of samples and dimensions. That is, the shapes must be (n, p) where n is the number of samples and p is the number of dimensions.

  • reps (int, default: 1000) -- The number of replications used to estimate the null distribution for the permutation test that computes the p-value.

  • workers (int, default: 1) -- The number of cores to parallelize the p-value computation over. Supply -1 to use all cores available to the process.

Returns

  • stat (float) -- The computed RV statistic.

  • pvalue (float) -- The computed RV p-value.

Examples

>>> import numpy as np
>>> from hyppo.independence import RV
>>> x = np.arange(7)
>>> y = x
>>> stat, pvalue = RV().test(x, y)
>>> '%.1f, %.2f' % (stat, pvalue)
'1.0, 0.00'
