Hypothesis Tests
Unpaired k-Sample Transform
mgcpy.hypothesis_tests.transforms.k_sample_transform(x, y, is_y_categorical=False)
Transform to represent a k-sample test as an independence test.
- Parameters
x (2D numpy.array) --
is interpreted as either:
a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
a [n*p] data matrix, a matrix with n samples in p dimensions
y (2D numpy.array) --
is interpreted as one of:
a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
a [n*p] data matrix, a matrix with n samples in p dimensions, OR
a [n*1] label matrix, categorical data for x, if is_y_categorical is set to True
is_y_categorical (boolean) -- if set to True, y holds categorical data and is a labels array for x; otherwise, it is a plain data matrix
- Returns
- u
a concatenated data matrix of dimensions [2*n, p]
- v
a label matrix for u, which indicates the category to which each data entry in u belongs
- Return type
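The idea behind the unpaired transform can be illustrated with a minimal NumPy sketch. This is not the mgcpy implementation: the helper name k_sample_transform_sketch is hypothetical, it handles only the two-sample case with plain data matrices (no distance matrices, no is_y_categorical), and the 0/1 label encoding is an assumption.

```python
import numpy as np


def k_sample_transform_sketch(x, y):
    """Stack two samples and build a 0/1 label column (illustrative only)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Treat 1D inputs as a single feature column.
    if x.ndim == 1:
        x = x.reshape(-1, 1)
    if y.ndim == 1:
        y = y.reshape(-1, 1)
    # u: all observations from both samples, one row per observation.
    u = np.concatenate([x, y], axis=0)
    # v: label 0 for rows that came from x, label 1 for rows from y.
    v = np.concatenate([np.zeros(x.shape[0]), np.ones(y.shape[0])]).reshape(-1, 1)
    return u, v


rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(5, 3))  # sample 1: n=5, p=3
y = rng.normal(1.0, 1.0, size=(5, 3))  # sample 2: n=5, p=3
u, v = k_sample_transform_sketch(x, y)
print(u.shape, v.shape)  # (10, 3) (10, 1)
```

Testing whether u and v are independent then plays the role of testing whether the two samples come from the same distribution.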
mgcpy.hypothesis_tests.transforms.paired_two_sample_transform(x, y)
Transform to represent a paired two-sample test as an independence test.
Steps:
combine x and y to get the joint_distribution
sample n pairs from the joint_distribution
compute the Euclidean distance between the sampled n pairs, which is randomly_sampled_pairs_distance
compute the Euclidean distance between the actual x and y, which is actual_pairs_distance
compute the two-sample transformed matrices of randomly_sampled_pairs_distance and actual_pairs_distance
- Parameters
x (2D numpy.array) --
is interpreted as either:
a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
a [n*p] data matrix, a matrix with n samples in p dimensions
y (2D numpy.array) --
is interpreted as either:
a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
a [n*p] data matrix, a matrix with n samples in p dimensions
- Returns
- u
a data matrix of dimensions [2*n, p]
- v
a label matrix for u, which indicates the category to which each data entry in u belongs
- Return type
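The steps listed for paired_two_sample_transform can be read roughly as the NumPy sketch below. It is one interpretation of those steps, not the library's code: the helper name, the seed argument, sampling row pairs with replacement, and the 0/1 labels are all assumptions. Because each distance is a scalar, u in this sketch has shape (2n, 1); the library's output shape may differ.

```python
import numpy as np


def paired_two_sample_transform_sketch(x, y, seed=None):
    """Illustrative reading of the documented steps (not mgcpy's code)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]

    # Step 1: pool x and y into one joint sample of 2n rows.
    joint_distribution = np.concatenate([x, y], axis=0)

    # Step 2: draw n random pairs of row indices (with replacement: an assumption).
    idx = rng.integers(0, joint_distribution.shape[0], size=(n, 2))

    # Step 3: Euclidean distance within each randomly sampled pair.
    randomly_sampled_pairs_distance = np.linalg.norm(
        joint_distribution[idx[:, 0]] - joint_distribution[idx[:, 1]], axis=1
    )

    # Step 4: Euclidean distance within each actual (x_i, y_i) pair.
    actual_pairs_distance = np.linalg.norm(x - y, axis=1)

    # Step 5: stack both sets of distances and label their origin
    # (0 = random pairs, 1 = actual pairs).
    u = np.concatenate(
        [randomly_sampled_pairs_distance, actual_pairs_distance]
    ).reshape(-1, 1)
    v = np.concatenate([np.zeros(n), np.ones(n)]).reshape(-1, 1)
    return u, v


rng = np.random.default_rng(1)
x = rng.normal(size=(8, 2))
y = x + rng.normal(scale=0.1, size=(8, 2))  # y is a noisy copy of x (paired data)
u, v = paired_two_sample_transform_sketch(x, y, seed=2)
print(u.shape, v.shape)  # (16, 1) (16, 1)
```

If the pairing carries information, the actual pair distances tend to differ from the random pair distances, and that difference is what a later independence test on (u, v) would pick up.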
mgcpy.hypothesis_tests.transforms.paired_two_sample_test_dcorr(x, y, which_test='biased', compute_distance_matrix=None, is_fast=False)
Compute the DCorr test_statistic for a paired two-sample test.
- Parameters
x (2D numpy.array) --
is interpreted as either:
a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
a [n*p] data matrix, a matrix with n samples in p dimensions
y (2D numpy.array) --
is interpreted as either:
a [n*n] distance matrix, a square matrix with zeros on the diagonal for n samples, OR
a [n*p] data matrix, a matrix with n samples in p dimensions
- Returns
paired two sample DCorr test_statistic
- Return type
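For intuition about what a biased distance-correlation statistic measures on a transformed pair (u, v), here is a hedged NumPy sketch of the standard biased (V-statistic) sample estimate. The function names are hypothetical, the toy data are made up for illustration, and this sketch returns the squared distance correlation; whether mgcpy reports the squared value or its square root, and how it handles which_test, compute_distance_matrix, and is_fast, is not covered here.

```python
import numpy as np


def pairwise_euclidean(a):
    """n x n matrix of Euclidean distances between the rows of a 2D array."""
    diff = a[:, None, :] - a[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    return (
        d
        - d.mean(axis=0, keepdims=True)
        - d.mean(axis=1, keepdims=True)
        + d.mean()
    )


def biased_dcorr_sketch(u, v):
    """Squared biased sample distance correlation between u and v."""
    a = double_center(pairwise_euclidean(np.asarray(u, dtype=float)))
    b = double_center(pairwise_euclidean(np.asarray(v, dtype=float)))
    dcov2 = (a * b).mean()   # squared distance covariance
    dvar_u = (a * a).mean()  # squared distance variance of u
    dvar_v = (b * b).mean()  # squared distance variance of v
    denom = np.sqrt(dvar_u * dvar_v)
    # A constant input has zero distance variance; report 0 in that case.
    return 0.0 if denom == 0 else float(dcov2 / denom)


# Toy transformed data: 6 "random pair" distances and 6 "actual pair" distances.
u = np.array([2.1, 1.7, 2.5, 1.9, 2.8, 2.2,
              0.2, 0.1, 0.3, 0.2, 0.1, 0.4]).reshape(-1, 1)
v = np.concatenate([np.zeros(6), np.ones(6)]).reshape(-1, 1)
stat = biased_dcorr_sketch(u, v)
print(stat)
```

The statistic lies between 0 and 1, and it equals 1 when the two inputs are identical.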