
Validation

Permutation testing for OPLS model significance.

PermutationResult dataclass

PermutationResult(
    r2y: float,
    q2: float,
    permuted_r2y: NDArray[float64],
    permuted_q2: NDArray[float64],
    r2y_p_value: float,
    q2_p_value: float,
)

Outcome of permutation_test.

Attributes:

r2y, q2 (float)
    Observed metrics on the real labels.
permuted_r2y, permuted_q2 (ndarray)
    Metrics obtained on permuted labels.
r2y_p_value, q2_p_value (float)
    Empirical p-values, (1 + #{permuted >= observed}) / (n_permutations + 1).
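
The empirical p-value formula above can be written out directly. The following is a minimal sketch in NumPy, not the library's own code; the helper name empirical_p_value and the numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def empirical_p_value(observed: float, permuted: np.ndarray) -> float:
    """Empirical p-value: (1 + #{permuted >= observed}) / (n_permutations + 1)."""
    permuted = np.asarray(permuted, dtype=float)
    n_at_least_as_large = int(np.sum(permuted >= observed))
    return (1 + n_at_least_as_large) / (permuted.size + 1)


# Hypothetical values: an observed Q2 and 19 Q2 values from permuted labels.
observed_q2 = 0.62
permuted_q2 = np.linspace(-0.30, 0.60, 19)  # every permuted value is below 0.62

# No permuted value reaches the observed one, so p = (1 + 0) / (19 + 1).
p = empirical_p_value(observed_q2, permuted_q2)
```

Because of the added 1 in numerator and denominator, the p-value can never be exactly zero. With the default n_permutations of 20, the smallest attainable value is 1/21, roughly 0.048.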

permutation_test

permutation_test(
    estimator: BaseEstimator,
    X: ArrayLike,
    y: ArrayLike,
    n_permutations: int = 20,
    cv: _CVType = None,
    random_state: int | RandomState | None = None,
    n_jobs: int | None = None,
) -> PermutationResult

Assess significance of an OPLS regression model by permuting y.

Warning: This function is intended for OPLS regression models only. Classifiers such as scikit_opls.OPLSDA are not supported.

The estimator must expose r2y_ (or best_estimator_.r2y_) after fitting.

Parameters:

estimator (object; required)
    An unfitted OPLS-like estimator (cloned internally for each fit).
X (array-like of shape (n_samples, n_features); required)
    Predictors.
y (array-like of shape (n_samples,); required)
    Response.
n_permutations (int; default 20)
    Number of label permutations.
cv (int, cross-validation generator or None; default None)
    Determines the cross-validation splitting strategy. None uses the estimator's cv parameter if present; otherwise it falls back to min(5, n_samples) folds.
random_state (int, RandomState instance or None; default None)
    Determines random number generation for label permutation.
n_jobs (int or None; default None)
    Number of jobs used to run the independent permutations in parallel via joblib.Parallel. None means 1; -1 uses all processors. Permutations are drawn up front from the seeded RNG, so results are reproducible regardless of n_jobs.
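
The reproducibility claim for n_jobs follows from drawing all permutations before any work is dispatched. The sketch below illustrates that idea only. It substitutes the standard library's ThreadPoolExecutor for joblib.Parallel, and draw_permutations and score are hypothetical helpers, with score standing in for "fit on permuted y and return a metric".

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def draw_permutations(n_samples: int, n_permutations: int, seed: int) -> list:
    """Draw every index permutation sequentially from one seeded generator."""
    rng = np.random.RandomState(seed)
    return [rng.permutation(n_samples) for _ in range(n_permutations)]


def score(y: np.ndarray, perm: np.ndarray) -> float:
    """Hypothetical stand-in for fitting on permuted y and returning a metric."""
    return float(np.dot(y, y[perm]))


y = np.arange(10, dtype=float)
perms = draw_permutations(n_samples=y.size, n_permutations=8, seed=0)

# Serial evaluation of the pre-drawn permutations.
serial = [score(y, p) for p in perms]

# Parallel evaluation of the same permutations; map() keeps input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(lambda p: score(y, p), perms))

# The workers never touch the RNG, so worker count cannot change the result.
same = serial == parallel
```

If each worker drew its own permutation instead, the outcome would depend on how the random stream was shared between workers, which is what drawing up front avoids.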

Returns:

result (PermutationResult)
    Observed and permuted R2Y/Q2 with empirical p-values.
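
To make the returned quantities concrete, here is a sketch of the general permutation-testing procedure using NumPy only. It is an illustration under stated assumptions, not the library's implementation: ordinary least squares stands in for an OPLS fit, Q2 is computed from contiguous (unshuffled) k-fold predictions, and the helpers fit_predict and r2y_and_q2 are hypothetical.

```python
import numpy as np


def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with intercept (stand-in for an OPLS fit)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef


def r2y_and_q2(X, y, n_folds=5):
    """R2Y from the full fit, Q2 from k-fold cross-validated predictions."""
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2y = 1.0 - np.sum((y - fit_predict(X, y, X)) ** 2) / ss_tot
    y_cv = np.empty_like(y)
    for test_idx in np.array_split(np.arange(len(y)), n_folds):
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        y_cv[test_idx] = fit_predict(X[train], y[train], X[test_idx])
    q2 = 1.0 - np.sum((y - y_cv) ** 2) / ss_tot
    return r2y, q2


# Synthetic data with a real linear signal.
rng = np.random.RandomState(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.normal(size=40)

# Observed metrics on the real labels.
r2y, q2 = r2y_and_q2(X, y)

# Metrics on permuted labels: shuffling y breaks the link between X and y.
n_permutations = 20
permuted = np.array(
    [r2y_and_q2(X, y[rng.permutation(len(y))]) for _ in range(n_permutations)]
)
permuted_r2y, permuted_q2 = permuted[:, 0], permuted[:, 1]

# Empirical p-values as defined for PermutationResult.
r2y_p_value = (1 + np.sum(permuted_r2y >= r2y)) / (n_permutations + 1)
q2_p_value = (1 + np.sum(permuted_q2 >= q2)) / (n_permutations + 1)
```

When the observed model captures genuine structure, the permuted metrics cluster well below the observed ones and both p-values sit near their lower bound of 1 / (n_permutations + 1).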

Notes

random_state controls only the label permutations. If cv is a randomised splitter (e.g. ShuffleSplit without its own random_state), repeated calls can differ even with a fixed random_state here — set random_state on the splitter itself for full reproducibility.

When estimator is a GridSearchCV with cv=None, its inner CV still defaults to 5-fold; for n_samples < 5 set the GridSearchCV cv explicitly (this function does not rewrite a user's inner CV).
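
The small-sample case can be reproduced with scikit-learn alone. In this sketch Ridge is only a stand-in estimator used to show the GridSearchCV configuration (it does not expose r2y_, so it is not a valid input to permutation_test), and the data, parameter grid, and choice of LeaveOneOut are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, LeaveOneOut

rng = np.random.RandomState(0)
X = rng.normal(size=(4, 2))  # only 4 samples: fewer than the default 5 folds
y = rng.normal(size=4)
param_grid = {"alpha": [0.1, 1.0]}
scoring = "neg_mean_squared_error"  # R^2 is undefined on single-sample folds

# Default inner CV (cv=None means 5-fold) cannot split 4 samples.
try:
    GridSearchCV(Ridge(), param_grid, scoring=scoring).fit(X, y)
    default_cv_failed = False
except ValueError:
    default_cv_failed = True

# An explicit inner CV that suits the sample count works.
search = GridSearchCV(Ridge(), param_grid, cv=LeaveOneOut(), scoring=scoring)
search.fit(X, y)
best_alpha = search.best_params_["alpha"]
```

Any splitter whose number of folds does not exceed n_samples would serve equally well here; LeaveOneOut is just the simplest choice for very small data sets.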
