
DeLong Test — Paired ROC

Compares the AUCs of two classifiers scored on the same subjects. Because both AUCs are estimated from the same data they are correlated, and the test accounts for this through a covariance term.

Format: one subject per line, as label,scoreA,scoreB
The label column can be first or last. Both classifiers must score the same subjects.
Pairing accounts for the correlation between the two AUC estimates. When that correlation is positive, the paired test has more power than a test that treats the classifiers as independent.
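The input format described above could be parsed along the following lines. This is a minimal sketch, not the page's own parser: the function name `parse_paired_scores` is hypothetical, and it assumes labels are coded 0/1, there is no header row, and the label column is whichever of the first or last column contains only 0/1 values (the first column wins if both do).

```python
def parse_paired_scores(text):
    """Parse 'label,scoreA,scoreB' lines; the label may be first or last.

    Assumptions (not stated by the source): labels are 0/1, no header row.
    Returns (labels, scores_a, scores_b) as three parallel lists.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            raise ValueError(f"expected 3 comma-separated fields, got: {line!r}")
        rows.append([float(p) for p in parts])
    if not rows:
        raise ValueError("no data rows")

    def is_binary(col):
        # A column qualifies as the label column if every value is 0 or 1.
        return all(r[col] in (0.0, 1.0) for r in rows)

    if is_binary(0):
        label_col = 0
    elif is_binary(2):
        label_col = 2
    else:
        raise ValueError("no 0/1 label column found in first or last position")

    # The two remaining columns are scoreA and scoreB, in their original order.
    score_cols = [c for c in range(3) if c != label_col]
    labels = [int(r[label_col]) for r in rows]
    scores_a = [r[score_cols[0]] for r in rows]
    scores_b = [r[score_cols[1]] for r in rows]
    return labels, scores_a, scores_b
```

Detecting the label column over all rows rather than per line avoids misreading a single score of exactly 0 or 1 as a label.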

Input — Paired Scores

All subjects (label,scoreA,scoreB per line)

Results

n (pos / neg)—

AUC_A—

AUC_B—

Var(AUC_A) / Var(AUC_B)—

Cov(AUC_A, AUC_B)—

AUC_A − AUC_B—

Var(diff) = VarA+VarB−2Cov—

SE of difference—

Z-statistic—

p-value (two-tailed)—

95% CI for AUC_A − AUC_B—

Conclusion α = 0.05—

Paired: Var(A−B) = Var(A) + Var(B) − 2·Cov(A,B)
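Cov(AUC_A, AUC_B) = s₁₀_AB/n + s₀₁_AB/m, where s₁₀_AB and s₀₁_AB are the cross-classifier covariances of the placement values over the n positive and m negative subjects
Pairing reduces the SE of the difference when the two AUC estimates are positively correlated → higher power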
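The formulas above can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the page's own code: the function names are hypothetical, labels are assumed to be 0/1 with at least two subjects in each class, ties between a positive and a negative score count as 0.5, and the p-value and 95% CI use the normal approximation.

```python
import math


def _psi(x, y):
    # Placement kernel: 1 if the positive outranks the negative, 0.5 on ties.
    return 1.0 if x > y else (0.5 if x == y else 0.0)


def _placements(pos, neg):
    # v10[i]: fraction of negatives ranked below positive i.
    # v01[j]: fraction of positives ranked above negative j.
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return sum(v10) / len(pos), v10, v01  # the AUC is the mean of v10


def _cov(u, v):
    # Sample covariance with the (k - 1) denominator.
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def delong_paired(labels, scores_a, scores_b):
    """Paired DeLong comparison of two AUCs on the same subjects (sketch)."""
    pos_a = [s for l, s in zip(labels, scores_a) if l == 1]
    neg_a = [s for l, s in zip(labels, scores_a) if l == 0]
    pos_b = [s for l, s in zip(labels, scores_b) if l == 1]
    neg_b = [s for l, s in zip(labels, scores_b) if l == 0]
    n, m = len(pos_a), len(neg_a)  # n positives, m negatives
    if n < 2 or m < 2:
        raise ValueError("need at least two positives and two negatives")

    auc_a, v10_a, v01_a = _placements(pos_a, neg_a)
    auc_b, v10_b, v01_b = _placements(pos_b, neg_b)

    var_a = _cov(v10_a, v10_a) / n + _cov(v01_a, v01_a) / m
    var_b = _cov(v10_b, v10_b) / n + _cov(v01_b, v01_b) / m
    cov_ab = _cov(v10_a, v10_b) / n + _cov(v01_a, v01_b) / m  # s10_AB/n + s01_AB/m

    diff = auc_a - auc_b
    var_diff = var_a + var_b - 2.0 * cov_ab
    se = math.sqrt(var_diff) if var_diff > 0 else 0.0
    if se == 0.0:
        # Degenerate case (e.g. identical score columns): the test is undefined.
        z = p = float("nan")
    else:
        z = diff / se
        p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal p-value
    half = 1.959964 * se  # 97.5th percentile of the standard normal
    return {
        "n_pos": n, "n_neg": m, "auc_a": auc_a, "auc_b": auc_b,
        "var_a": var_a, "var_b": var_b, "cov_ab": cov_ab,
        "diff": diff, "var_diff": var_diff, "se": se,
        "z": z, "p": p, "ci95": (diff - half, diff + half),
    }
```

As a toy check, calling `delong_paired([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.4, 0.5, 0.3, 0.2], [0.7, 0.6, 0.4, 0.65, 0.3, 0.5])` gives AUC_A = 8/9, AUC_B = 2/3, Cov = 1/27, Var(diff) = 2/81, and so Z = √2 ≈ 1.414 with a two-tailed p ≈ 0.157. With only three subjects per class the normal approximation is not trustworthy; the numbers are only meant to make the arithmetic traceable.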