
Mathematics and Statistics Vol. 13(6), pp. 486 - 497
DOI: 10.13189/ms.2025.130605


Adaptive Keep-Ratio Selection in Partial-Sample Regression via Nested Cross-Validation


Aarhus M. Dela Cruz *
College of Science, Bulacan State University, City of Malolos, Bulacan, Philippines

ABSTRACT

Selecting the right neighborhood size is crucial for nonparametric regression, especially when data conditions shift. We study two automated rules for choosing the keep ratio in Partial-Sample Regression (PSR), which predicts by averaging only the most relevant observations. The first rule, Fit-Max, minimizes training error, while the second, Risk-Min, minimizes cross-validated error. To avoid biased estimates, both are embedded in a nested cross-validation design that separates tuning from final testing. On synthetic data with regime changes, we find a clear U-shaped error curve with an optimum when keeping only 5–7% of the data, cutting mean squared error by more than 45% compared with a fixed 50% rule. Fit-Max and Risk-Min perform almost identically out of sample, with a small bootstrap edge for Fit-Max. On benchmark datasets, adaptive PSR consistently reduced error relative to the fixed rule, with gains ranging from about 17% in moderate settings to over 80% in more complex cases. Compared with classical methods, PSR achieved accuracy competitive with k-nearest neighbors and Nadaraya–Watson kernel regression. These findings establish adaptive keep-ratio selection as a simple, reproducible, and effective strategy for relevance-based regression.
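
The nested design described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`psr_predict`, `inner_cv_error`, `nested_cv`), the fold counts, the keep-ratio grid, and the toy data are all hypothetical, and relevance is approximated here by negative squared Euclidean distance, which may differ from the relevance measure the paper uses. Only a Risk-Min style rule is shown: the inner loop picks the keep ratio with the lowest cross-validated error, and the outer loop scores that choice on data the tuning step never saw.

```python
import numpy as np

def psr_predict(X_train, y_train, X_query, keep_ratio):
    """Predict each query by averaging the targets of the kept training points."""
    n = len(y_train)
    k = min(n, max(1, int(np.ceil(keep_ratio * n))))  # number of observations kept
    # ASSUMPTION: relevance is taken as negative squared Euclidean distance.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    kept = np.argpartition(d2, k - 1, axis=1)[:, :k]  # indices of the k closest points
    return y_train[kept].mean(axis=1)

def mse(pred, y):
    return float(np.mean((pred - y) ** 2))

def inner_cv_error(X, y, folds, keep_ratio):
    """Mean validation MSE of one keep ratio over the given inner folds."""
    errs = []
    for i, val in enumerate(folds):
        tr = np.concatenate([f for j, f in enumerate(folds) if j != i])
        errs.append(mse(psr_predict(X[tr], y[tr], X[val], keep_ratio), y[val]))
    return float(np.mean(errs))

def nested_cv(X, y, grid, n_outer=5, n_inner=3, seed=0):
    """Return (selected keep ratio, outer-fold test MSE) for each outer fold."""
    rng = np.random.default_rng(seed)
    outer = np.array_split(rng.permutation(len(y)), n_outer)
    results = []
    for i, test in enumerate(outer):
        train = np.concatenate([f for j, f in enumerate(outer) if j != i])
        # Tuning sees only the outer-training indices, never the test fold.
        inner = np.array_split(rng.permutation(train), n_inner)
        best = min(grid, key=lambda r: inner_cv_error(X, y, inner, r))
        test_err = mse(psr_predict(X[train], y[train], X[test], best), y[test])
        results.append((best, test_err))
    return results

# Hypothetical toy data: the slope on the second feature flips with the sign of the first.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.where(X[:, 0] > 0, 2.0, -1.0) * X[:, 1] + 0.1 * rng.normal(size=200)

for keep, err in nested_cv(X, y, grid=[0.05, 0.1, 0.25, 0.5, 1.0]):
    print(f"selected keep ratio {keep:.2f}, outer-fold MSE {err:.4f}")
```

A Fit-Max style rule would replace `inner_cv_error` with a training-error criterion inside the same outer loop; its exact definition is given in the paper and is not reproduced here.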

KEYWORDS
Bias-Variance Trade-Off, Hyperparameter Tuning, Nested Cross-Validation, Partial-Sample Regression, Relevance-Based Prediction

Cite This Paper in IEEE or APA Citation Styles
(a). IEEE Format:
[1] Aarhus M. Dela Cruz, "Adaptive Keep-Ratio Selection in Partial-Sample Regression via Nested Cross-Validation," Mathematics and Statistics, Vol. 13, No. 6, pp. 486 - 497, 2025. DOI: 10.13189/ms.2025.130605.

(b). APA Format:
Aarhus M. Dela Cruz (2025). Adaptive Keep-Ratio Selection in Partial-Sample Regression via Nested Cross-Validation. Mathematics and Statistics, 13(6), 486 - 497. DOI: 10.13189/ms.2025.130605.