The University of Arizona

Cross-Validation for Optimal and Reproducible Statistical Learning

Series: Tripods
Location: ENR2 S210
Presenter: Yuhong Yang, School of Statistics, University of Minnesota

In data mining and statistical learning, we frequently encounter the task of comparing different methods/algorithms to reach a final choice, whether for pure prediction or for scientific understanding/interpretation of a regression relationship. Cross-validation provides a powerful tool to address the matter. Unfortunately, there are seemingly widespread misconceptions about its use, which can lead to unreliable conclusions. In this talk, we will address the subtle issues involved and present results on minimax optimal regression learning and consistent selection of the best method for the data. In addition, we will propose proper cross-validation tools for model selection diagnostics that will cry foul at an impressive-looking but not really reproducible outcome from a sparse-pattern-hunting method in the wild west of learning with a huge number of covariates.

Department of Mathematics, The University of Arizona, 617 N. Santa Rita Ave., P.O. Box 210089, Tucson, AZ 85721-0089 USA. Voice: (520) 621-6892. Fax: (520) 621-8322. © Copyright 2018 Arizona Board of Regents. All rights reserved.