
Treatment Allocations Based on Multi-Armed Bandit Strategies

Series: Statistics GIDP Colloquium
Location: ENR2 S395
Presenter: Yuhong Yang, University of Minnesota

In the practice of medicine, multiple treatments are often available for individual patients. Identifying the best treatment for a specific patient is challenging because of patient inhomogeneity. The multi-armed bandit with covariates provides a framework for designing effective treatment allocation rules that integrate learning from experimentation with maximizing the benefit to the patients along the way.
In this talk, we present new strategies to achieve asymptotically efficient or minimax optimal treatment allocations. Many nonparametric and parametric supervised learning methods may be applied to estimate the mean treatment outcome functions (in terms of the covariates), but guidance on how to choose among them is generally unavailable; we therefore propose a model-combining allocation strategy for adaptive performance and show its strong consistency. When the mean treatment outcome functions are smooth, rates of convergence can be studied to quantify the effectiveness of a treatment allocation rule in terms of the overall benefit the patients receive. A multi-stage randomized allocation with arm elimination algorithm is proposed to combine flexibility in treatment outcome function modeling with a theoretical guarantee on the overall treatment benefit. Numerical results are given to demonstrate the performance of the new strategies.
The talk is based on joint work with Wei Qian.
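
To give a concrete flavor of this kind of allocation rule, below is a minimal sketch of a multi-stage randomized allocation with arm elimination for a bandit with covariates. It is an illustration only, not the algorithm presented in the talk: the linear outcome models, the stage sizes, the elimination margin, and the helper functions draw_covariate and apply_treatment are all hypothetical choices.

import numpy as np
from sklearn.linear_model import LinearRegression

def allocate(draw_covariate, apply_treatment, n_arms=2,
             stage_sizes=(100, 200, 400), margin=0.1):
    """Multi-stage randomized allocation with arm elimination (illustrative)."""
    active = list(range(n_arms))          # arms (treatments) still in play
    X = [[] for _ in range(n_arms)]       # covariates observed per arm
    y = [[] for _ in range(n_arms)]       # outcomes observed per arm

    for n_stage in stage_sizes:
        # Randomize each incoming patient uniformly over the active arms.
        for _ in range(n_stage):
            x = draw_covariate()                      # patient covariate vector
            arm = int(np.random.choice(active))       # randomized assignment
            X[arm].append(x)
            y[arm].append(apply_treatment(arm, x))    # observed outcome

        # Fit a mean treatment outcome model for each active arm.
        models = {a: LinearRegression().fit(np.array(X[a]), np.array(y[a]))
                  for a in active}

        # Eliminate arms whose estimated mean outcome trails the current best
        # by more than `margin`, averaged over all covariates seen so far.
        x_all = np.vstack([np.array(X[a]) for a in active])
        means = {a: models[a].predict(x_all).mean() for a in active}
        best = max(means.values())
        active = [a for a in active if means[a] >= best - margin]

    return active   # arms still considered candidates for the best treatment

Any supervised learner could replace the linear models in the fitting step, which is exactly where the choice-of-method problem that motivates the model-combining strategy arises.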

(Refreshments will be served.)
