Divide and Conquer in Statistical Estimation
In large-data applications, it is often impossible for a single machine to store and operate on all the data. Distributed computing addresses the setting where the samples are split among machines. Earlier approaches to this setting relied on algorithms in which the machines must communicate at every iteration. Divide and conquer instead allows statistical estimation with only one round of communication between the machines: each machine computes an estimate from its own samples, and the local estimates are then combined. We discuss recent advances in theoretical guarantees of statistical consistency, more refined variants of the approach, and extensions to nonparametric settings.
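As a rough illustration of the one-round idea, the sketch below simulates the machines as shards of a single list and combines the local estimates by simple averaging. Averaging is only one possible combination rule and is assumed here for concreteness; the function names (`local_slope`, `divide_and_conquer_slope`), the through-the-origin regression model, and the simulated data are all hypothetical choices for this example, not the methods covered in the talk.

```python
import random
from statistics import fmean


def local_slope(xs, ys):
    """Least-squares slope through the origin, computed on one shard."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx


def divide_and_conquer_slope(xs, ys, num_machines):
    """Split the samples into shards, estimate locally, then average once."""
    # Each "machine" holds every num_machines-th sample (assumed split rule).
    shards = [(xs[i::num_machines], ys[i::num_machines])
              for i in range(num_machines)]
    # Local estimation needs no communication between machines.
    local_estimates = [local_slope(sx, sy) for sx, sy in shards]
    # The single round of communication: each machine sends one number.
    return fmean(local_estimates)


# Simulated data: y = 2x + noise (illustrative values only).
rng = random.Random(0)
true_slope = 2.0
xs = [rng.uniform(-1.0, 1.0) for _ in range(10_000)]
ys = [true_slope * x + rng.gauss(0.0, 0.5) for x in xs]

dc_estimate = divide_and_conquer_slope(xs, ys, num_machines=10)
full_estimate = local_slope(xs, ys)  # estimate using all data on one machine
print(dc_estimate, full_estimate)
```

In this toy setting the averaged estimate lands close to the full-data estimate, although each machine transmits only a single number. How well such averaging behaves in general is the kind of question the consistency guarantees address.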
(Bagels and refreshments will be served.)