MetaGen: Reference-Free Learning with Multiple Metagenomic Samples
Abstract: A major goal of metagenomics is to identify and study the entire collection of microbial species in a set of targeted samples. In this talk, I will present a novel statistical metagenomic algorithm that simultaneously identifies microbial species and estimates their abundances without using reference genomes. Compared to reference-free methods based primarily on k-mer distributions or coverage information, the proposed approach achieves a higher species binning accuracy and is particularly powerful when sequencing coverage is low. I will demonstrate the performance of this new method through both simulated and real metagenomic studies. The MetaGen software is available at https://github.com/BioAlgs/MetaGen.
(Refreshments will be served.)