The ability to measure gene expression levels for individual cells (vs. pools of cells) is crucial for addressing many important biological questions, such as the study of stem cell differentiation, the detection of rare mutations in cancer, or the discovery of cellular subtypes in the brain. Single-cell transcriptome sequencing (RNA-Seq) allows the high-throughput measurement of gene expression levels for entire genomes at the resolution of single cells. RNA-Seq studies provide a great example of the range of questions one encounters in a Data Science workflow, where the data are complex in a variety of ways, there are multiple analysis steps, and drawing on rigorous statistical principles and methods is essential to derive reliable and interpretable biological results. In this talk, I will provide a survey of statistical questions related to the analysis of single-cell RNA-Seq data to investigate the differentiation of stem cells in the brain, including exploratory data analysis, dimensionality reduction, normalization, expression quantitation, cluster analysis, and the inference of cellular lineages.
Please join the event.
About Sandrine Dudoit
Professor Dudoit’s methodological research interests concern high-dimensional statistical learning and include exploratory data analysis (EDA), visualization, loss-based estimation with cross-validation (e.g., density estimation, classification, regression, model selection), and multiple hypothesis testing. Much of her methodological work is motivated by statistical questions arising in biological research and, in particular, the design and analysis of high-throughput sequencing studies, e.g., single-cell transcriptome sequencing (RNA-Seq) for discovering novel cell types and for the study of stem cell differentiation. Her contributions include exploratory data analysis, normalization and expression quantitation, differential expression analysis, class discovery and prediction, inference of cell lineages, and the integration of biological annotation metadata (e.g., Gene Ontology (GO) annotation). She is also interested in statistical computing and, in particular, computationally reproducible research. She is a founding core developer of the Bioconductor Project, an open-source and open-development software project for the analysis of biomedical and genomic data.