Innovative technologies now allow us to probe the genome in more dimensions and at higher resolution than ever before, providing a wealth of information for studying the genomic basis of complex traits. However, meaningful biological insights are often masked by technical artifacts, systematic biases, or a low signal-to-noise ratio (the “needle in a haystack” problem). These challenges demand tailored statistical methodology to unlock the full potential of emerging assays.

My research group focuses on developing novel frameworks and rigorous inferential procedures that exploit the increased scope and scale of high-throughput sequencing data, with the ultimate goal of uncovering new molecular signals in cancer, child health, and development.


vmrseq: Probabilistic Modeling of Single-cell Methylation Heterogeneity
Ning Shen and Keegan Korthauer
DOI: 10.1101/2023.11.20.567911

Reversal of viral and epigenetic HLA class I repression in Merkel cell carcinoma
Journal of Clinical Investigation
Patrick C. Lee and Susan Klaeger and Phuong M. Le and Keegan Korthauer and Jingwei Cheng and Varsha Ananthapadmanabhan and Thomas C. Frost and Jonathan D. Stevens and Alan Y.L. Wong and J. Bryan Iorgulescu and Anna Y. Tarren and Vipheaviny A. Chea and Isabel P. Carulli and Camilla K. Lemvigh and Christina B. Pedersen and Ashley K. Gartin and Siranush Sarkizova and Kyle T. Wright and Letitia W. Li and Jason Nomburg and Shuqiang Li and Teddy Huang and Xiaoxi Liu and Lucas Pomerance and Laura M. Doherty and Annie M. Apffel and Luke J. Wallace and Suzanna Rachimi and Kristen D. Felt and Jacquelyn O. Wolff and Elizabeth Witten and Wandi Zhang and Donna Neuberg and William J. Lane and Guanglan Zhang and Lars R. Olsen and Manisha Thakuria and Scott J. Rodig and Karl R. Clauser and Gabriel J. Starrett and John G. Doench and Sara J. Buhrlage and Steven A. Carr and James A. DeCaprio and Catherine J. Wu and Derin B. Keskin
DOI: 10.1172/JCI151666

Detecting Neuroendocrine Prostate Cancer Through Tissue-Informed Cell-Free DNA Methylation Analysis
Clinical Cancer Research
Berchuck, J.E. and Baca, S.C. and McClure, H.M. and Korthauer, K. and Tsai, H.K. and Nuzzo, P.V. and Kelleher, K.M. and He, M. and Steinharter, J.A. and Zacharia, S. and Spisak, S. and Seo, J.-H. and Conteduca, V. and Elemento, O. and Auh, J. and Sigouros, M. and Corey, E. and Hirsch, M.S. and Taplin, M.-E. and Choueiri, T.K. and Pomerantz, M.M. and Beltran, H. and Freedman, M.L.
DOI: 10.1158/1078-0432.CCR-21-3762

Differential substrate use in EGF- and oncogenic KRAS-stimulated human mammary epithelial cells.
The FEBS Journal
Keibler MA and Dong W and Korthauer KD and Hosios AM and Moon SJ and Sullivan LB and Liu N and Abbott KL and Arevalo OD and Ho K and Lee J and Phanse AS and Kelleher JK and Iliopoulos O and Stephanopoulos G
DOI: 10.1111/febs.15858
PubMed: 33811729

Androgen receptor and MYC equilibration centralizes on developmental super-enhancer
Nature Communications
Guo, H. and Wu, Y. and Nouri, M. and Spisak, S. and Russo, J.W. and Sowalsky, A.G. and Pomerantz, M.M. and Wei, Z. and Korthauer, K. and Seo, J.-H. and Wang, L. and Arai, S. and Freedman, M.L. and He, H.H. and Chen, S. and Balk, S.P.
DOI: 10.1038/s41467-021-27077-y

CDK4/6 inhibition reprograms the breast cancer enhancer landscape by stimulating AP-1 transcriptional activity
Nature Cancer
Watt, A.C. and Cejas, P. and DeCristo, M.J. and Metzger-Filho, O. and Lam, E.Y.N. and Qiu, X. and BrinJones, H. and Kesten, N. and Coulson, R. and Font-Tello, A. and Lim, K. and Vadhi, R. and Daniels, V.W. and Montero, J. and Taing, L. and Meyer, C.A. and Gilan, O. and Bell, C.C. and Korthauer, K.D. and Giambartolomei, C. and Pasaniuc, B. and Seo, J.-H. and Freedman, M.L. and Ma, C. and Ellis, M.J. and Krop, I. and Winer, E. and Letai, A. and Brown, M. and Dawson, M.A. and Long, H.W. and Zhao, J.J. and Goel, S.
DOI: 10.1038/s43018-020-00135-y

A compositional model to assess expression changes from single-cell RNA-seq data
Annals of Applied Statistics
Ma, X. and Korthauer, K. and Kendziorski, C. and Newton, M.A.
DOI: 10.1214/20-AOAS1423

Reprogramming of the FOXA1 cistrome in treatment-emergent neuroendocrine prostate cancer
Nature Communications
Baca, S.C. and Takeda, D.Y. and Seo, J.-H. and Hwang, J. and Ku, S.Y. and Arafeh, R. and Arnoff, T. and Agarwal, S. and Bell, C. and O'Connor, E. and Qiu, X. and Alaiwi, S.A. and Corona, R.I. and Fonseca, M.A.S. and Giambartolomei, C. and Cejas, P. and Lim, K. and He, M. and Sheahan, A. and Nassar, A. and Berchuck, J.E. and Brown, L. and Nguyen, H.M. and Coleman, I.M. and Kaipainen, A. and De Sarkar, N. and Nelson, P.S. and Morrissey, C. and Korthauer, K. and Pomerantz, M.M. and Ellis, L. and Pasaniuc, B. and Lawrenson, K. and Kelly, K. and Zoubeidi, A. and Hahn, W.C. and Beltran, H. and Long, H.W. and Brown, M. and Corey, E. and Freedman, M.L.
DOI: 10.1038/s41467-021-22139-7

Transparency and reproducibility in artificial intelligence.
Haibe-Kains B and Adam GA and Hosny A and Khodakarami F and Massive Analysis Quality Control (MAQC) Society Board of Directors and Waldron L and Wang B and McIntosh C and Aerts HJWL
DOI: 10.1038/s41586-020-2766-y
PubMed: 33057217

Prostate cancer reactivates developmental epigenomic programs during metastatic progression.
Nature Genetics
Pomerantz MM and Qiu X and Zhu Y and Takeda DY and Pan W and Baca SC and Gusev A and Korthauer KD and Severson TM and Ha G and Viswanathan SR and Seo JH and Nguyen HM and Zhang B and Freedman ML
DOI: 10.1038/s41588-020-0664-8
PubMed: 32690948

Detection of renal cell carcinoma using plasma and urine cell-free DNA methylomes.
Nature Medicine
Nuzzo PV and Berchuck JE and Korthauer K and Spisak S and Nassar AH and Abou Alaiwi S and Chakravarthy A and Shen SY and Bakouny Z and Boccardo F and Steinharter J and Bouchard G and Freedman ML
DOI: 10.1038/s41591-020-0933-1
PubMed: 32572266

Detection of urothelial carcinoma using plasma cell-free methylated DNA.
Journal of Clinical Oncology
DOI: 10.1200/jco.2020.38.15_suppl.5046

Plasma cell-free DNA variant analysis compared with methylated DNA analysis in renal cell carcinoma.
Genetics in medicine : official journal of the American College of Medical Genetics
Lasseter K and Nassar AH and Hamieh L and Berchuck JE and Nuzzo PV and Korthauer K and Shinagare AB and Ogorek B and McKay R and Thorner AR and Lee GM and Braun DA and Bhatt RS and Kwiatkowski DJ
DOI: 10.1038/s41436-020-0801-x
PubMed: 32341571

Detection and accurate false discovery rate control of differentially methylated regions from whole genome bisulfite sequencing.
Biostatistics (Oxford, England)
Korthauer K and Chakraborty S and Benjamini Y and Irizarry RA
DOI: 10.1093/biostatistics/kxy007
PubMed: 29481604

A practical guide to methods controlling false discoveries in computational biology.
Genome Biology
Korthauer K and Kimes PK and Duvallet C and Reyes A and Subramanian A and Teng M and Shukla C and Alm EJ and Hicks SC
DOI: 10.1186/s13059-019-1716-1
PubMed: 31164141

A practical guide to methods controlling false discoveries in computational biology
Keegan Korthauer and Patrick K Kimes and Claire Duvallet and Alejandro Reyes and Ayshwarya Subramanian and Mingxiang Teng and Chinmay Shukla and Eric J Alm and Stephanie C Hicks
DOI: 10.1101/458786

Genome-wide repressive capacity of promoter DNA methylation is revealed through epigenomic manipulation
Keegan Korthauer and Rafael A. Irizarry
DOI: 10.1101/381145

A Somatically Acquired Enhancer of the Androgen Receptor Is a Noncoding Driver in Advanced Prostate Cancer.
Takeda DY and Spisák S and Seo JH and Bell C and O'Connor E and Korthauer K and Ribli D and Csabai I and Solymosi N and Szállási Z and Stillman DR and Cejas P and Qiu X and Long HW and Freedman ML
DOI: 10.1016/j.cell.2018.05.037
PubMed: 29909987

High-throughput identification of RNA nuclear enrichment sequences
The EMBO Journal
Chinmay J Shukla and Alexandra L McCorkindale and Chiara Gerhardinger and Keegan D Korthauer and Moran N Cabili and David M Shechner and Rafael A Irizarry and Philipp G Maass and John L Rinn
DOI: 10.15252/embj.201798452

High-throughput identification of RNA nuclear enrichment sequences
Shukla CJ and McCorkindale AL and Gerhardinger C and Korthauer KD and Cabili MN and Shechner DM and Irizarry RA and Maass PG and Rinn JL
DOI: 10.1101/189654

Detection and accurate False Discovery Rate control of differentially methylated regions from Whole Genome Bisulfite Sequencing
Keegan D. Korthauer and Sutirtha Chakraborty and Yuval Benjamini and Rafael A. Irizarry
DOI: 10.1101/183210

IPI59: An Actionable Biomarker to Improve Treatment Response in Serous Ovarian Carcinoma Patients
Statistics in Biosciences
Choi, J. and Ye, S. and Eng, K.H. and Korthauer, K. and Bradley, W.H. and Rader, J.S. and Kendziorski, C.
DOI: 10.1007/s12561-016-9144-1

A statistical approach for identifying differential distributions in single-cell RNA-seq experiments.
Genome Biology
Korthauer KD and Chu LF and Newton MA and Li Y and Thomson J and Stewart R and Kendziorski C
DOI: 10.1186/s13059-016-1077-y
PubMed: 27782827

scDD: A statistical approach for identifying differential distributions in single-cell RNA-seq experiments
Korthauer KD and Chu L and Newton MA and Li Y and Thomson J and Stewart R and Kendziorski C
DOI: 10.1101/035501

Chromosomal copy number alterations and HPV integration in cervical precancer and invasive cancer.
Bodelon C and Vinokurova S and Sampson JN and den Boon JA and Walker JL and Horswill MA and Korthauer K and Schiffman M and Sherman ME and Zuna RE and Mitchell J and Zhang X and Wentzensen N
DOI: 10.1093/carcin/bgv171
PubMed: 26660085

MADGiC: a model-based approach for identifying driver genes in cancer.
Bioinformatics (Oxford, England)
Korthauer KD and Kendziorski C
DOI: 10.1093/bioinformatics/btu858
PubMed: 25573922

Methods for collapsing multiple rare variants in whole-genome sequence data.
Genetic Epidemiology
Sung YJ and Korthauer KD and Swartz MD and Engelman CD
DOI: 10.1002/gepi.21820
PubMed: 25112183

Limited model antigen expression by transgenic fungi induces disparate fates during differentiation of adoptively transferred T cell receptor transgenic CD4+ T cells: Robust activation and proliferation with weak effector function during recall
Infection and Immunity
Wüthrich, M. and Ersland, K. and Pick-Jacobs, J.C. and Gern, B.H. and Frye, C.A. and Sullivan, T.D. and Brennan, M.B. and Filutowicz, H.I. and O'Brien, K. and Korthauer, K.D. and Schultz-Cherry, S. and Klein, B.S.
DOI: 10.1128/IAI.05326-11

The genetic network controlling the Arabidopsis transcriptional response to Pseudomonas syringae pv. maculicola: roles of major regulators and the phytotoxin coronatine.
Molecular plant-microbe interactions : MPMI
Wang L and Mitra RM and Hasselmann KD and Sato M and Lenarz-Wyatt L and Cohen JD and Katagiri F and Glazebrook J
DOI: 10.1094/mpmi-21-11-1408
PubMed: 18842091

Predicting Cancer Subtypes Using Survival-Supervised Latent Dirichlet Allocation Models
Advances in Statistical Bioinformatics
Keegan Korthauer and John Dawson and Christina Kendziorski
DOI: 10.1017/cbo9781139226448.019


Unraveling the spatial landscape of epigenomic signals
A common task in the interpretation of epigenomic data, which holds information about the genome not encoded in the DNA sequence itself, is the detection and inference of regions of interest. For example, we often want to detect segments of the genome that show significantly higher or lower DNA methylation levels with respect to disease state or developmental stage, as this particular modification to the DNA is known to influence gene regulation. However, the number of candidate segments, across all possible locations and sizes, is enormous, leading to a massive multiple testing problem. Our group develops tailored statistical and computational approaches for powerful detection and inference of region-based epigenomic signals, while paying particular attention to spatial patterns. We are interested in designing and applying these techniques for the analysis of DNA methylation, histone modification, and chromatin accessibility assay data.

Predicting gene expression from epigenomic signals
It is widely known that epigenetic information, such as DNA methylation and histone modifications, plays a role in gene regulation. However, predicting gene expression from epigenomic signals is challenging due to interactions between different epigenomic marks as well as interactions between different regions of the genome. We are developing predictive models that account for these challenges and assess the predictive capacity of various epigenomic signals.

Understanding the genomic basis of complex traits
Our group develops computational approaches to study the genomic basis of a variety of complex traits. Our main focus areas currently include modeling the mutation spectrum of cancer genomes, revealing heterogeneity in single-cell gene expression during development, and characterizing the epigenomic landscape of prostate cancer. To maximize the impact of our work, we also provide open-source computational tools that enable other scientists to draw meaningful biological insights.

Research Group Members

Giuliano Cruz, Graduate Research Assistant
Erick Navarro
Ning Shen, Graduate Research Assistant