Deep learning has the potential to make a significant impact in biology and healthcare, but a major challenge is understanding the reasons behind its predictions. My research develops methods to interpret this powerful class of black-box models, with the goal of gaining data-driven insights into the mechanisms underlying sequence-function relationships.
Deep learning is being rapidly applied in many areas of genomics, demonstrating improved performance over previous methods on benchmark datasets. Despite this promise, it remains unclear whether improved predictions will translate to new biological discoveries, because deep learning models have low interpretability, which has earned them a reputation as black boxes. Understanding the reasons underlying a deep learning model’s prediction may reveal new biological insights not captured by previous methods. Our group develops methods to interpret high-performing deep learning models in order to distill the knowledge they learn from large, noisy biological sequence data. Our goal is to elucidate the biological mechanisms that underlie sequence-function relationships in gene regulation and protein (dys)function, with the broader aim of advancing precision medicine for complex diseases, including cancer.