Ph.D., Princeton University, 2008
Sequence-function relationships; machine learning; biophysics; transcriptional regulation
My research combines theory, computation, and experiment in an effort to better understand the relationship between sequence and function in molecular biology.
Ultra-high-throughput DNA sequencing is revolutionizing the ability to measure sequence-function relationships in a wide variety of systems. In Kinney et al. (2010), an ultra-high-throughput promoter-bashing assay called “Sort-Seq” was proposed and demonstrated. Sort-Seq uses flow cytometry and deep sequencing to probe the detailed biophysical mechanisms of transcriptional regulation in vivo. My current experimental work uses Sort-Seq in studies of transcriptional regulation, as well as in other areas of molecular biology.
The analysis of Sort-Seq data also presents novel challenges in machine learning, and a substantial fraction of my work is devoted to addressing these theoretical and computational problems. For instance, the need to fit models to Sort-Seq data highlights the general problem of fitting parametric models to data when the noise model is unknown. In considering this general problem, Kinney and Atwal (2014a) showed that, in the large data limit, maximizing a quantity from information theory called “mutual information” essentially solves the maximum likelihood problem without requiring a noise model. This has important implications for the design of Sort-Seq-like experiments in molecular biology, as well as for studies of receptive fields in neuroscience.
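As a rough illustration of the idea, the sketch below scores two candidate sequence-to-activity models by the mutual information between their predictions and binned measurements. It is not the procedure of Kinney and Atwal (2014a): the additive toy model, the simulated measurements, the rank-based binning, and the plug-in estimator are all assumptions made for this example, and names such as `predict` and `quantile_bins` are hypothetical.

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum of p(x,y) * log2( p(x,y) / (p(x) p(y)) ) over observed pairs.
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def quantile_bins(values, n_bins):
    """Assign each value a bin index 0..n_bins-1 according to its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins

random.seed(0)
L, N, bases = 8, 4000, "ACGT"

def random_weights():
    # Hypothetical additive model: one weight per base per position.
    return [{b: random.gauss(0, 1) for b in bases} for _ in range(L)]

def predict(weights, seq):
    return sum(weights[i][b] for i, b in enumerate(seq))

true_w, wrong_w = random_weights(), random_weights()
seqs = ["".join(random.choice(bases) for _ in range(L)) for _ in range(N)]

# Simulated measurements: a saturating nonlinearity plus noise.
# The scoring below never uses this functional form.
meas = [math.tanh(predict(true_w, s)) + random.gauss(0, 0.3) for s in seqs]
meas_bins = quantile_bins(meas, 4)

def score(weights):
    preds = [predict(weights, s) for s in seqs]
    return mutual_information(quantile_bins(preds, 8), meas_bins)

mi_true = score(true_w)
mi_wrong = score(wrong_w)
print(f"MI with generating weights: {mi_true:.3f} bits")
print(f"MI with unrelated weights:  {mi_wrong:.3f} bits")
```

Because the score depends only on the rank ordering of the predictions, it needs no assumption about how measurements are distributed around a prediction. In an actual fit, the model parameters would be adjusted to increase this score.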
Fitting models to Sort-Seq data also requires the ability to estimate probability densities with high precision and explicit uncertainty. This challenge motivated the development of a new field-theoretic approach to density estimation (Kinney, 2013). This nonparametric method of density estimation has essentially no free tunable parameters, and provides an attractive alternative to more standard methods such as kernel density estimation.
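For context on the standard alternative mentioned above, here is a minimal Gaussian kernel density estimate in plain Python. It only illustrates how strongly such an estimate can depend on the user-chosen bandwidth. It does not implement the field-theoretic method of Kinney (2013), and the bimodal toy data and the bandwidth values are assumptions chosen for the example.

```python
import math
import random

def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels of fixed width."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)

    return density

random.seed(1)
# Toy bimodal data: two well-separated Gaussian clusters.
samples = ([random.gauss(-2.0, 0.5) for _ in range(200)]
           + [random.gauss(2.0, 0.5) for _ in range(200)])

narrow = gaussian_kde(samples, bandwidth=0.3)
wide = gaussian_kde(samples, bandwidth=3.0)

# Compare the density in the trough (x = 0) with the density near a peak (x = 2).
ratio_narrow = narrow(0.0) / narrow(2.0)
ratio_wide = wide(0.0) / wide(2.0)
print(f"trough/peak ratio, bandwidth 0.3: {ratio_narrow:.3f}")
print(f"trough/peak ratio, bandwidth 3.0: {ratio_wide:.3f}")
```

With the narrow bandwidth the two modes are resolved, while the wide bandwidth merges them into a single bump. This sensitivity to a hand-tuned parameter is the kind of issue that motivates methods with fewer tunable parameters.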
Please visit the Kinney Lab home page.
Kinney JB, Atwal GS (2014a) Parametric inference in the large data limit using maximally informative models. Neural Comput (early access) doi:10.1162/NECO_a_00568. arXiv:1212.3647 [q-bio.QM], bioRxiv doi:10.1101/001396.
Kinney JB, Atwal GS (2014b) Equitability, mutual information, and the maximal information coefficient. Proc Natl Acad Sci USA (in press). arXiv:1301.7745 [q-bio.QM].
Kinney JB (2013) Estimation of probability densities using scale-free field theories. arXiv:1312.6661 [physics.data-an].
Melnikov A, Murugan A, Zhang X, Tesileanu T, Wang L, Rogov P, Feizi S, Gnirke A, Callan CG, Kinney JB, Kellis M, Lander ES, Mikkelsen TS (2012) Rapid dissection and model-based optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nat Biotechnol 30(3):271-277.
Kinney JB, Murugan A, Callan CG, Cox EC (2010) Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proc Natl Acad Sci USA 107:9158-9163.
Mustonen V, Kinney JB, Callan CG, Lässig M (2008) Energy-dependent fitness: a quantitative model for the evolution of yeast transcription factor binding sites. Proc Natl Acad Sci USA 105:12376-12381.
Kinney JB, Tkačik G, Callan CG (2007) Precise physical models of protein-DNA interaction from high-throughput data. Proc Natl Acad Sci USA 104:501-506.