Making AI algorithms show their work

Thursday, 13 May 2021

The Takeaway

An AI neural network built to predict protein and RNA sequence interactions cannot explain what patterns it sees. CSHL Assistant Professor Peter Koo found a way to “quiz” the network with a carefully designed set of synthetic RNA sequences to find out what it learned. The answer was a little surprising.

Machine-learning programs, a form of artificial intelligence (AI), can be trained to solve problems and puzzles on their own instead of following rules that we write for them. But researchers often do not know what rules the machines devise for themselves. Cold Spring Harbor Laboratory (CSHL) Assistant Professor Peter Koo developed a new method that quizzes a machine-learning program to figure out what rules it learned on its own and whether they are the right ones.

Computer scientists “train” an AI machine to make predictions by presenting it with a set of data. The machine extracts a series of rules and operations—a model—based on information it encountered during its training. Koo says:

“If you learn general rules about the math instead of memorizing the equations, you know how to solve those equations. So rather than just memorizing those equations, we hope that these models are learning to solve it and now we can give it any equation and it will solve it.”

Koo built a deep neural network (DNN), a type of AI, to look for patterns in RNA strands that increase the ability of a protein to bind to them. Koo trained his DNN, called ResidualBind (RB), with thousands of RNA sequences matched to protein binding scores, and RB became good at predicting scores for new RNA sequences. But Koo did not know whether the machine was focusing on a short sequence of RNA letters (a motif) that humans might expect, or on some other secondary characteristic of the RNA strands that they might not.

Koo and his team developed a new method, called Global Importance Analysis, to test what rules RB generated to make its predictions. The team presented the trained network with a carefully designed set of synthetic RNA sequences containing different combinations of motifs and features that the scientists thought might influence RB’s assessments.
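The idea of quizzing a trained network with synthetic sequences can be illustrated with a minimal Python sketch. This is not the team's published code. It assumes a stand-in scoring function, `toy_model`, in place of the trained network, and the helper names, motif, sequence length, and sample size are all hypothetical choices made for illustration. The sketch embeds a candidate motif into many random background sequences and compares the average prediction with and without it.

```python
import random

ALPHABET = "ACGU"  # RNA letters


def random_sequences(n, length, rng):
    """Draw n random background RNA sequences of the given length."""
    return ["".join(rng.choice(ALPHABET) for _ in range(length)) for _ in range(n)]


def embed(seq, motif, position):
    """Overwrite part of seq with motif, keeping the total length unchanged."""
    return seq[:position] + motif + seq[position + len(motif):]


def toy_model(seq):
    """Hypothetical stand-in for a trained network: counts one fixed motif."""
    return float(seq.count("UGCAUG"))


def global_importance(model, motif, position, n=1000, length=41, seed=0):
    """Average change in the model's score when motif is embedded.

    The same background sequences are scored with and without the motif,
    so the difference reflects the embedded pattern, not the background.
    """
    rng = random.Random(seed)
    background = random_sequences(n, length, rng)
    with_motif = [embed(s, motif, position) for s in background]
    baseline = sum(model(s) for s in background) / n
    treated = sum(model(s) for s in with_motif) / n
    return treated - baseline


if __name__ == "__main__":
    for candidate in ("UGCAUG", "AAAAAA"):
        effect = global_importance(toy_model, candidate, position=17)
        print(candidate, round(effect, 3))
```

In this toy setting, the motif the stand-in model was built to detect shows a large effect while an unrelated pattern shows almost none. With a real trained network, the same comparison would indicate which patterns the network actually relies on.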

They discovered the network considered more than just the spelling of a short motif. It factored in how the RNA strand might fold over and bind to itself, how close one motif is to another, and other features.
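A question such as "does the distance between two motifs matter?" can be posed the same way. The Python sketch below is illustrative only: `toy_model` is a hypothetical scoring function that rewards two nearby copies of a motif, standing in for a trained network, and the motif, gap sizes, and sequence length are assumptions. It embeds two copies of a motif at varying gaps in random sequences and records how the average score changes.

```python
import random

ALPHABET = "ACGU"  # RNA letters


def embed(seq, motif, position):
    """Overwrite part of seq with motif, keeping the total length unchanged."""
    return seq[:position] + motif + seq[position + len(motif):]


def toy_model(seq, motif="UGCAUG", window=10):
    """Hypothetical stand-in for a trained network.

    Scores one point per motif copy, plus a bonus when two copies
    start within `window` letters of each other.
    """
    starts = [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]
    bonus = any(0 < b - a <= window for a in starts for b in starts)
    return len(starts) + (1.0 if bonus else 0.0)


def spacing_scan(model, motif, first_pos, gaps, n=500, length=60, seed=0):
    """Average score change for two motif copies separated by each gap."""
    rng = random.Random(seed)
    background = ["".join(rng.choice(ALPHABET) for _ in range(length)) for _ in range(n)]
    baseline = sum(model(s) for s in background) / n
    results = {}
    for gap in gaps:
        second_pos = first_pos + len(motif) + gap
        seqs = [embed(embed(s, motif, first_pos), motif, second_pos) for s in background]
        results[gap] = sum(model(s) for s in seqs) / n - baseline
    return results


if __name__ == "__main__":
    for gap, effect in spacing_scan(toy_model, "UGCAUG", 10, [2, 20]).items():
        print(gap, round(effect, 3))
```

Because the stand-in model was written to favor closely spaced motifs, the small gap produces the larger effect here. The point of the sketch is the experimental design, not the particular numbers.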

Koo hopes to test some key results in a laboratory. But rather than requiring every prediction to be tested at the bench, Koo’s new method acts like a virtual lab. Researchers can design and test millions of different variables computationally, far more than humans could test in a real-world lab.

“Biology is super anecdotal. You can find a sequence, you can find a pattern but you don’t know ‘Is that pattern really important?’ You have to do these interventional experiments. In this case, all my experiments are all done by just asking the neural network.”

The team published their new methods and tools in PLOS Computational Biology. Their tools are now available to everyone online.

Written by: Luis Sandoval, Communications Specialist | sandova@cshl.edu | 516-367-6826

Funding

Cancer Center Support Grant, the Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory, National Cancer Institute

Citation

Koo, P., et al., “Global Importance Analysis: An Interpretability Method to Quantify Importance of Genomic Features in Deep Neural Networks”, PLOS Computational Biology, May 13, 2021. DOI: 10.1371/journal.pcbi.1008925

Principal Investigator

Peter Koo

Associate Professor
Cancer Center Member
Ph.D., Yale University, 2015
