A pioneering artificial intelligence (AI)-powered model able to understand the sequence and structure patterns that make up the genetic “language” of plants has been launched by a research collaboration.
PlantRNA-FM, believed to be the first AI model of its kind, has been developed by plant researchers at the John Innes Centre and computer scientists at the University of Exeter.
The model, say its creators, is a smart technological breakthrough that can drive discovery and innovation in plant science and potentially across the study of invertebrates and bacteria.
RNA, like its better-known chemical relative DNA, is an important molecule in all organisms, responsible for carrying genetic information in its sequences and structures. In the genome, RNA architecture is made up of combinations of building blocks called nucleotides, which are arranged in patterns in the same way that letters of the alphabet combine to make words and phrases in language.
Professor Yiliang Ding’s group at the John Innes Centre studies RNA structure, one of the key languages of RNA molecules, in which RNAs fold into complex structures that regulate sophisticated biological functions such as plant growth and stress response.
To better understand the complex language underlying RNA function, Professor Ding’s group collaborated with Dr Ke Li’s group at the University of Exeter.
Together they developed PlantRNA-FM, a model trained on an enormous data set of 54 billion pieces of RNA information that make up a genetic alphabet across 1,124 plant species.
When creating PlantRNA-FM, the researchers followed the methodology by which AI models such as ChatGPT are trained to understand human language. The model was taught the language of plant RNA by studying RNA information from plant species worldwide, giving it a comprehensive view of how RNA works across the plant kingdom.
Just as ChatGPT can understand and respond to human language, PlantRNA-FM has learned to understand the grammar and logic of RNA sequences and structures.
The researchers have already used the model to make precise predictions about RNA functions and to identify specific functional RNA structural patterns across transcriptomes. Their predictions have been validated by experiments, which confirm that RNA structures identified by PlantRNA-FM influence the efficiency with which genetic information is translated into protein.
“While RNA sequences may appear random to the human eye, our AI model has learned to decode the hidden patterns within them,” says Dr Haopeng Yu, a postdoctoral researcher in Professor Yiliang Ding’s group at the John Innes Centre.
Scientists from Northeast Normal University and the Chinese Academy of Sciences in China also contributed to this successful collaboration.
Professor Ding said: “Our PlantRNA-FM is just the beginning. We are working closely with Dr Li’s group to develop more advanced AI approaches to understand the hidden DNA and RNA languages in nature. This breakthrough opens new possibilities for understanding and potentially programming plants, which could have profound implications for crop improvement and the next generation of AI-based gene design. AI is increasingly instrumental in helping plant scientists tackle challenges, from feeding a global population to developing crops that can thrive in a changing climate.”
“An Interpretable RNA Foundation Model for Exploring Functional RNA Motifs in Plants” appears in Nature Machine Intelligence.