Artificial intelligence trained to find disease-related genes
Researchers have developed an artificial neural network using deep learning to identify genes that are related to disease.
An artificial neural network has revealed patterns in huge amounts of gene expression data and discovered groups of disease-related genes. The developers, from Linköping University, Sweden, hope that the method can eventually be applied within precision medicine and individualised treatment.
The scientists created maps of biological systems based on how different proteins or genes interact with each other. Using artificial intelligence (AI), they investigated whether it is possible to discover biological networks with deep learning, in which entities known as artificial neural networks are trained on experimental data.
“We have for the first time used deep learning to find disease-related genes. This is a very powerful method in the analysis of huge amounts of biological information, or ‘big data’,” said Sanjiv Dwivedi, from the Department of Physics, Chemistry and Biology (IFM) at Linköping University.
The scientists used a large database with information about the expression patterns of 20,000 genes in a large number of people. The information was ‘unsorted’, as the researchers did not give the artificial neural network data about which gene expression patterns were from people with diseases and which were from healthy people. The AI model was then trained to find patterns of gene expression.
Artificial neural networks consist of several layers in which information is mathematically processed. The system comprises an input layer, which receives the data, and an output layer, which delivers the result of the information processing. Between these two are several hidden layers in which calculations are carried out. When the scientists trained the artificial neural network, they wondered whether it was possible to understand exactly how it works.
“When we analysed our neural network, it turned out that the first hidden layer represented to a large extent interactions between various proteins. Deeper in the model, in contrast, on the third level, we found groups of different cell types. It’s extremely interesting that this type of biologically relevant grouping is automatically produced, given that our network has started from unclassified gene expression data,” said Mika Gustafsson, senior lecturer at IFM and leader of the study.
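To make the layered structure concrete, the following Python sketch builds a very small network with an input layer, three hidden layers and an output layer, and trains it on unlabelled toy data by asking it to reconstruct its own input. It is not the researchers' model: the layer sizes, the six made-up "genes", the reconstruction objective and every function name are assumptions chosen purely for illustration.

```python
import math
import random

rng = random.Random(0)

# Hypothetical layer sizes: 6 "genes" in, three hidden layers, 6 values out.
SIZES = [6, 4, 2, 4, 6]

def make_layer(n_in, n_out):
    """Create small random weights and zero biases for one layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

layers = [make_layer(a, b) for a, b in zip(SIZES, SIZES[1:])]

def forward(x):
    """Return the activations of every layer, starting with the input."""
    acts = [x]
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + b
             for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        # Hidden layers use tanh; the output layer is left linear.
        acts.append(z if is_output else [math.tanh(v) for v in z])
    return acts

def reconstruction_loss(x):
    """Half the squared error between the output and the input itself."""
    out = forward(x)[-1]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, x))

def train_step(x, lr):
    """One gradient-descent step on the reconstruction loss (no labels)."""
    acts = forward(x)
    delta = [o - t for o, t in zip(acts[-1], x)]
    for i in reversed(range(len(layers))):
        weights, biases = layers[i]
        a_prev = acts[i]
        # Gradient with respect to the previous layer, before updating weights.
        grad_prev = [sum(weights[j][k] * delta[j] for j in range(len(delta)))
                     for k in range(len(a_prev))]
        for j in range(len(delta)):
            for k in range(len(a_prev)):
                weights[j][k] -= lr * delta[j] * a_prev[k]
            biases[j] -= lr * delta[j]
        if i > 0:
            # Derivative of tanh is 1 - a^2 for a hidden activation a.
            delta = [g * (1 - a * a) for g, a in zip(grad_prev, a_prev)]

def toy_sample():
    """Made-up expression profile driven by two hidden 'programmes'."""
    u, v = rng.uniform(-1, 1), rng.uniform(-1, 1)
    return [u, 0.8 * u, -0.6 * u, v, 0.7 * v, -0.9 * v]

samples = [toy_sample() for _ in range(40)]

def mean_loss():
    return sum(reconstruction_loss(x) for x in samples) / len(samples)

loss_before = mean_loss()
for epoch in range(200):
    for x in samples:
        train_step(x, lr=0.02)
loss_after = mean_loss()

# Activations of the first hidden layer for one sample, ready for inspection.
first_hidden = forward(samples[0])[1]
print("mean loss before:", loss_before, "after:", loss_after)
print("first hidden layer:", first_hidden)
```

Because no sample carries a "diseased" or "healthy" label, whatever structure ends up in the hidden layers comes from the data alone. Reading out values such as `first_hidden` is one simple way to begin examining what a given layer has come to represent.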
The scientists then investigated whether their model of gene expression could be used to determine which gene expression patterns are associated with disease and which with health. They confirmed that the model finds relevant patterns that are consistent with biological mechanisms in the body. Since the model has been trained using unclassified data, it is possible that the artificial neural network has found totally new patterns. The researchers now plan to investigate whether such previously unknown patterns are relevant from a biological perspective.
“We believe that the key to progress in the field is to understand the neural network. This can teach us new things about biological contexts, such as diseases in which many factors interact. And we believe that our method gives models that are easier to generalise and that can be used for many different types of biological information,” said Gustafsson.
The study was published in Nature Communications.