MaveDB: database to aid research into genetic causes of disease

Posted: 5 November 2019

Researchers in Australia, the US and Canada have launched the first open-source database detailing genetic variants that impact human health and disease.


MaveDB, the first database of its kind, holds datasets from experiments called multiplex assays of variant effect (MAVEs), which systematically measure the impact of thousands of individual sequence variants on a gene’s function.

These experiments can provide valuable information about how the proteins produced by that gene function; how variants in that gene may contribute to disease; and how to engineer synthetic versions of naturally occurring proteins that are more effective than the original protein.

MaveDB is the first publicly accessible database for this data, and its development was led by Dr Alan Rubin from the Walter and Eliza Hall Institute, Australia; Associate Professor Douglas Fowler from the University of Washington, US; and Professor Frederick Roth from the University of Toronto, Canada.

MAVEs progress understanding of gene function

MAVEs have revolutionised researchers’ ability to understand gene functions and their roles in disease, Dr Rubin said.

“In the past, researchers had to focus on a handful of changes in a gene to understand its function,” he said. “It was too complex to generate the data from an exhaustive scan of variants of a gene that might be hundreds or thousands of bases in length.

“The development of MAVEs provided a way for researchers to experimentally measure every single genetic change in a gene with its functional consequence. These assays can handle tens of thousands of genetic variants, allowing researchers to home in on the relevant changes and place them in context.”
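To give a sense of the scale Dr Rubin describes, the short Python sketch below lists every single-base substitution of a sequence. It is an illustration only: the function name `single_base_variants` and the six-base `toy_gene` are hypothetical, the sketch covers single-base changes only, and it does not reflect how MaveDB stores data or how any particular MAVE is designed.

```python
BASES = "ACGT"


def single_base_variants(seq):
    """Yield every single-base substitution of seq as (position, ref, alt).

    Positions are 1-based. Each position has three possible alternative
    bases, so a sequence of length n yields 3 * n variants.
    """
    for pos, ref in enumerate(seq.upper(), start=1):
        for alt in BASES:
            if alt != ref:
                yield pos, ref, alt


# Hypothetical six-base sequence, used only to show the counting.
toy_gene = "ATGGCC"
variants = list(single_base_variants(toy_gene))
print(len(variants))  # 18 = 6 positions x 3 alternative bases
```

By the same arithmetic, a sequence of 1,000 bases has 3,000 possible single-base substitutions, which is why an exhaustive scan was impractical when each variant had to be studied individually.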

Until now, MAVE data from experiments has existed in isolation, with data from individual studies uploaded to journal websites when research papers are published, or provided upon request to other researchers. 

“This made it hard for researchers to access the data of other groups, or even know that a particular MAVE experiment had been done. So it potentially hindered collaborations and the progress of genomics research,” Dr Rubin said.

“MaveDB makes it easier for scientists to share their datasets in a single location, using a flexible format that is applicable to multiple research fields, and enables other scientists to easily access this data to enhance their research. We’ve also ensured MaveDB can ‘talk’ to other databases to add an extra level of collaborative capacity.”

Visualising datasets with MaveVis

In addition to MaveDB, the team has developed data visualisation software called MaveVis, which helps researchers understand and interpret MAVE experiments.

“MaveVis provides an immediate and consistent display for MAVE data, including valuable annotations such as protein structure information, that will accelerate collaborative research,” Dr Rubin said. 

“We envision that as MaveDB becomes more widely used within the bioinformatics community, other applications will be added that provide new ways to visualise and interpret complex genomics data – leading to new discoveries that enhance biomedical research. This could underpin the development of new medicines, or the understanding of how a patient’s genomic variants contribute to a disease.”

MaveDB was described in Genome Biology.