Human Cell Atlas data platform kicks off with support from CZI
The Chan Zuckerberg Initiative announced financial support for the Human Cell Atlas, which uses sequencing technology to redefine every cell in the body…
Funding and engineering support from CZI will enable the European Bioinformatics Institute (EMBL-EBI), the Broad Institute and the University of California Santa Cruz Genomics Institute (UCSC) to set up an open, cloud-based Data Coordination Platform to check, share and analyse the vast amounts of diverse information generated.
The new anatomy
Molecular biology has advanced so far and so fast in the past two decades that scientists believe it’s time to rethink human anatomy, starting from the smallest unit: the cell. The Human Cell Atlas, led by the Broad Institute and Wellcome Trust Sanger Institute, aims to do just that by creating a new, open, accessible reference map of the healthy human body.
“Anatomy textbooks as they are now were designed by assigning meaning according to how things look and function. Now, we’re using molecular tools to characterise what’s going on in organs and tissues, and to get a deeper view of anatomy. That’s the Human Cell Atlas,” explains Dr John Marioni, Research Group Leader at EMBL-EBI and EMBL-EBI lead on the Data Coordination Platform steering group.
This international collaboration is using RNA sequencing technology to define cells in a whole new way. Such a highly specific, sequencing-based reference of healthy human function will be transformative for biomedical research.
Transparency and transformation
“The scale of the Atlas will be in the tens of millions of datasets,” says Dr Sarah Teichmann, Head of Cellular Genetics at the Wellcome Trust Sanger Institute and joint leader of the Human Cell Atlas. “Interoperability and transparency are essential for keeping so many moving parts working – we know this from our long experience collaborating with one another. We’ve designed the data architecture as open-source and modular from the get-go. That will make it easier for others to use and add to the Atlas in the future.”
The new cloud-based pipeline will allow Human Cell Atlas partners to upload their datasets, analyse them jointly, and compare healthy and diseased tissues meaningfully. It will shift life-science collaboration toward cloud technologies including OpenStack, Google and Amazon Web Services.
“The size and scope of this new data platform will require large-scale collaborations between informatics and genomics experts across academia and industry,” said Cori Bargmann, president of science at the Chan Zuckerberg Initiative.
“That is why we are thrilled to bring together three of the world’s leading institutions in genomics, informatics, and data sharing to build this important new resource – and our own software engineers will help develop the tools and facilitate the collaboration. It is a great example of how we can help accelerate science by supporting collaborations across institutions and by bringing scientists and engineers together in new ways.”
Science is global
The raw data produced by Human Cell Atlas researchers will be stored and accessed at EMBL-EBI, transferred to platform partners in the US for cloud-based analysis and annotation, then sent back to EMBL-EBI to be stored and shared in the public archives, making it available to the wider world.
“Science is truly international, and that is clear in the way the Human Cell Atlas partners work across continents,” says Ewan Birney, Director of EMBL-EBI and Chair of the Global Alliance for Genomics and Health. “Each partner brings substantial experience building essential data services for the life sciences. CZI is not just funding the project – they’re a hands-on partner. So we know the Atlas will be built with the best engineering possible.”
“This contribution is for all the world’s biomedical scientists, because the Human Cell Atlas will be shared with everyone,” says Dr Aviv Regev, a professor of biology at MIT and co-chair of the organising committee for the Human Cell Atlas. “CZI’s support will help us start to build a data platform for scientists around the world to see and analyse each other’s data, and to share the results of their work widely and openly. This will inspire others to ask new questions, and empower them to find the answers.”
Building the platform is just the start of a colossal undertaking that will take many years to complete, during which technologies will inevitably change. The next step for the Data Coordination Platform is to plan for emerging technologies such as bioimaging, and for sustaining the public resource over the long term.