Earth BioGenome Project (EBP) – Exploring the Dark Matter of Biology
August 14, 2019
The Earth BioGenome Project (EBP) is a moonshot for biology that aims to sequence, catalog, and characterize the genomes of all of Earth’s eukaryotic biodiversity over a period of 10 years.
Its proponents argue that it is now possible to efficiently sequence the genomes of all known species, and to use genomics to help discover the remaining 80 to 90 percent of species that are currently hidden from science.
How the Earth BioGenome Project Started
A conceptual argument for sequencing eukaryotic life was made by Stephen Richards in 2015. Richards argued that current technology would permit sequencing a vast number of species and that a phylogenetic approach to stratifying samples for sequencing would accelerate scientific discovery. Independently, in November 2015, an exploratory meeting that included representatives from research universities and major international and US federal funding agencies was held at the Smithsonian Institution to discuss the rationale, strategies, and feasibility of sequencing all life on Earth, a venture termed The Earth BioGenome Project (EBP).
Priority aims of the EBP & 10-Year Plan
The overall aim is to decipher the genomes of every species, starting with the 1.5 million named eukaryotes, the group of organisms that includes plants, animals, and microbes such as amoebas.
As currently proposed, the EBP’s first step would be to sequence in great detail the DNA of a member of each eukaryotic family (about 9000 in all) to create genomes on par with or better than the current reference human genome: complete enough that researchers know the order of genes on each chromosome.
Next would come coarser sequencing of one species from each of the 150,000 to 200,000 genera, at a level similar to scores of existing plant and animal genomes.
Finally, scientists would seek rough genomes of the remaining known eukaryotic species. Those could be refined as needed.
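The three phases above can be tallied with a few lines of arithmetic. The sketch below is only illustrative: the function name phase_targets is hypothetical, and it assumes that each family-level genome from the first phase also counts as the representative of its genus, so that one species per genus has been sequenced by the end of the second phase. The species, family, and genus counts are the figures quoted in the text.

```python
# Figures quoted in the text.
KNOWN_EUKARYOTES = 1_500_000        # named eukaryotic species
FAMILIES = 9_000                    # approximate number of eukaryotic families
GENERA_RANGE = (150_000, 200_000)   # low and high estimates of genera


def phase_targets(genera):
    """Return the number of new genomes per phase for a given genus count.

    Assumption (not stated in the text): a phase 1 genome also serves as
    the representative of its genus, so phase 2 only adds the genera that
    phase 1 did not cover.
    """
    reference = FAMILIES                 # phase 1: one per family, high quality
    coarse = genera - FAMILIES           # phase 2: remaining genera, coarser
    rough = KNOWN_EUKARYOTES - genera    # phase 3: all other known species
    return {"reference": reference, "coarse": coarse, "rough": rough}


for genera in GENERA_RANGE:
    targets = phase_targets(genera)
    # The three phases together cover every named eukaryote exactly once.
    assert sum(targets.values()) == KNOWN_EUKARYOTES
    print(genera, targets)
```

Under these assumptions, the third phase accounts for roughly 1.3 million or more of the 1.5 million genomes, which is why the plan reserves the lowest level of detail for it.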
Other ongoing projects that will help EBP
Since the Human Genome Project was completed in 2003, large-scale genome sequencing efforts have proliferated. For example, Genome 10K was launched in 2009 to sequence the genomes of at least one individual from each vertebrate genus, approximately 10,000 genomes. Two years later, the i5K was unveiled as an initiative to sequence the genomes of 5000 arthropod species. 2015 saw the announcement of the B10K Project, which plans to generate representative draft genome sequences from all extant bird species within five years. The list goes on and on. The EBP would help coordinate, compile, and perhaps fund these efforts.
Details of other ongoing projects
1000 Fungal Genomes Project (1KFG)
Global Invertebrate Genomics Alliance (GIGA)
Global Ant Genomics Alliance (GAGA)
5,000 Insect Genome Project (i5K)
Ag100 Pests (USDA)
10,000 Plant Genomes Project (10KP)
Bird 10,000 Genomes (B10K)
Genome 10K Project
Oz Mammals Genomics Framework Data Initiative (OMG)
Darwin Tree of Life
LOEWE-Centre for Translational Biodiversity Genomics
University of California Consortium for the Earth BioGenome Project (CalEBP)
Chilean 1000 Genomes Project
Taiwan Biogenome Project
Global Genome Initiative (GGI)
Major institutions involved in EBP
Several large sequencing centers are supporting the goals of the EBP, including BGI (China), Baylor College of Medicine (the United States), the Sanger Institute (the United Kingdom), Rockefeller University (the United States), and the São Paulo Research Foundation (Brazil), adding to the global hub-and-spokes model envisioned for the EBP.
Details of other institutions taking part in EBP
Australian Museum
Baylor College of Medicine
BioPlatforms Australia
Beijing Genomics Institute at Shenzhen
George Washington University
Natural History Museum of Denmark
Max-Planck Society
Novim Group
Royal Botanic Gardens at Kew
SpaceTime Ventures
University of California, Davis
University of California, Santa Cruz
University of Santiago
University of Florida
University of Illinois at Urbana-Champaign
University of Sydney
Wellcome Sanger Institute
Goals of the project
1. Benefiting Human Welfare
Developing new treatments for infectious and inherited diseases
Identifying drugs to slow or reverse aging
Creating new biological synthetic fuels
Generating new biomaterials
Generating new approaches to feeding the world
2. Protecting Biodiversity
By the year 2050, up to 50% of existing species may become extinct, mainly due to natural resource-intensive industries. Humanity faces the question of how such massive losses of species diversity will affect the complex ecosystems that sustain life on Earth, including our ability to derive the foods, biomaterials, bioenergy, and medicines necessary to support an expected human population of 9.6 billion by 2050. Ecosystem collapse on a global scale is a real possibility, making the preservation and conservation of terrestrial, marine, freshwater, desert, and agricultural ecosystems a global imperative for human survival and prosperity.
Unimaginable biological secrets are held in the genomes of the millions of known and unknown organisms on our planet. This “dark matter” of biology could hold the key to unlocking the potential for sustaining planetary ecosystems on which we depend and provide life support systems for a burgeoning world population. These organisms and their genomes will provide the raw materials for genome engineering and synthetic biology approaches to produce valuable bioproducts at industrial scale.
3. Understanding Ecosystems
An urgent demand exists for new sources of food proteins that can be produced cheaply and at scale, new medicines for treating the increasing frequency of chronic diseases plaguing human populations, new strategies for controlling outbreaks of zoonotic diseases, and new resources for maintaining and improving the quality of soil, air, and water. Obtaining the genetic blueprints for all eukaryotic life and, eventually, for the vast numbers of Bacteria and Archaea will create a powerful source of discovery for improving and increasing ecosystem services.
Outcome of this project
EBP will inform a broad range of major issues facing humanity, such as the impact of climate change on biodiversity, the conservation of endangered species and ecosystems, and the preservation and enhancement of ecosystem services.
The initiative would produce an open DNA database of biological information that provides a platform for scientific research and supports environmental and conservation initiatives.
Challenges of EBP
The EBP will generate opportunities and challenges for new tools to visualize, compare, and understand the connection of genome sequence to the evolution of phenotype, organism, and ecosystems. This challenge will require new architectures, algorithms, and software for improved quality, efficiency, and cost-effectiveness as well as data analysis, big data visualization, and sharing.
For example, storage and distribution of reference genomes, annotations, and analyses will likely require less than 10 gigabytes per species or ∼20 petabytes in total. Storage of the underlying sequence read data for the completed EBP is more challenging at ∼200 petabytes. Mammalian-sized long-read genome assemblies currently require ∼100 processor-weeks. Although current tools are already capable of completing the project, there is no doubt that assembly, alignment, and annotation algorithms implemented in both hardware and software will, in the future, all need to be improved for efficiency, accuracy, and application to difficult genomes, such as very large, very repetitive, or very polymorphic genomes.
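The storage figures above can be sanity-checked with simple arithmetic. The sketch below is a rough check, not part of the EBP's own planning: it assumes decimal units (1 gigabyte = 10^9 bytes, 1 petabyte = 10^15 bytes) and uses the 1.5 million known eukaryotic species quoted elsewhere in this article as the species count. The helper names are hypothetical.

```python
GB = 10**9    # bytes per gigabyte (decimal units, an assumption)
PB = 10**15   # bytes per petabyte (decimal units, an assumption)

KNOWN_SPECIES = 1_500_000   # named eukaryotes, as quoted in this article


def total_petabytes(species, gigabytes_per_species):
    """Total storage in petabytes for a given per-species footprint."""
    return species * gigabytes_per_species * GB / PB


def gigabytes_per_species(total_pb, species):
    """Average per-species footprint in gigabytes for a given total."""
    return total_pb * PB / (species * GB)


# Upper bound for reference genomes, annotations, and analyses at 10 GB each.
print(total_petabytes(KNOWN_SPECIES, 10))                    # 15.0

# Average raw read data per species implied by a 200 PB total.
print(round(gigabytes_per_species(200, KNOWN_SPECIES), 1))   # 133.3
```

At 10 gigabytes per species the known species alone come to about 15 petabytes, which is consistent with the rounded ∼20 petabyte figure once newly discovered species are included. The ∼200 petabytes of read data works out to roughly 130 gigabytes per species on average.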
To ensure proper documentation of genetic resources and access and benefit sharing compliance, the EBP will promote downstream monitoring and tracking of utilized genetic and genomic resources.
Total Project Cost
With the current cost of US $1,000 for sequencing an average vertebrate-sized genome to draft level, genomes of all ∼1.5 million known eukaryotes, up to 100,000 new eukaryotic species, and a defined number of eDNA samples from biodiversity hotspot collection sites can be sequenced to a high level of completeness and accuracy for approximately US $4.7 billion. Notably, this is less than the cost of creating the first draft human genome sequence, which was US $2.7 billion at the time, or US $4.8 billion in today’s dollars.
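The cost figures above can be put side by side with a short calculation. This is a back-of-the-envelope sketch only: the variable names are hypothetical, the average ignores the eDNA samples (whose number is not given here), and the text does not break down how the total is split between draft-level sequencing and other costs.

```python
# Figures quoted in the text (US dollars).
DRAFT_COST_PER_GENOME = 1_000
KNOWN_SPECIES = 1_500_000
NEW_SPECIES_MAX = 100_000
EBP_TOTAL = 4_700_000_000
HGP_ORIGINAL = 2_700_000_000
HGP_TODAY = 4_800_000_000

species = KNOWN_SPECIES + NEW_SPECIES_MAX

# Floor: every species sequenced only to draft level at $1,000 each.
draft_only_total = species * DRAFT_COST_PER_GENOME

# Average budget per species implied by the quoted total
# (eDNA samples are ignored here, which is a simplification).
average_per_species = EBP_TOTAL / species

# Inflation factor implied by the two Human Genome Project figures.
hgp_inflation_factor = HGP_TODAY / HGP_ORIGINAL

print(f"Draft-only floor:    ${draft_only_total:,}")
print(f"Average per species: ${average_per_species:,.2f}")
print(f"HGP inflation:       x{hgp_inflation_factor:.2f}")
print(f"EBP cheaper than HGP in today's dollars: {EBP_TOTAL < HGP_TODAY}")
```

The draft-only floor is US $1.6 billion, so the quoted US $4.7 billion implies an average of just under US $3,000 per species. The gap presumably reflects work beyond draft-level sequencing, but the text does not itemize it.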
Economic benefit of this Project
The economic impact of the EBP is likely to be very large and globally distributed. The technologies arising from investments in genomics are having a profound effect on human medicine, veterinary medicine, renewable energy development, food and agriculture, environmental protection, industrial biotechnology, the justice system, and national security. It is quite reasonable to assume that sequencing the remaining 99.8% of eukaryotic species will yield returns similar to or exceeding those of the Human Genome Project.