Uncovering Genome Mysteries

What will this project do?

The project will compare about 200 million proteins encoded by the genes from a wide variety of known and unknown organisms. These genes came from organisms in samples taken from a range of environments, including water and soil, as well as on and in plants and animals. DNA from all the organisms in those samples (the metagenome) was extracted and analyzed to identify genes that encode proteins, most of which are enzymes. Uncovering Genome Mysteries will compare the proteins encoded by those genes to one another, both individually and in groups, to find genetic similarities. Such similarities can reveal the functions these organisms perform in various natural processes. Scientists can then use that knowledge to design solutions to important environmental, medical and industrial problems.

Why are gene comparisons important?

Because of recent advances in DNA sequencing technology, there is now a huge amount of gene information available for a wide variety of organisms, with more being decoded every day. Many of these organisms, particularly microorganisms, have never been studied in detail before. We therefore know little about what they can do, and how they interact with their environment. However, it is likely that many genes from unknown organisms will be similar to genes from organisms that we know more about. When similarities are found, researchers get a head start in understanding previously unknown organisms.

What will the results of this project be?

The researchers will publish an open-access database of the protein sequence comparisons computed on World Community Grid.

We expect that this information will help scientists discover new enzymatic functions, find how organisms interact with each other and the environment, document the current baseline microbial diversity, and better understand and model complex microbial systems.

What are the expected benefits of this project?

There are two main areas where this research is expected to have a beneficial effect: current scientific research, and future technologies.

On the research side, the results should help improve scientific knowledge about gene and protein functions and biochemical processes in general, as well as helping scientists understand how microbial communities are changing in response to changing conditions in the natural world.

There are also several exciting ways in which this knowledge may help solve pressing world problems. For example, new knowledge about organisms should help identify, design and produce new antibiotics and drugs against diseases, as well as new enzymes for industrial applications, such as food processing, chemical synthesis, or the production of biodegradable plastics or biofuels. In the long term, this knowledge should help us manage the important functions that diverse organisms perform in the world's ecosystems: in all environments, in industrial settings, and in human, animal and plant interactions.

What is DNA?

DNA stands for deoxyribonucleic acid. DNA strands are molecules that act as blueprints for all living things. A single DNA molecule consists of a helical (coil-shaped) strand or chain built from four chemical “letters” that make up phrases (“genes”) and the genetic code. These letters are A, C, T and G, which stand for the four types of compounds (adenine, cytosine, thymine and guanine) that are assembled to form the DNA molecule’s gene codes.

What are genes?

Genes are “DNA phrases” that encode proteins. Specific three-letter DNA sequences each encode one specific amino acid. Chains of amino acids form proteins, some of which contribute to the structure of a cell (such as a microorganism) while others act as enzymes.
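The three-letter code described above can be sketched in a few lines of Python. This is only an illustration, assuming the standard genetic code; the function name translate and the example DNA sequence are made up for this sketch and are not part of the project's software.

```python
from itertools import product

# Standard genetic code: amino acids (one-letter codes) listed in codon order
# TTT, TTC, TTA, TTG, TCT, ... with bases ordered T, C, A, G.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into a protein sequence.

    Reads the sequence three letters at a time and stops at the first stop codon.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Illustrative sequence: ATG (methionine), GCC (alanine), AAA (lysine), TAA (stop)
print(translate("ATGGCCAAATAA"))  # prints "MAK"
```

Each letter in the printed result is the one-letter abbreviation of an amino acid, so the output is the protein sequence that the example gene encodes.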

What are proteins?

A protein is a chain of amino acids that folds into a particular structure necessary for the function of that protein. The chain can be composed of up to 20 different kinds of amino acids, and the types and order of those amino acids are encoded in the gene sequence (the genetic code). The amino acid sequence is also known as the “protein sequence”; it is distinct from the gene sequence, because multiple gene sequences can specify the same protein sequence. A cell is made of thousands of proteins (in addition to fatty molecules called lipids, sugars and other chemicals) that can have either a structural function or an enzymatic activity. Enzymes are proteins that help break down other molecules or build new ones. Several enzymes can work in concert to convert molecules into other chemical building blocks for the cell (for example, sugar into lipids), or to extract energy from sugar.
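The point that several gene sequences can specify the same protein sequence can be shown with a small example. The sketch below uses only a handful of codons from the standard genetic code; the two gene sequences and the names CODONS and to_protein are hypothetical and chosen purely for illustration.

```python
# Partial codon table: only the codons needed for this example
# (taken from the standard genetic code).
CODONS = {
    "ATG": "M",                                      # methionine
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # four codons for alanine
    "AAA": "K", "AAG": "K",                          # two codons for lysine
}


def to_protein(dna):
    """Translate a DNA sequence codon by codon using the partial table above."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two made-up gene sequences that differ in their DNA letters...
gene_1 = "ATGGCTAAA"
gene_2 = "ATGGCGAAG"

# ...but encode the same protein sequence.
print(to_protein(gene_1), to_protein(gene_2))  # prints "MAK MAK"
print(gene_1 == gene_2)                        # prints "False"
```

Because different DNA sequences can yield identical proteins, comparing protein sequences rather than raw DNA can reveal similarities that the gene sequences alone would obscure.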

What are enzymes?

Enzymes are proteins that act as catalysts, speeding up the conversion of chemicals into other chemicals. Certain enzymes in plants, for example, help absorb carbon dioxide molecules and incorporate them into other cellular molecules.

What is DNA sequencing?

DNA sequencing is a technology that determines the order of the four “letters” (A, C, T, G) that make up genes, by chemically analyzing DNA molecules.

How do you understand an organism's function from a DNA sequence?

In the first step, we convert the DNA sequence into an amino acid sequence. This amino acid sequence then defines the properties of a protein. By comparing the amino acid sequence with other known sequences in databases, we can use the information about previously studied proteins to predict the functions of new proteins being investigated. If we know the functions of all the proteins encoded by a genome, then we can ultimately understand how a cell or microorganism works.
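The comparison step can be sketched roughly as follows. This is a simplified illustration, not the method used by the project: it uses Python's general-purpose difflib.SequenceMatcher as a stand-in for a real sequence alignment tool, and the "database" entries, annotations and query sequence are all made up.

```python
from difflib import SequenceMatcher

# Hypothetical reference "database": made-up protein sequences mapped to
# made-up functional annotations.
KNOWN_PROTEINS = {
    "MKTAYIAKQRQISFVKSHFSRQ": "example enzyme A (hypothetical)",
    "MSDNELLKKAGITPEQVAALRK": "example structural protein B (hypothetical)",
}


def similarity(a, b):
    """Return a similarity score between 0.0 and 1.0 for two sequences.

    This is a generic string-similarity ratio, used here only as a
    stand-in for a proper protein alignment score.
    """
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def best_match(query):
    """Find the most similar known protein and return its annotation and score."""
    best_seq = max(KNOWN_PROTEINS, key=lambda seq: similarity(query, seq))
    return KNOWN_PROTEINS[best_seq], similarity(query, best_seq)


# A made-up "new" protein that differs from the first entry by one amino acid.
query = "MKTAYIAKQRQISFVKSHFSKQ"
annotation, score = best_match(query)
print(annotation, round(score, 2))
```

A high score against a well-studied protein suggests, but does not prove, that the new protein has a similar function; the prediction still needs to be checked against other evidence.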

What is the difference between a genome and a metagenome?

A genome consists of all the genetic code for an individual organism, while a metagenome describes all genes and elements encoded in a group or community of organisms, for example, all of the microorganisms within a sample of soil or ocean.

What are microorganisms?

Microorganisms are microscopically small life forms, mostly single-celled, and include bacteria, archaea, protozoa, yeasts and microscopic algae. Members of these diverse groups are present in almost all environments on Earth: in the air, water, earth, rocks, and even where conditions are very harsh, such as the deep ocean and polar environments. They play a crucial role in maintaining all ecological systems and interact closely with one another and with other life forms. They are present in and around other living systems, such as plants, animals and humans.

Why are microorganisms important?

Microorganisms represent the great unseen and underappreciated majority of life on our planet. They are everywhere in the environment and in larger, more complex organisms. They are important for a huge variety of natural processes, including human health, agriculture and food production. For almost any kind of organic molecule, there will be a microorganism that has evolved the capacity to decompose, change, or construct it.

Why is understanding microorganism function important?

Without properly functioning microorganisms, the health of our planet would quickly deteriorate, and higher organisms, including humans, would cease to exist. Despite their importance for our planet’s health, we know little about the diversity and function of microorganisms in the environment. Microorganisms also harbor new and unexpected functions that can be harnessed for biotechnological processes, such as food or drug production.