Microbiome Immunity Project



What are the goals of the Microbiome Immunity Project?

The Microbiome Immunity Project aims to help scientists understand how trillions of bacteria in our bodies impact diseases such as Type 1 diabetes and Crohn's disease.

The primary goal of the project is therefore to generate a set of predicted protein structures for the entire human microbiome, which contains around 3 million unique genes. This will help scientists determine the role played by these bacteria. Another goal is to share the results of the project with scientists around the world to further facilitate research on diseases in which the microbiome is implicated.



How are the goals of the project being met?

The Microbiome Immunity Project will meet its goals by using a computational research technique called protein structure prediction. This is a process through which computers simulate how a protein's one-dimensional sequence folds into its final three-dimensional structure. (For more information about computational protein folding, see the Human Proteome Folding project.)

Knowing the structures of proteins of the microbiome can help researchers predict the functions of these proteins. An understanding of the role of these proteins will then help scientists develop drugs to control them or inhibit harmful interactions and therefore help treat diseases that originate in or are influenced by the human microbiome.



Who are the scientists involved in this study?

The Microbiome Immunity Project brings together researchers at the Broad Institute of MIT and Harvard, Massachusetts General Hospital, University of California San Diego and the Simons Foundation’s Flatiron Institute.

The Broad Institute brings expertise on the role of the human microbiome in health and disease to the project. By coupling microbiome analysis with the clinical knowledge at Massachusetts General Hospital, its researchers analyze data generated from individuals with autoimmune diseases such as inflammatory bowel disease and Type 1 diabetes, and prioritize the bacterial genes that are relevant to those diseases.

The Knight Lab at the University of California San Diego brings knowledge of and expertise in microbial genomes. They prepare input data for World Community Grid based on information from the Broad Institute. After obtaining results from World Community Grid, and with the help of the Flatiron Institute, they annotate protein functions inferred from the predicted structures. The Knight Lab will also coordinate efforts to predict protein-protein interactions and design small molecules, and will build a resource to collect all of the Microbiome Immunity Project predictions and share them with researchers from around the world.

The Flatiron Institute provides expertise in predicting protein structure and function. They will work with the Knight Lab to further develop the prediction codes so that they can be applied to large microbial protein families, about which little is currently known.



How might the data generated on this project be useful to researchers?

The project's structure and function predictions for each of the proteins encoded by each unique gene in the human gut microbiome will be made available as an online resource for researchers interested in furthering the impact of this project.

Knowledge about gene function is critical for understanding not only which bacteria live in a specific environment, but what they actually do. By compiling their function predictions, the research team will greatly enhance the repertoire of annotated genes, and therefore help other researchers to better understand what specific microbes or communities of microbes are doing.

Additionally, the researchers aim to design small molecules (i.e. drug candidates) which inhibit harmful interactions between microbial and human proteins in Type 1 diabetes and inflammatory bowel disease. Designing an effective drug is an extremely complicated and laborious process, so through scientific publications the research team will encourage other researchers to investigate those molecules further and open up avenues for entirely new therapeutics.



Has this kind of research been attempted before? If so, how is this different?

Only now, for the first time, can scientists combine progress in next generation sequencing and their understanding of microbes with massive computational power and newer algorithms to accurately predict the structures and functions of hundreds of thousands (or more) of proteins.

For decades, scientists have studied both proteins and their structures, as well as microbes and how they impact human health. However, those studies were greatly limited in scale (e.g. by studying one microbe at a time) and scope. Similarly, structures of individual proteins have been experimentally determined since 1958, and computational investigations of protein structures began in the early 1970s.

A turning point came in the early 2000s with the introduction of next generation sequencing, together with progress in computing power and the development of new algorithms. Thanks to next generation sequencing, obtaining DNA sequences encoding genes became much cheaper and quicker.

Around the same time, new tools such as Rosetta (which is being used for this project) were being developed to computationally predict protein structure. Rosetta was, in fact, used for the Human Proteome Folding project on World Community Grid.

Since then, these tools have been refined and enhanced. Combined with the massive computational resources of World Community Grid, a project of this scale has only now become possible.



How is World Community Grid helping with this effort?

While effective, protein folding simulations are resource-intensive and often require more computational power than scientists typically have access to. The Microbiome Immunity Project research team is therefore enlisting the help of World Community Grid volunteers, each of whom runs these simulations on their computers. Each of these simulations is a virtual experiment to predict the structure of a protein.

The massive amount of aggregated computing power World Community Grid brings to this project will greatly advance and accelerate this new area of health research.



How can I help?

Anyone with a computer can help scientists understand how the human microbiome impacts disease, simply by joining World Community Grid.

It's easy: you create a World Community Grid account, choose to support the Microbiome Immunity Project and then install our free and safe software on your computer. Then, whenever your computer has any unused computing power, it runs a simulation on behalf of the Microbiome Immunity Project team. The more people who participate, the quicker the researchers can get their work done!



What is the human microbiome?

The human microbiota is the collection of all of the microorganisms that live in and on your body, alongside and amongst your human cells. This includes bacteria, fungi, viruses and other microbes. The human microbiome is the collection of genes from the microbiota, although the term is usually used to refer to the human microbiota as well.

The number of non-human microorganisms in your body is estimated to be approximately 30 trillion, which is believed to be similar to the number of human cells in the body (though these estimates vary by an order of magnitude). However, the cells of these microorganisms have about 3 million genes, compared to only about 20,000 genes in human cells. By either measure, the human microbiome plays an important role in the body. Many of the microorganisms are beneficial, but some, or the lack of some, can be associated with diseases.



What is the relationship between the human microbiome and disease?

The proteins produced by the human microbiome can interfere with normal body processes, for example by interacting with or mimicking other proteins in the body.

Most of these proteins have not been explored in detail. However, there appears to be a link between the populations of microorganisms in the human gut and diseases like Type 1 diabetes, Crohn's disease, and ulcerative colitis. Understanding more about the human microbiome should help uncover the link to these and other diseases. Once scientists discover which proteins play key roles in these diseases, they can then work on controlling those proteins to develop new treatments.



What are proteins?

Proteins are the fundamental building blocks for many components within living organisms. More specifically, each protein is a long chain of subunits called amino acids, of which there are 20 kinds. Each gene specifies the order in which the amino acids are assembled to make a particular kind of protein. When this chain of amino acids is constructed by the organism, it tangles or folds into a very particular shape. The shape (structure) of the protein determines its function.
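
The way a gene spells out a chain of amino acids can be illustrated with a short Python sketch. It reads a DNA sequence three letters (one codon) at a time and looks up the amino acid each codon stands for in the standard genetic code. The function name translate and the example sequence are made up for illustration; this is not code used by the project.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# Each letter is a one-letter amino acid code; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Hypothetical helper for illustration: reads codons from the first
    letter onward and stops at the first stop codon.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # Made-up nine-letter "gene": ATG (M), GCC (A), TAA (stop).
    print(translate("ATGGCCTAA"))  # MA
```

Real genes are far longer, and the resulting chain then has to fold into its three-dimensional shape, which is the hard part that this project's simulations address.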

Proteins can be small or large, sometimes containing thousands of atoms. When you eat food containing proteins, your digestive system breaks them down into their constituent amino acids so they are available for your cells to make new proteins according to your genetic codes.

Learn more about proteins at https://www.worldcommunitygrid.org/research/proteome/details.do



What is protein folding?

Proteins are chemical compounds consisting of a chain of smaller compounds called amino acids. There are 20 different kinds of amino acids. A gene specifies the order in which the kinds of amino acids are to be linked in the chain to form the protein. As the protein chain emerges from the assembly machinery, it starts to fold or tangle up into a very specific shape. Some amino acids have electrical charge patterns on their surface which cause some of them to attract each other and others to repel each other. These patterns of charges could be thought of as weak little bar magnets. This makes various parts of the chain of amino acids prefer to stick to certain others, and thus form a very specific structure. In the cell, there are other proteins which sometimes help guide the folding process so that the proteins do not fold into incorrect shapes.

Given that some proteins can be so large as to contain thousands of amino acids, it can be very difficult to figure out the final shape of the folded chain of amino acids, and from there to determine the shape or structure of the protein. Knowing the shape is important because this determines the function or role of the protein. One way of finding the shape is to get many of the same protein molecules and try to make them crystallize into a regular shape (as salt might turn into cubic crystals from a brine solution). Then, x-rays are shone through the crystal. As the x-rays pass the atoms in the proteins, they deflect (diffract) in a very particular pattern which can be deciphered using mathematical techniques to finally know the positions of the atoms and the structure of the protein. However, it can be difficult, if not impossible, to get the proteins to crystallize.

Scientists have turned to computers as an alternative way of discovering the structures of proteins. They use software programs such as Rosetta (originally developed by David Baker's lab at the University of Washington and now developed with collaborators from other academic institutions including New York University) to simulate the protein folding process. The software attempts to fold the amino acid sequence many different ways, trying to find the lowest energy configuration, which should represent the actual structure of the folded protein. For very large proteins, they use methods which work on portions of the protein at a time and then assemble the portions.
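
The idea of trying many folds and keeping the lowest energy one can be shown with a toy Python sketch. This is not Rosetta and not the project's method. It assumes a much simpler textbook-style model in which the chain lies on a flat grid, each amino acid is either "H" (water-avoiding) or "P" (water-loving), and the energy drops by one for every pair of H residues that end up side by side without being direct neighbours in the chain. All names here (place_chain, energy, random_search) and the example sequence are hypothetical.

```python
import random

# Unit steps on a square grid: up, down, left, right.
STEPS = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def place_chain(moves):
    """Lay the chain on the grid, one step per move.

    Returns the list of grid positions, or None if the chain would
    cross over itself (two residues cannot share a position).
    """
    coords = [(0, 0)]
    for move in moves:
        dx, dy = STEPS[move]
        x, y = coords[-1]
        nxt = (x + dx, y + dy)
        if nxt in coords:
            return None
        coords.append(nxt)
    return coords


def energy(sequence, coords):
    """Toy energy: -1 for each H-H pair touching on the grid
    that is not already adjacent along the chain."""
    index_at = {pos: i for i, pos in enumerate(coords)}
    total = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so each touching pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                total -= 1
    return total


def random_search(sequence, trials=2000, seed=0):
    """Try many random folds and keep the lowest-energy valid one.

    Returns (energy, moves), or None if no valid fold was found.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        moves = "".join(rng.choice("UDLR") for _ in range(len(sequence) - 1))
        coords = place_chain(moves)
        if coords is None:
            continue  # chain overlapped itself; discard this attempt
        e = energy(sequence, coords)
        if best is None or e < best[0]:
            best = (e, moves)
    return best


if __name__ == "__main__":
    result = random_search("HPHPPHHPHH")
    print("lowest energy found and its fold:", result)
```

Even in this toy model the number of possible folds grows very quickly with chain length, which gives a feel for why realistic simulations of proteins with hundreds of amino acids need so much computing power.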



How does understanding protein structure help scientists understand the role of bacteria in human health?

Different proteins can have many different structures (shapes). They can have sticky portions which like to attach to certain chemical compounds, or repel them. They can sometimes be flexible. For example, enzymes are proteins which can enhance certain chemical reactions or even cut proteins or other compounds apart. Their structures and the patterns of electrical charges on their surface determine which other compounds they may interact with and how they may alter the other compounds.

Since cells use proteins for most of their basic life processes, the functions of those proteins, which are determined by their structures, are very important. If these functions are disturbed, diseases can result. Since the human microbiome has over 100 times more different kinds of proteins than the human body, many of those have the potential to affect the operation of human cells. Some of these proteins can be beneficial or even necessary, while others may be harmful. This is an area which still needs a lot of exploration. One of the early steps in understanding the role of the microbiome is to discover the functions of its proteins. That requires discovering the structure of those proteins.



Where can I learn more?

The Invisible Universe of the Human Microbiome (National Public Radio video): https://www.youtube.com/watch?v=5DTrENdWvvM

Rob Knight: How our microbes make us who we are (TED Talk video): https://www.youtube.com/watch?v=i-icXZ2tMRM

How Many Cells Are in the Human Body—And How Many Microbes? (National Geographic): http://news.nationalgeographic.com/2016/01/160111-microbiome-estimate-count-ratio-human-health-science/

Wikipedia: https://en.wikipedia.org/wiki/Microbiota