About the Project


Human Proteome Folding Project: A Layperson's Explanation

Proteins are essential to living beings. Just about everything in the human body involves or is made out of proteins.

What are proteins?

Proteins are large molecules that are made of long chains of smaller molecules called amino acids. While there are only 20 different kinds of amino acids that make up all proteins, sometimes hundreds of them make up a single protein.

Adding to the complexity, proteins typically do not stay as long chains. As soon as the chain of amino acids is built, it folds and tangles up into a more compact and particular shape that lets it carry out specific, necessary functions within the human body.

Proteins fold because the different amino acids like to stick to each other following certain rules. Imagine that amino acids are pop-beads of 20 different colors. The pop-beads are sticky, but sticky in such a way that only certain combinations of colors can stick together. This makes the amino acid chains fold in a particular way that creates proteins that are useful to the human body. Human cells have mechanisms to help the proteins fold properly and, equally important, mechanisms to get rid of improperly folded proteins.

How do proteins relate to human genes?

The collection of all of the human genes is known as "the human genome." Depending on how the genes are counted, there are over 30,000 genes in the human genome. Each gene, which is a section of a long chain known as DNA, dictates how to build the chain of amino acids for one protein. In recent years, scientists were able to map the sequence for each human gene. This means that we now know the sequence of amino acids in all of the human proteins. Thus, the human genome is directly related to the "human proteome," the collection of all human proteins.

The protein mystery

While researchers have learned a great deal about the human proteome, the functions of most of the proteins remain a mystery. The genes do not reveal exactly how the proteins will fold into their final shape. This is critical because the shape determines what a protein can do and which other proteins it can connect to or interact with.

Proteins are like puzzle pieces. For example, muscle proteins connect to each other to form a muscle fiber. They join together in a specific manner because of their shape, as well as other factors relating to the shape.

Everything that goes on in cells and in the body is controlled very specifically by the shapes of proteins, which determine whether one protein can interlock with another. For example, the proteins of a virus or bacterium may have particular shapes that enable it to break through the cell membrane and infect the cell.

The Human Proteome Folding Project

Knowing the shapes of proteins will help researchers understand how proteins perform their desired functions and also how diseases prevent proteins from doing their necessary functions to maintain healthy cells.

The Human Proteome Folding Project will combine the power of millions of computers in a grid to help scientists understand how human proteins fold. The work to be done in this monumental task is shared across this grid, so that results can be achieved far sooner than would be possible with conventional supercomputers. With a greater understanding of protein structure, scientists can learn how diseases work and ultimately find cures for them.

When your grid agent is running, it folds an amino acid chain in various ways and evaluates how well each folding obeys the rules that govern which amino acids stick together and which do not. By trying millions of ways to fold each chain, the computers attempt to find the shape that the protein actually takes in the human body. The best shapes identified for each protein are returned to the scientists for further study.
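For readers who like to see the idea in code, here is a minimal sketch of "try many foldings, score each one, keep the best." It is not the project's actual software. It assumes a much simpler textbook toy: the chain has only two kinds of beads ("H" for sticky, "P" for not sticky), it is laid out on a flat grid, and a folding earns one point (scored as -1) for every pair of sticky beads that end up side by side without being neighbours in the chain. The names random_fold, fold_score, and best_folds, and the example chain, are made up for this illustration.

```python
import random

# The four directions a chain can take on a flat grid.
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def random_fold(length, rng):
    """Lay the chain on the grid as a random path that never crosses itself."""
    while True:
        path = [(0, 0)]
        occupied = {(0, 0)}
        while len(path) < length:
            x, y = path[-1]
            free = [(x + dx, y + dy) for dx, dy in MOVES
                    if (x + dx, y + dy) not in occupied]
            if not free:
                break  # dead end: discard this path and start over
            nxt = rng.choice(free)
            path.append(nxt)
            occupied.add(nxt)
        if len(path) == length:
            return path


def fold_score(chain, path):
    """Toy score: -1 for each touching pair of 'H' beads that are not chain neighbours."""
    position = {p: i for i, p in enumerate(path)}
    score = 0
    for i, (x, y) in enumerate(path):
        if chain[i] != "H":
            continue
        # Look only right and up so that each touching pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = position.get((x + dx, y + dy))
            if j is not None and chain[j] == "H" and abs(i - j) > 1:
                score -= 1
    return score


def best_folds(chain, trials, keep=3, seed=0):
    """Try many random foldings and return the lowest-scoring ones."""
    rng = random.Random(seed)
    scored = []
    for _ in range(trials):
        path = random_fold(len(chain), rng)
        scored.append((fold_score(chain, path), path))
    scored.sort(key=lambda item: item[0])  # lower (more negative) is better
    return scored[:keep]


chain = "HPHPPHHPHPPHPHHPPHPH"  # hypothetical 20-bead example chain
for score, path in best_folds(chain, trials=5000):
    print(score, path[:4], "...")
```

Real proteins fold in three dimensions with 20 kinds of amino acids and far more detailed scoring rules, which is why the real task needs the combined power of so many computers.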

Understanding your agent application window

Click the "i" in the lower right-hand corner of your agent application window.

The name of the computer program is Rosetta. As the program tries different foldings, it computes a "Rosetta score" that indicates how well folded the protein is. To compute this score, the program considers the packing of amino acids within the protein according to many scoring rules. The lower (more negative) the score, the better the folding.

The Rosetta score for the best folding that your computer has identified for a particular protein is shown as the "Min" value under the current Rosetta Score in the left half of your agent's application window. You can see snapshot pictures of the partially folded protein that your computer is working on in the right half of the window.

The left side also shows two other numbers that indicate how well folded the protein is so far. The Environment score shows how well the central core of the protein has been packed together. The Pair score tells how well certain amino acids have been paired up with the correct counterparts.

If a trial fold gets a worse score, the Rosetta program tries to refold the protein a different way to see if that produces a better score. This is done millions of times for each protein. Scientists will look at the best-scoring protein structures and use those in the next steps of their research.
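The trial-and-error loop behind the "Min" value can be sketched in a few lines. This is only an illustration under stated assumptions, not Rosetta's code: the real program's search and scoring rules are far more elaborate. Here a "fold" is simply a list of turn angles, and toy_score and toy_propose are hypothetical stand-ins invented for this example. The refine function keeps a trial fold only when it scores lower than the best so far, and records the running minimum after every trial.

```python
import random


def refine(score_fn, propose_fn, start, steps, seed=0):
    """Greedy trial and error: keep a trial fold only if it scores lower."""
    rng = random.Random(seed)
    best = start
    min_score = score_fn(start)
    history = [min_score]  # the running "Min" value after each trial
    for _ in range(steps):
        trial = propose_fn(best, rng)
        trial_score = score_fn(trial)
        if trial_score < min_score:
            best, min_score = trial, trial_score
        # A worse trial is simply dropped, and the next trial refolds differently.
        history.append(min_score)
    return best, min_score, history


# Hypothetical toy stand-ins: a "fold" is a list of turn angles in degrees,
# and the score is lowest when every angle is close to 60.
def toy_score(angles):
    return sum((a - 60.0) ** 2 for a in angles) / len(angles) - 100.0


def toy_propose(angles, rng):
    """Copy the fold and nudge one randomly chosen angle."""
    changed = list(angles)
    i = rng.randrange(len(changed))
    changed[i] += rng.uniform(-30.0, 30.0)
    return changed


start = [0.0] * 8
best, min_score, history = refine(toy_score, toy_propose, start, steps=2000)
print(f"start score {history[0]:.1f}, Min after search {min_score:.1f}")
```

Because a worse trial is never kept, the recorded minimum can only stay the same or go down, which matches how the "Min" value behaves in the window.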