Human Proteome Folding continues to yield important data
for disease research
By Richard Bonneau, Assistant Professor, Biology and Computer Science Departments,
New York University, Center for Comparative Functional Genomics
Work on the first phase of the Human Proteome Folding project is drawing to a close. Nearly all the calculations on World Community Grid are complete, and a team of researchers at the Institute for Systems Biology (ISB) is now working around the clock to process the results and get them out to other researchers and to the volunteer community that contributed so greatly to this project. The round-the-clock pace is partly because the project involves researchers around the globe and partly because of our drive to publish.
Our first publication of results will center on a model organism: yeast. Yeast has been widely studied for over a century, and because of this and the ease of working with it, yeast has become a workhorse in biology. Much of what we know about molecular biology and the function of proteins was first discovered in yeast. For this reason, yeast is called a "model organism."
After we have debugged the process using yeast, we will quickly release the results for the remaining 100 organisms folded during the project. These databases will contain sequence-based analysis of the proteins, predictions about how the proteins are organized into domains, and structure and function predictions for the protein domains folded as part of this project.
While all this is going on, IBM is helping us to further model an important subset of the proteins as part of a second phase of the project that also will run on World Community Grid.
The proposed second phase of the Human Proteome Folding project will launch on World Community Grid in April 2006. The two main objectives are to 1) obtain higher-resolution structures for specific human proteins and pathogen proteins and 2) explore the limits of protein structure prediction by further developing the Rosetta structure prediction software. Thus, the project will address two very important parallel imperatives, one biological and one biophysical.
Using Rosetta in a mode that accounts for greater atomic detail, the project will refine the structures resulting from the first phase. During the first phase, we aimed to understand protein function. During the second phase, our goal is to increase the resolution of the predicted structures for a select subset of human proteins. Better resolution is important for a number of applications, including virtual screening of drug targets with docking procedures and protein design. The second phase of the project also will improve our understanding of the physics of protein structure and advance the state of the art in protein structure prediction, helping us to further develop our program, Rosetta.
The project will focus on human secreted proteins (proteins in the blood and the spaces between cells). These proteins can be important for signaling between cells and are often key markers for diagnosis. Some have even proved useful as drugs when synthesized and given by doctors to people lacking the proteins. The simplest example of a human secreted protein turned into a therapeutic is insulin; another example is human growth hormone. A better understanding of human secreted proteins may help researchers discover what the proteins of unknown function in the blood actually do.
The project also will focus on key secreted pathogenic proteins. We are still in the early design phases of this part of the project, but we will likely focus on Plasmodium, the pathogenic agent that causes malaria. We hope that higher-resolution structure predictions for the proteins that the malaria parasite secretes will serve as bioinformatics infrastructure for researchers around the world who are working hard to understand the complex interaction between human hosts and malaria parasites. While there are few silver bullets, and biology is one of the most complicated subjects on earth, we believe that this work will help us understand elements of this host-pathogen interaction, or at least its components. This understanding could then be a foundation for intervention.
Lastly, this project dovetails with efforts at the ISB to support predictive, preventative and personalized medicine (under the assumption that these secreted proteins will be key elements of this medicine of the future). Biomarkers are substances sometimes found in increased amounts in the blood, other body fluids, or tissues, and they can be used to indicate the presence of some types of cancer. It is too early to say which proteins will end up being biomarkers. However, it is clear that many will be secreted proteins. As in the first phase of the project, the power of World Community Grid will be critical in getting results quickly to researchers in the biological and biomedical communities.