The Human Proteome Folding Project Publishes Paper in Genome Research


Researchers on the Human Proteome Folding project have published a paper in the journal Genome Research. The paper announces the availability of their database of predicted protein structures, describes the methods used to validate those predictions, and explains how the predictions augment other information about these proteins, helping to address a critical problem for biologists.



Paper Title:

"The proteome folding project: Proteome-scale prediction of structure and function"

Lay Person Abstract:

Lack of information about the structure of proteins is a critical problem for biologists. It severely limits their ability to conduct the research and experiments needed to understand the roles of proteins in disease processes. The researchers on the Human Proteome Folding project have published a paper in Genome Research entitled "The proteome folding project: Proteome-scale prediction of structure and function." The paper describes how they used computation results from World Community Grid to predict protein structure and protein function. A protein's structure determines its function in life processes, so knowing these structures helps scientists who study biological and medical processes and can, for example, hasten the discovery of treatments for diseases. The researchers processed the human genome along with 93 other genomes of importance to humans. The paper also describes the methods used to validate the accuracy of the predictions, which are now publicly available in a database for all scientists to use.

Technical Abstract:

The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition, and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and three-dimensional (3D) structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, Escherichia coli, and worm). De novo structure predictions were distributed on a grid of more than 1.5 million CPUs worldwide (World Community Grid). We generated significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions.

Access to Paper:

To view the paper, please click here.