Paper published in the journal Genome Biology and Evolution using Human Proteome Folding project results


Researchers have published a paper in the journal Genome Biology and Evolution documenting their study of several plant genomes and the proteomes, evolution, and protein structure of those species.



Paper Title:

"The Plant Proteome Folding Project: Structure and Positive Selection in Plant Protein Families"

Lay Person Abstract:

Melissa Pentony et al. examine the parts of proteins that are evolving faster than average in the proteomes of five major plant species, including rice (Oryza sativa) and Arabidopsis thaliana (an important model organism for plant study). They describe new information on the relationship between evolution and protein structure in plants.

World Community Grid has contributed to this study by providing a much more structurally complete view of unknown and understudied proteins from five plant species than was previously available. The Human Proteome Folding project produced 29,202 protein structures for this study, of which 4,764 were very high-confidence. This should eventually help agricultural scientists better understand important plants and food crops and breed them for disease resistance, better nutrition, and greater tolerance of environmental stress.

Technical Abstract:

Despite its importance, relatively little is known about the relationship between the structure, function, and evolution of proteins, particularly in land plant species. We have developed a database with predicted protein domains for five plant proteomes (http://pfp.bio.nyu.edu/) and used both protein structural fold recognition and de novo Rosetta-based protein structure prediction to predict protein structure for Arabidopsis and rice proteins. Based on sequence similarity, we have identified ~15,000 orthologous/paralogous protein family clusters among these species and used codon-based models to predict positive selection in protein evolution within 175 of these sequence clusters. Our results show that codons that display positive selection appear to be less frequent in helical and strand regions and are overrepresented in amino acid residues that are associated with a change in protein secondary structure. Like in other organisms, disordered protein regions also appear to have more selected sites. Structural information provides new functional insights into specific plant proteins and allows us to map positively selected amino acid sites onto protein structures and view these sites in a structural and functional context.

Access to Paper:

To view the paper, please click here.