Nutritious Rice for the World project publishes paper in BioMed Central Research Notes

The Nutritious Rice for the World researchers developed a way to speed up postprocessing of the results computed by World Community Grid. They published a paper describing how this was done using graphics processing units (GPUs), which are normally used in computers for games and entertainment.

Paper Title:

"GPU-Q-J, a fast method for calculating root mean square deviation (RMSD) after optimal superposition"

Lay Person Abstract:

Postprocessing the information computed on World Community Grid about the structures of 62,000 rice proteins requires a computationally intensive process called clustering. For each protein, 100,000 structure predictions were computed. The clustering algorithm finds the most likely structure for the protein based on which, and how many, of the 100,000 predicted structures most closely match each other. To dramatically speed up this postprocessing step, the researchers developed a way to take advantage of graphics processing units (GPUs), which many new computers use to speed up the rendering of images for games and other entertainment. The paper describes this approach. The scientists were able to increase the speed of postprocessing by a factor of more than 260, which will help them accelerate the next steps of their research.
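As a rough illustration of the idea of picking the prediction that the most other predictions agree with, here is a minimal Python sketch. It is not the project's clustering code: the function name `most_supported_structure`, the distance cutoff, and the toy distance values are all assumptions made for this example.

```python
def most_supported_structure(distances, cutoff):
    """Return (index, neighbor_count) for the structure that has the most
    other structures within `cutoff` of it. Ties go to the lowest index.

    `distances` is a square, symmetric list of lists of pairwise distances.
    """
    best_index, best_count = -1, -1
    for i, row in enumerate(distances):
        # Count the other structures that lie within the cutoff of structure i.
        count = sum(1 for j, d in enumerate(row) if j != i and d <= cutoff)
        if count > best_count:
            best_index, best_count = i, count
    return best_index, best_count


# Made-up pairwise distances between five predicted structures.
toy = [
    [0.0, 1.0, 2.5, 6.0, 7.0],
    [1.0, 0.0, 0.8, 5.5, 6.5],
    [2.5, 0.8, 0.0, 5.0, 6.8],
    [6.0, 5.5, 5.0, 0.0, 1.5],
    [7.0, 6.5, 6.8, 1.5, 0.0],
]

# The cutoff of 2.0 is an arbitrary value chosen for this toy data.
center, support = most_supported_structure(toy, cutoff=2.0)
print(center, support)
```

With real data, the distance matrix would hold one entry per pair of predictions, which is why computing it is the expensive step.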

Technical Abstract:

Calculation of the root mean square deviation (RMSD) between the atomic coordinates of two optimally superposed structures is a basic component of structural comparison techniques. We describe a quaternion-based method, GPU-Q-J, that is stable with single-precision calculations and suitable for graphics processor units (GPUs). The application was implemented on an ATI 4770 graphics card in C/C++ and Brook+ in Linux. The Nutritious Rice for the World Project (NRW) on World Community Grid predicted, de novo, the structures of over 62,000 small proteins and protein domains, returning a total of 10 billion candidate structures. Clustering ensembles of structures on this scale requires calculation of large similarity matrices consisting of RMSDs between each pair of structures in the set. As a real-world test, we calculated the matrices for 6 different ensembles from NRW. The GPU method was 260 times faster than the fastest existing CPU-based method and over 500 times faster than the method that had been previously used. GPU-Q-J is a significant advance over previous CPU methods. It relieves a major bottleneck in the clustering of large numbers of structures for NRW. It also has applications in structure comparison methods that involve multiple superposition and RMSD determination steps, particularly when such methods are applied on a proteome-wide and genome-wide scale.

Access to Paper:

To review the paper, please click here.