The research team expands to advance their analysis of the millions of protein-crystallization images processed by World Community Grid volunteers. This will help scientists understand how protein structure can lead to better cancer drug design.
Since you completed your calculations for Help Conquer Cancer (HCC) in 2013, we have begun analyzing the results you generated. Here, we provide an update on that analysis work and on the next steps toward publishing our findings and making the data publicly available.
Biologists and medical researchers use the three-dimensional (3D) structure of proteins to design drugs and understand protein function. Solving a protein's 3D structure requires a long and difficult sequence of steps. The protein first needs to be made into a pure crystal (much as you might crystallize sugar by slowly evaporating sugar water). X-rays are then shone through the crystal, and the neat array of protein molecules in the crystal creates a pattern on the film that can be analyzed mathematically to ascertain the structure of each protein molecule. Unlike sugar, protein is notoriously difficult to crystallize. HCC addressed this bottleneck in the pipeline with a method for recognizing successfully formed protein crystals in images taken from a very large number of automated experimental attempts. For HCC, World Community Grid volunteers analyzed hundreds of millions of these images, but the results need to be processed further in order to generate reliable automatic image classifiers, discover trends in the data, and ultimately improve our understanding of how proteins form crystals. Our analysis work is in progress, and we expect to report some exciting results in our next update.
Additionally, over the last year we have devoted considerable energy and resources to our new World Community Grid project, Mapping Cancer Markers (MCM), and to other cancer-gene-signature projects that our research group is involved in. To support both priorities, our team has expanded: a new Post-Doctoral Fellow, Dr. Lisa Yan, is helping us advance our HCC research.
Publishing our results and findings
We have not yet decided the time-frame or the exact form in which we will make the HCC data you generated available to the public. Thanks to World Community Grid volunteers, our project's terabytes of raw image data have been transformed into terabytes of computed image features (morphological image properties used in automated image classification). The identity of the proteins in the crystallization trials is largely unknown to us and partially unknown even to the Hauptman Woodward Institute (HWI), the source of the images. The features we have computed do not directly relate to crystallization outcomes or human-understandable image labels. A classifier is required to translate computed features into meaningful human labels or experimental outcomes. We have trained multiple image classifiers so far, but are confident that we can improve them. It is essential (and practical) that we finish this part of the research and publish our findings before releasing the useful data.
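As a purely illustrative sketch (not the classifiers used in HCC), the idea of translating computed features into human-understandable labels can be shown with a toy nearest-centroid classifier. The feature values, the labels "crystal" and "clear", and the function names below are all hypothetical and chosen only for the example.

```python
from math import dist

# Hypothetical training data: each image is reduced to a feature vector
# (made-up numbers) and paired with a label assigned by a human expert.
TRAINING = [
    ((0.90, 0.10), "crystal"),
    ((0.80, 0.20), "crystal"),
    ((0.10, 0.90), "clear"),
    ((0.20, 0.80), "clear"),
]


def train_centroids(examples):
    """Average the feature vectors of each label into one centroid per label."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in acc)
        for label, acc in sums.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = train_centroids(TRAINING)
print(classify(centroids, (0.85, 0.15)))
```

The raw feature vector on its own says nothing about the experimental outcome; only after a classifier has been trained on labeled examples can a new vector be mapped to a label. Real image classifiers work with far more features and more sophisticated models than this sketch.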
The Grid-computed results of Help Conquer Cancer have yet to be fully analyzed. Once complete, we intend to publish one or more papers based on the analysis, but cannot currently estimate a time-frame.
The High-Throughput Screening Lab at HWI supplied the original protein-crystallization image data, and indeed continues to generate more. Both HWI and the scientists who send them protein samples will benefit from the HCC research in two ways: better systems for automatically classifying protein-crystallization images (saving time and manual labour), and better understanding of the protein crystallization process.