Analyzing a wealth of data about the natural world

The Uncovering Genome Mysteries project has already amassed data on more than 200 million proteins, with the goal of understanding the features common to life everywhere on Earth. Tens of millions of calculations remain to be run, but the team is also preparing for the analysis and eventual publication of the data.


For almost a year now, Uncovering Genome Mysteries has been comparing protein sequences derived from the genomes of nearly all living organisms analyzed to date. Thanks to the volunteers who contribute computer time to World Community Grid, more than 34 million results have been returned with data on functional identification and protein similarities. Along with our collaborators in Australia, we have paid particular attention to microorganisms from different ecosystems, with special emphasis on marine organisms. More than 200 million proteins have been compared thus far, using the equivalent of 15,000 years of computation. The resulting data are sent to our servers at the Oswaldo Cruz Foundation (Fiocruz) in Rio de Janeiro, Brazil, and now also to the University of New South Wales in Sydney, Australia. A final set of around 20 million protein sequences, determined over the last year, is now being added to the dataset and will be run on World Community Grid in the coming months.

However, the task of functional mapping and comparison between proteins from all these organisms does not end there. In the meantime, our team of scientists is investing more effort in optimizing the algorithms for further analysis and representation of the data generated by World Community Grid volunteers, and is preparing the database systems that will make the results available to the scientific community. Once our data are public, we expect that the scientific community will gain a completely new perspective on the intricate network of life, and that the results will also contribute to the development of many new applications in health, agriculture and the life sciences in general.

This project is a collaboration among World Community Grid; the laboratory of Dr. Torsten Thomas and his team from the School of Biotechnology and Biomolecular Sciences and the Centre for Marine Bio-Innovation at the University of New South Wales, Sydney, Australia; and our team at the Laboratory for Functional Genomics and Bioinformatics at the Oswaldo Cruz Foundation (Fiocruz) in Brazil.
