David A. Bader
IEEE Fellow
AAAS Fellow
Professor
College of Computing
Georgia Tech
Atlanta, GA 30332


 
 

 

GTfold: A Scalable Multicore Code for RNA Secondary Structure Prediction

Predicting the correct secondary structures of large RNAs is one of the unsolved challenges of computational molecular biology. A major obstacle is that accurate calculations scale as O(n^4), so the computational requirements become prohibitive as sequence length increases. Existing folding programs use heuristics and approximations to work around these limitations. We present a new parallel, scalable multicore program called GTfold, which is one to two orders of magnitude faster than the de facto standard programs and achieves comparable prediction accuracy. The development of GTfold opens a new path for algorithmic improvements and for applying an improved thermodynamic model to increase prediction accuracy. In this paper we analyze the algorithm's concurrency and describe the parallelism for a shared-memory environment such as a symmetric multiprocessor or multicore chip. In a remarkable demonstration, GTfold now optimally folds 11 picornaviral RNA sequences ranging from 7100 to 8200 nucleotides in 8 minutes, compared with the two months it took in a previous study. We are seeing a paradigm shift to multicore chips, and parallelism must be explicitly addressed to continue gaining performance with each new generation of systems. We also show that exact algorithms, such as the internal loop speedup, can be implemented with our method in an affordable amount of time. GTfold is freely available as open source from our website.
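The kind of concurrency the abstract refers to can be illustrated with a much simpler problem than the one GTfold solves. The sketch below is not GTfold's algorithm or code: it assumes a Nussinov-style recurrence that maximizes the number of base pairs instead of minimizing free energy, and the names `PAIRS`, `MIN_LOOP`, `cell`, and `max_pairs` are hypothetical. What it shows is the dependency structure common to such dynamic programs: a table entry for the subsequence from i to j reads only entries for shorter subsequences, so all entries on one anti-diagonal (equal j - i) can be computed concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

# Simplified pairing rule (assumption): Watson-Crick pairs plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases inside a hairpin


def cell(seq, table, i, j):
    """Maximum pair count for seq[i..j]; reads only entries for shorter spans."""
    best = table[i + 1][j]  # case 1: base i stays unpaired
    for k in range(i + MIN_LOOP + 1, j + 1):  # case 2: base i pairs with base k
        if (seq[i], seq[k]) in PAIRS:
            right = table[k + 1][j] if k + 1 <= j else 0
            best = max(best, 1 + table[i + 1][k - 1] + right)
    return best


def max_pairs(seq, workers=4):
    """Fill the table one anti-diagonal at a time, in parallel within each."""
    n = len(seq)
    table = [[0] * n for _ in range(n)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for d in range(MIN_LOOP + 1, n):  # d = j - i, processed in increasing order
            starts = range(n - d)
            # Cells on this diagonal do not depend on one another.
            values = list(pool.map(lambda i: cell(seq, table, i, i + d), starts))
            for i, value in zip(starts, values):
                table[i][i + d] = value
    return table[0][n - 1] if n else 0


if __name__ == "__main__":
    print(max_pairs("GGGAAAUCC"))
```

Threads are used here only to make the independence of the cells explicit; in CPython the global interpreter lock prevents a real speedup for this pure-Python loop, so a compiled shared-memory implementation would be needed to benefit from multiple cores.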

Publication History

Versions of this paper appeared as:
  1. A. Mathuriya, D.A. Bader, C.E. Heitsch, and S.C. Harvey, "GTfold: A Scalable Multicore Code for RNA Secondary Structure Prediction," Technical Report, Georgia Institute of Technology, August 23, 2008.
  2. A. Mathuriya, D.A. Bader, C.E. Heitsch, and S.C. Harvey, "GTfold: A Scalable Multicore Code for RNA Secondary Structure Prediction," 24th Annual ACM Symposium on Applied Computing (SAC), Computational Sciences Track, Honolulu, HI, March 8-12, 2009.
  3. M.S. Swenson, J. Anderson, A. Ash, P. Gaurav, Z. Sukos, D.A. Bader, S.C. Harvey, and C.E. Heitsch, "GTfold: Enabling parallel RNA secondary structure prediction on multi-core desktops," BMC Research Notes, 5(1):341, 2012.



 
 

Last updated: April 26, 2013

 



