FreeContact: fast and free software for protein contact prediction from residue co-evolution. BMC Bioinformatics 2014 Mar 26;15:85
Date
03/29/2014
Pubmed ID
24669753
Pubmed Central ID
PMC3987048
DOI
10.1186/1471-2105-15-85
Scopus ID
2-s2.0-84899072164 (requires institutional sign-in at Scopus site) 132 Citations
Abstract
BACKGROUND: 20 years of improved technology and growing sequences now renders residue-residue contact constraints in large protein families through correlated mutations accurate enough to drive de novo predictions of protein three-dimensional structure. The method EVfold broke new ground using mean-field Direct Coupling Analysis (EVfold-mfDCA); the method PSICOV applied a related concept by estimating a sparse inverse covariance matrix. Both methods (EVfold-mfDCA and PSICOV) are publicly available, but both require too much CPU time for interactive applications. In addition, EVfold-mfDCA depends on proprietary software.
RESULTS: Here, we present FreeContact, a fast, open source implementation of EVfold-mfDCA and PSICOV. On a test set of 140 proteins, FreeContact was almost eight times faster than PSICOV without decreasing prediction performance. The EVfold-mfDCA implementation of FreeContact was over 220 times faster than PSICOV with negligible performance decrease. EVfold-mfDCA was unavailable for testing due to its dependency on proprietary software. FreeContact is implemented as the free C++ library "libfreecontact", complete with command line tool "freecontact", as well as Perl and Python modules. All components are available as Debian packages. FreeContact supports the BioXSD format for interoperability.
CONCLUSIONS: FreeContact provides the opportunity to compute reliable contact predictions in any environment (desktop or cloud).
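The "correlated mutations" idea mentioned in the background can be illustrated with a much simpler score than either method named above. The sketch below ranks pairs of alignment columns by mutual information. It is not mfDCA, not PSICOV, and not the FreeContact API; the function names and the toy alignment are hypothetical and chosen only for illustration. Real predictors also add sequence weighting, pseudocounts, and corrections that separate direct from indirect couplings, none of which appear here.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy alignment: columns 0 and 1 co-vary perfectly,
# column 2 is fully conserved, column 3 varies independently of the rest.
TOY_ALIGNMENT = ["AKLE", "AKLD", "GRLE", "GRLD"]


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) / (p(a) * p(b)) simplifies to c * n / (count_a * count_b)
        mi += (c / n) * math.log2(c * n / (count_i[a] * count_j[b]))
    return mi


def rank_column_pairs(alignment):
    """Return (score, i, j) for every column pair, highest score first."""
    length = len(alignment[0])
    scores = [
        (column_mutual_information(alignment, i, j), i, j)
        for i, j in combinations(range(length), 2)
    ]
    return sorted(scores, key=lambda t: t[0], reverse=True)


if __name__ == "__main__":
    for score, i, j in rank_column_pairs(TOY_ALIGNMENT):
        print(f"columns {i},{j}: {score:.3f} bits")
```

In this toy case the co-varying pair (columns 0 and 1) receives the highest score, while pairs involving the conserved or independent columns score zero. Mutual information cannot tell direct couplings from transitive ones, which is the limitation that methods such as mfDCA and sparse inverse covariance estimation were designed to address.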
Author List
Kaján L, Hopf TA, Kalaš M, Marks DS, Rost B
Author
David S. Marks MD Vice Chair, Professor in the Medicine department at Medical College of Wisconsin
MESH terms used to index this publication - Major topics in bold
Algorithms
Computational Biology
Protein Conformation
Proteins
Sequence Analysis, Protein
Software