Novel approach for selecting the best predictor for identifying the binding sites in DNA binding proteins (#263)
DNA-binding proteins (DBPs) play vital roles in many cellular processes through the interactions of their amino acids with DNA. Because structures are difficult to determine experimentally and the gap between the available sequences and structures of DBPs is growing exponentially, computational methods have been developed to predict DNA-interacting residues from protein sequence. However, their performance varies, depending mainly on the training dataset, feature selection and learning capacity. Hence, it is important to reveal the correspondence between the performance of these methods and the properties of DBPs.
To address this problem, we collected all available methods for predicting DNA-binding sites and evaluated their performance on unbiased, stringent and diverse datasets of DBPs with 25% sequence identity, grouped by various aspects: i) structural class, ii) fold, iii) superfamily, iv) family, v) binding motif, vi) DNA strand, vii) conformation of DNA and viii) protein function. We observed that the best-performing methods for each dataset showed significant biases toward the datasets selected for their benchmark. We also analyzed the performance of the methods on disordered regions, on structures not included in the training datasets and on recently solved structures. The reliability is better than that of randomly choosing any method or combination of methods. Our analysis revealed important features that could be used to estimate these context-specific biases and hence to suggest the best method for a given problem (http://www.biotech.iitm.ac.in/DNA-protein). [1, 2]
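The idea of suggesting a method from context-specific performance can be illustrated with a minimal sketch. It is not the procedure used in the study or by the web server: the method names, the aspect labels and every score below are hypothetical placeholders, and averaging scores across aspects is an assumed, deliberately simple ranking rule.

```python
from statistics import mean

# Hypothetical benchmark table (placeholder values, NOT results from the study):
# SCORES[method][(aspect, value)] = performance of that method on that subset.
SCORES = {
    "method_A": {("structural_class", "all-alpha"): 0.70,
                 ("dna_strand", "double"): 0.60},
    "method_B": {("structural_class", "all-alpha"): 0.65,
                 ("dna_strand", "double"): 0.75},
    "method_C": {("structural_class", "all-alpha"): 0.50},
}


def select_best_method(scores, context):
    """Return (method, mean_score) for the method with the highest mean
    score over the aspects of `context` that it was benchmarked on.

    `context` maps an aspect name to the value for the query protein,
    e.g. {"structural_class": "all-alpha"}.
    """
    ranked = []
    for method, table in scores.items():
        # Keep only the (aspect, value) pairs this method has a score for.
        known = [table[key] for key in context.items() if key in table]
        if known:
            ranked.append((mean(known), method))
    if not ranked:
        raise ValueError("no method was benchmarked on this context")
    best_score, best_method = max(ranked)
    return best_method, best_score


query = {"structural_class": "all-alpha", "dna_strand": "double"}
best, score = select_best_method(SCORES, query)
print(best, round(score, 2))
```

Because a method is averaged only over the aspects it has scores for, a method benchmarked on fewer subsets is not penalized for the missing ones; a real selection scheme would need to handle such gaps more carefully.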
- Nagarajan, R., Ahmed, S. and Gromiha, M. M. (2013) Novel approach for selecting the best predictor for identifying the binding sites in DNA binding proteins. Nucleic Acids Research, 41(16), 7606-7614.
- Gromiha, M. M. and Nagarajan, R. (2013) Computational approaches for predicting the binding sites and understanding the recognition mechanism of protein-DNA complexes. Advances in Protein Chemistry and Structural Biology, 91, 65-99.