Predicting Host Tropism of Influenza A Virus Proteins using Random Forest — ASN Events

Predicting Host Tropism of Influenza A Virus Proteins using Random Forest (#63)

Christine LP Eng 1, Joo Chuan Tong 1,2, Tin Wee Tan 1
  1. Department of Biochemistry, National University of Singapore, Singapore
  2. Institute of High Performance Computing, Singapore

Background: The majority of influenza A viruses reside and circulate among animal populations, seldom infecting humans because of host range restriction. Yet when avian strains do acquire the ability to overcome the species barrier, they may become adapted to humans, replicating efficiently and causing disease, potentially leading to a pandemic. Given the huge influenza A virus reservoir in wild birds, the emergence of a new influenza strain able to cross the host species barrier is a cause for concern, as shown by the recent H7N9 outbreak in China. Several influenza proteins have been shown to be major determinants of host tropism. Understanding and determining host tropism is therefore important for identifying zoonotic influenza virus strains capable of crossing the species barrier and infecting humans.

Results: In this study, computational models for 11 influenza proteins were constructed using the random forest machine learning algorithm to predict host tropism. The prediction models were trained on influenza protein sequences isolated from both avian and human samples, which were transformed into feature vectors of amino acid physicochemical properties. The resulting prediction models were highly accurate (ACC > 96.57%; AUC > 0.980; MCC > 0.916) and capable of determining the host tropism of individual influenza proteins. In addition, features from all 11 proteins were used to construct a combined model to predict the host tropism of influenza virus strains. This would help assess a novel influenza strain’s host range capability.
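As a rough illustration of the two steps described above, the sketch below turns a protein sequence into a small physicochemical feature vector and computes two of the reported metrics, accuracy (ACC) and the Matthews correlation coefficient (MCC), from a confusion matrix. It is not the authors' pipeline: the abstract does not list the properties used, so the choice of the Kyte-Doolittle hydropathy scale and simple charge fractions, as well as the function names `featurize`, `accuracy` and `mcc`, are assumptions made for this example. Vectors of this kind could then be passed to any random forest implementation.

```python
from math import sqrt

# Kyte-Doolittle hydropathy index per amino acid (a standard published scale).
# Using it here is an illustrative assumption, not the authors' feature set.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
POSITIVE = set("KR")  # residues treated as positively charged (assumption)
NEGATIVE = set("DE")  # residues treated as negatively charged (assumption)


def featurize(sequence):
    """Return [mean hydropathy, fraction positive, fraction negative].

    Residues outside the 20 standard amino acids (e.g. 'X') are skipped.
    """
    residues = [r for r in sequence.upper() if r in KD_HYDROPATHY]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    n = len(residues)
    mean_hydropathy = sum(KD_HYDROPATHY[r] for r in residues) / n
    frac_positive = sum(r in POSITIVE for r in residues) / n
    frac_negative = sum(r in NEGATIVE for r in residues) / n
    return [mean_hydropathy, frac_positive, frac_negative]


def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when a margin is empty."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical toy sequence and confusion matrix, not data from the study.
    print(featurize("MKAILVVLLYTFATANADTL"))
    print(accuracy(tp=48, tn=47, fp=3, fn=2), mcc(tp=48, tn=47, fp=3, fn=2))
```

MCC is a useful companion to accuracy when the avian and human classes differ in size, because it uses all four cells of the confusion matrix rather than only the correct predictions.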

Conclusions: All of the prediction models constructed achieved high prediction performance, indicating clear distinctions between avian and human influenza proteins. Understanding and predicting the host tropism of influenza proteins lays an important foundation for future work on constructing computational models capable of predicting interspecies transmission of influenza viruses. The prediction models are available at http://flupred.bic.nus.edu.sg