A new tool to avoid errors associated with the analysis of hypermutated viral sequences by the widely used Hypermut program (#71)
The human genome encodes a family of editing enzymes known as APOBEC3 (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3). These enzymes induce context-dependent G-to-A changes in the genomes of sub-populations of viruses such as HIV, SIV, HBV and endogenous retroviruses, a phenomenon referred to as “hypermutation”. Hypermut is a program from the Los Alamos National Laboratory that is widely used to identify and analyse hypermutation. We show here that insertions and deletions in the sequences cause several distinct errors in this program, leading to the incorrect identification of hypermutated sequences. This in turn leads to erroneous biological inferences drawn from the Hypermut output. In this paper we identify and report these errors using published and unpublished viral sequences, and present a new algorithm, which we refer to as G2A3, that avoids them.
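To illustrate why alignment gaps matter when scoring context-dependent G-to-A changes, the following is a minimal sketch. It is neither the Hypermut code nor the G2A3 algorithm. The function names (`next_base`, `count_g_to_a`), the gap character `-`, and the choice of downstream contexts (G or A, read from the query sequence) are assumptions made only for this example. The sketch reads the downstream base by skipping gap characters, so a deletion immediately after a mutated site does not hide its context.

```python
GAP = "-"  # assumed gap character in the alignment


def next_base(seq, i):
    """Return the first non-gap base after index i, or None if there is none."""
    for ch in seq[i + 1:]:
        if ch != GAP:
            return ch
    return None


def count_g_to_a(reference, query, contexts=("G", "A")):
    """Count G-to-A changes between two aligned sequences.

    Returns (in_context, total), where in_context counts the changes whose
    next non-gap base in the query is one of `contexts` (an assumed,
    simplified context definition).
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to equal length")
    reference, query = reference.upper(), query.upper()
    in_context = total = 0
    for i, (r, q) in enumerate(zip(reference, query)):
        if r == "G" and q == "A":
            total += 1
            # Skip gaps so an adjacent deletion does not mask the context.
            if next_base(query, i) in contexts:
                in_context += 1
    return in_context, total


if __name__ == "__main__":
    ref = "AGGTG-GAC"
    qry = "AAGTA-GAC"
    print(count_g_to_a(ref, qry))
```

In this toy alignment the second G-to-A change is followed directly by a gap column. A scorer that inspected only the adjacent column would see `-` instead of the downstream G and could misclassify the site.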