Whole genome sequence and analysis of the Marwari horse breed and its genetic origin — ASN Events

Whole genome sequence and analysis of the Marwari horse breed and its genetic origin (#49)

JeHoon Jun 1 , Yun Sung Cho 1 , Haejin Hu 1 , Hak-Min Kim 1 , Sungwoong Jho 1 , Priyvrat Gadhvi 1 , Kyung Mi Park 2 , Jeongheui Lim 3 , Woon Kee Paek 3 , Kyudong Han 4 5 , Andrea Manica 6 , Jeremy S Edwards 7 , Jong Bhak 2 8 9 10
  1. Personal Genomics Institute, Genome Research Foundation, Suwon, Republic of Korea
  2. Theragen BiO Institute, TheragenEtex, Suwon, Republic of Korea
  3. National Science Museum, Daejeon, Republic of Korea
  4. Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan, Republic of Korea
  5. DKU-Theragen institute for NGS analysis (DTiNa), TheragenEtex, Cheonan, Republic of Korea
  6. Evolutionary Ecology Group, Department of Zoology, University of Cambridge, Cambridge, UK
  7. Department of Chemistry and Chemical Biology, Department of Molecular Genetics and Microbiology, Department of Chemical and Nuclear Engineering, Cancer Research and Treatment Center, University of New Mexico, Albuquerque, NM, USA
  8. Personal Genomics Institute, Genome Research Foundation, Suwon, Korea
  9. Advanced Institutes of Convergence Technology Nano Science and Technology, Suwon, Republic of Korea
  10. Program in Nano Science and Technology, Department of Transdisciplinary Studies, Seoul National University, Suwon, Republic of Korea


Background

The horse (Equus ferus caballus) is one of the earliest domesticated species and has played numerous important roles in human societies over the past 5,000 years. In this study, we characterized the genome of the Marwari horse, a rare breed with certain unique characteristics such as inwardly turned ear tips. The breed is thought to have arisen from breeding local Indian ponies with Arabian horses beginning in the 12th century.

Results

We generated 101 Gb (~30× coverage) of whole genome sequences from a Marwari horse using the Illumina HiSeq2000 sequencer. The sequences were mapped to the horse reference genome at a mapping rate of ~98% and with ~95% of the genome having at least 10× coverage. A total of 5.9 million single nucleotide variations and 0.6 million small insertions or deletions were identified. We confirmed a strong Arabian and Mongolian component in the Marwari genome. Novel variants from the Marwari sequences were annotated, and were found to be enriched in olfactory functions. Additionally, we suggest a potential functional genetic variant in the TSHZ1 gene (p.Ala344>Val) associated with the inward-turning ear tip shape of the Marwari horses.
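
As a rough illustration of how coverage figures like those above are commonly summarized, the following minimal sketch computes the mean depth and the fraction of positions covered at or above a threshold from a list of per-base depths (for example, the depth column reported by a tool such as `samtools depth`). The function name `coverage_summary` and the toy depth values are assumptions made for illustration; they are not the authors' pipeline or data from this study.

```python
from typing import Iterable, Tuple


def coverage_summary(depths: Iterable[int], min_depth: int = 10) -> Tuple[float, float]:
    """Return (mean depth, fraction of positions with depth >= min_depth)."""
    total = 0      # sum of depths over all positions
    covered = 0    # positions meeting the depth threshold
    n = 0          # number of positions seen
    for d in depths:
        total += d
        if d >= min_depth:
            covered += 1
        n += 1
    if n == 0:
        raise ValueError("no positions supplied")
    return total / n, covered / n


# Toy per-base depths (hypothetical values, not taken from the Marwari data).
toy_depths = [32, 28, 9, 35, 30, 0, 31, 29, 33, 27]
mean_depth, frac = coverage_summary(toy_depths, min_depth=10)
print(f"mean depth: {mean_depth:.1f}x, fraction >= 10x: {frac:.0%}")
```

On real data the input would be streamed position by position across the reference rather than held in a list, which is why the function accepts any iterable.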

Conclusions

Here, we present an analysis of the Marwari horse genome. These are the first genomic data for an Asian breed and an invaluable resource for future studies of genetic variation associated with phenotypes and diseases in horses.
