RNA at the epicenter of human development (#101)
It appears that the genomic programming of humans and other complex organisms has been misunderstood for the past 50 years, because of the incorrect assumption that most genetic information, including regulatory information, is transacted by proteins. Derived assumptions, such as the presumed “explosive” (i.e., factorial) scaling of regulatory options by “combinatoric interactions” between regulatory proteins, are not only unjustified theoretically and mechanistically, but are clearly incorrect on the empirical evidence. Surprisingly, the human genome contains only about 23,000 protein-coding genes, similar in number and largely orthologous in function to those in other animals, including developmentally simple nematodes and sponges. On the other hand, the extent of non-protein-coding DNA increases with increasing developmental and cognitive complexity, reaching 98.5% in humans. Moreover, high-throughput analyses have shown that the vast majority of the human genome is dynamically transcribed to produce a previously hidden world of different classes of small and large, overlapping and interlacing intronic, intergenic and antisense non-protein-coding RNAs. The transcriptome is in fact far more complex than the genome, which is best viewed as a zip file that is unpacked in highly stage- and cell-specific patterns during development. This is illustrated by the use of targeted RNA sequencing to reveal thousands of previously unknown exons and spliced isoforms of oncogenes and tumor suppressors, as well as at least 1500 new long noncoding RNA (lncRNA) genes in intergenic GWAS regions associated with complex diseases. The functions of lncRNAs are varied; a number of widely expressed lncRNAs play central roles in the formation of differentiation-specific subnuclear organelles.
However, recent evidence suggests that the main function of the tens of thousands of highly cell-specific lncRNAs is to dynamically organize chromosome territories and guide chromatin-modifying complexes to their sites of action, to specify the architectural trajectories of development. Moreover, this system has subsequently evolved plasticity, via an as-yet-unexplored universe of retrotransposon expression and mobilization, as well as RNA editing and modification, which appears to be the molecular basis of environment-epigenome interactions and brain function.