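Research on microRNAs is one of the important frontier problems of the functional genomics era. It is of theoretical significance and also of practical significance for drug design and for disease prevention and treatment. This dissertation combines bioinformatics methods with experimental verification to investigate, step by step, the universality of microRNAs and the interconnectedness of their regulation. It comprises six studies in two areas: the identification of microRNA genes, and the relationship between microRNAs and other post-transcriptional regulatory events.
1) We propose distribution features of the mature microRNA that describe how microRNA precursors are cleaved and processed and that can be applied directly to improve prediction algorithms. The analysis shows that mature microRNAs tend to lie in the near-loop region of the precursor, that this distribution is non-random and under evolutionary selection, and that the ends of the mature microRNA sit in low-stability regions, which facilitates cleavage.
2) Building on these distribution features and on the pairwise cross-species conservation of mature microRNAs, we developed an algorithm for identifying microRNA precursors and mature microRNAs. It was applied successfully to four invertebrate genomes (fruit fly, honeybee, Anopheles mosquito and silkworm), and its results were verified by RT-PCR.
3) Predicting unknown primary transcripts is also an important part of microRNA gene identification. We propose a new strategy that aligns ESTs to the sequences upstream and downstream of microRNA precursors, use it to identify a large number of microRNA primary transcripts computationally, and construct temporal and tissue expression profiles. RT-PCR verification shows that the strategy is successful.
Developing and refining microRNA gene identification algorithms meets the need to discover more novel microRNAs and also reveals how universal microRNAs are. The dissertation then studies the interaction between microRNAs and other regulatory events, a first exploration of the interconnectedness of regulation.
4) We show that thousands of genes selectively escape microRNA regulation by changing 3'-UTR length through alternative polyadenylation. Comparison with a random model indicates that these escape events are under evolutionary selection, and estimates based on EST counts suggest that they arose recently.
5) We built the first database of experimentally identified RNA editing sites.
6) We show that many RNA editing events in 3'-UTRs potentially affect targeting by co-expressed microRNAs. Further analysis shows that this effect differs by editing efficiency and by substitution type.
Key words: microRNA; identification; post-transcriptional regulation; alternative polyadenylation; RNA editing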
One of the biggest surprises at the beginning of the post-genome era was the discovery of numerous genes encoding microRNAs (miRNAs). MicroRNA research has since become one of the frontier problems in functional genomics. Such studies are of theoretical significance and are also informative for therapeutic and pharmaceutical research. In this dissertation, the universality and interconnectedness of microRNA-based regulation are investigated using bioinformatics approaches and experimental verification. The dissertation comprises six studies in two parts: the computational identification of microRNA genes, and the connections between microRNAs and two other post-transcriptional regulatory events, alternative polyadenylation and RNA editing.
1) A set of novel features describing the distribution of the mature miRNA on its precursor is proposed to characterize the cleavage of the mature miRNA duplex; these features can also be applied to miRNA prediction. The results show that the termini of mature miRNAs tend to lie near loops or in base-loop regions, rather than within loops or far (>6 nt) from loop structures, which differs markedly from what is expected by chance. Furthermore, the free energies of the 1-6 nt flanking mature miRNAs are much higher than those of the mature miRNA terminal regions. This distribution facilitates cleavage of the mature miRNA duplex.
2) We developed a computational strategy based on the pairwise conservation of mature miRNAs and the distribution of the mature miRNA on its precursor. The algorithm was successfully applied to four invertebrate genomes: Drosophila melanogaster, Bombyx mori, Apis mellifera and Anopheles gambiae. RT-PCR verification indicates that the algorithm is efficient.
3) Tens of pri-miRNAs were successfully predicted by mapping ESTs to the long flanking sequences of precursor miRNAs. Using this flanking-EST method, temporal and tissue expression profiles were established quickly and simply for hundreds of miRNAs in different tissues. RT-PCR experiments were conducted to verify the transcription and expression of the pri-miRNAs.
Discovering numerous novel microRNAs and uncovering the universality of microRNA-based regulation requires new computational prediction methods. We subsequently studied the crosstalk between microRNAs and other regulatory events, a first exploration of the interconnectedness of microRNA-based regulation.
4) We found that alternative polyadenylation, a mechanism that commonly influences 3'-UTR length, tends to remove putative miRNA target sites in mammalian genomes, so that thousands of alternatively polyadenylated genes produce short isoforms that escape repression or degradation by miRNAs. These escape events do not occur by chance and are evolutionarily selected. Further analyses indicate that escaping isoforms tend to be minor forms and newly evolved.
5) We present dbRES, the first database collecting up-to-date, experimentally reported RNA editing sites of various kinds.
6) We report that abundant RNA editing in 3'-UTRs can affect targeting by co-expressed microRNAs. The effect differs by base substitution type and by editing efficiency.
Key words: microRNA (miRNA); identification; post-transcriptional regulation; alternative polyadenylation; RNA editing
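
The ideas behind items 4) and 6) can be illustrated with a toy sketch. It is not the pipeline used in the dissertation, which the abstract does not describe in detail. It assumes a deliberately simplified targeting rule (a perfect 7-nt match to miRNA positions 2-8), treats A-to-I editing as an A-to-G change purely for illustration, and uses a made-up 3'-UTR. All function names and the toy sequence are hypothetical.

```python
# Toy sketch (assumptions: 7-mer seed matching only; hypothetical names and data).

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Let-7a-like miRNA used only as an example input.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"

# Invented 3'-UTR containing one seed match starting at index 10.
TOY_UTR = "GCAUUGCCGA" + "CUACCUC" + "AGGCAUU"


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_site(mirna: str) -> str:
    """Sequence on the mRNA that pairs with miRNA positions 2-8 (1-based)."""
    return reverse_complement(mirna[1:8])


def find_sites(utr: str, mirna: str) -> list[int]:
    """0-based start positions of perfect seed matches in the 3'-UTR."""
    site = seed_site(mirna)
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


def escapes_by_apa(utr: str, mirna: str, short_len: int) -> bool:
    """True if the full-length UTR has a site but the isoform truncated
    at an upstream poly(A) site (first short_len nt) has none."""
    return bool(find_sites(utr, mirna)) and not find_sites(utr[:short_len], mirna)


def edit_a_to_g(utr: str, pos: int) -> str:
    """Model an A-to-I edit at pos as A-to-G (inosine is read as G)."""
    if utr[pos] != "A":
        raise ValueError("position is not an adenosine")
    return utr[:pos] + "G" + utr[pos + 1:]


if __name__ == "__main__":
    print("seed site:", seed_site(LET7A))
    print("sites in long isoform:", find_sites(TOY_UTR, LET7A))
    print("short isoform escapes:", escapes_by_apa(TOY_UTR, LET7A, 12))
    print("sites after editing:", find_sites(edit_a_to_g(TOY_UTR, 12), LET7A))
```

In this toy case, truncating the UTR upstream of the site or editing the adenosine inside it both remove the only seed match. Real analyses would need genome-scale poly(A) and editing annotations plus a proper target-prediction model and background (random) comparison.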