Momiao Xiong, Dan Xie, Pengfei Hu and Zheng Hou
A deeper understanding of positive selection is of fundamental importance. Natural selection is shaped both by evolutionary pressures acting on the genome as a whole and by environmental pressures that drive divergence. However, despite intense interest in genome-wide scans, a coherent picture of recent human evolutionary history has yet to emerge, owing to the limitations of data, models, and analytical tools. Faster and cheaper next-generation sequencing (NGS) technologies will generate massive, high-dimensional genomic variation data on an unprecedented scale. NGS will offer new opportunities for population genetics and natural selection studies, but it will also raise great challenges. Conventional population genetics has focused primarily on natural selection acting on a single locus. Little attention has been paid to how natural selection acts on multiple interacting genes in response to environmental perturbation. In addition, the current paradigm for analyzing natural selection on gene expression uses a single summary statistic to represent the expression level of a gene and ignores information on expression differences across exons, genomic positions, and alleles. To address these limitations, we develop a unified framework and statistical methods for genome-wide scans for natural selection and for investigating natural selection on pathways. To explore observed expression variation across exons or genomic positions across genes, we extend the one-dimensional diffusion process to a multidimensional diffusion process, and the univariate stochastic differential equation to multivariate stochastic differential equations. We then use the extended multidimensional diffusion processes to model the evolution of gene expression under natural selection. We hope that these developments will open a new avenue for natural selection analysis with NGS data.