Title | GenTree, an integrated resource for analyzing the evolution and function of primate-specific coding genes |
Authors | Shao, Yi; Chen, Chunyan; Shen, Hao; He, Bin Z.; Yu, Daqi; Jiang, Shuai; Zhao, Shilei; Gao, Zhiqiang; Zhu, Zhenglin; Chen, Xi; Fu, Yan; Chen, Hua; Gao, Ge; Long, Manyuan; Zhang, Yong E. |
Affiliation | Chinese Acad Sci, Inst Zool, Key Lab Zool Systemat & Evolut, Beijing 100101, Peoples R China; Chinese Acad Sci, State Key Lab Integrated Management Pest Insects, Inst Zool, Beijing 100101, Peoples R China; Univ Chinese Acad Sci, Beijing 100049, Peoples R China; Hunan Univ Technol, Coll Comp, Zhuzhou 412007, Hunan, Peoples R China; Harvard Univ, FAS Ctr Syst Biol, Cambridge, MA 02138 USA; Harvard Univ, Howard Hughes Med Inst, Cambridge, MA 02138 USA; Peking Univ, Ctr Bioinformat, Sch Life Sci, State Key Lab Prot & Plant Gene Res, Beijing 100871, Peoples R China; Peking Univ, Beijing Adv Innovat Ctr Genom ICG, Biomed Pioneering Innovat Ctr BIOPIC, Beijing 100871, Peoples R China; Chinese Acad Sci, Beijing Inst Genom, CAS Key Lab Genom & Precis Med, Beijing 100101, Peoples R China; Chinese Acad Sci, Acad Math & Syst Sci, Natl Ctr Math & Interdisciplinary Sci, Key Lab Random Complex Struct & Data Sci, Beijing 100190, Peoples R China; Chongqing Univ, Sch Life Sci, Chongqing 400044, Peoples R China; Wuhan Inst Biotechnol, Wuhan 430072, Hubei, Peoples R China; Wuhan Univ, Med Res Inst, Wuhan 430072, Hubei, Peoples R China; Chinese Acad Sci, CAS Ctr Excellence Anim Evolut & Genet, Kunming 650223, Yunnan, Peoples R China; Univ Chicago, Dept Ecol & Evolut, 940 E 57Th St, Chicago, IL 60637 USA; Univ Iowa, Dept Biol, Iowa City, IA 52242 USA |
Issue Date | 2019 |
Publisher | GENOME RESEARCH |
Abstract | The origination of new genes contributes to phenotypic evolution in humans. Two major challenges in the study of new genes are the inference of gene ages and annotation of their protein-coding potential. To tackle these challenges, we created GenTree, an integrated online database that compiles age inferences from three major methods together with functional genomic data for new genes. Genome-wide comparison of the age inference methods revealed that the synteny-based pipeline (SBP) is most suited for recently duplicated genes, whereas the protein-family-based methods are useful for ancient genes. For SBP-dated primate-specific protein-coding genes (PSGs), we performed manual evaluation based on published PSG lists and showed that SBP generated a conservative data set of PSGs by masking less reliable syntenic regions. After assessing the coding potential based on evolutionary constraint and peptide evidence from proteomic data, we curated a list of 254 PSGs with different levels of protein evidence. This list also includes 41 candidate misannotated pseudogenes that encode primate-specific short proteins. Coexpression analysis showed that PSGs are preferentially recruited into organs with rapidly evolving pathways such as spermatogenesis, immune response, mother-fetus interaction, and brain development. For brain development, primate-specific KRAB zinc-finger proteins (KZNFs) are specifically up-regulated in the mid-fetal stage, which may have contributed to the evolution of this critical stage. Altogether, hundreds of PSGs are either recruited to processes under strong selection pressure or to processes supporting an evolving novel organ. |
URI | http://hdl.handle.net/20.500.11897/549562 |
ISSN | 1088-9051 |
DOI | 10.1101/gr.238733.118 |
Indexed | SCI(E); EI |
Appears in Collections: | School of Life Sciences; Biomedical Pioneering Innovation Center |