Title Comprehensive understanding of Tn5 insertion preference improves transcription regulatory element identification
Authors Zhang, Houyu
Lu, Ting
Liu, Shan
Yang, Jianyu
Sun, Guohuan
Cheng, Tao
Xu, Jin
Chen, Fangyao
Yen, Kuangyu
Affiliation Chinese Acad Med Sci & Peking Union Med Coll, Natl Clin Res Ctr Blood Dis, State Key Lab Expt Hematol, Inst Hematol & Blood Dis Hosp, Tianjin 300020, Peoples R China
South China Univ Technol, Sch Biol & Biol Engn, Guangzhou 510006, Peoples R China
Southern Med Univ, Sch Basic Med Sci, Dept Dev Biol, Guangzhou 510515, Peoples R China
South China Univ Technol, Sch Med, Div Cell Dev & Integrat Biol, Guangzhou 510006, Peoples R China
Xi An Jiao Tong Univ, Hlth Sci Ctr, Sch Publ Hlth, Dept Epidemiol & Biostat, Xian 710061, Shaanxi, Peoples R China
Peking Univ, Acad Adv Interdisciplinary Studies, Chinese Inst Brain Res, Beijing 100871, Peoples R China
Penn State Univ, Bioinformat & Genom Intercoll Grad Program, University Pk, PA 16802 USA
Keywords DNA SHAPE
READ ALIGNMENT
BINDING SITES
CHROMATIN
SEQ
ACCESSIBILITY
SPECIFICITY
TRANSPOSITION
TAGMENTATION
PREDICTION
Issue Date Dec-2021
Publisher NAR GENOMICS AND BIOINFORMATICS
Abstract Tn5 transposase, which can efficiently tagment the genome, has been widely adopted as a molecular tool in next-generation sequencing, from short-read sequencing to more complex methods such as assay for transposase-accessible chromatin using sequencing (ATAC-seq). Here, we systematically map Tn5 insertion characteristics across several model organisms, finding critical parameters that affect its insertion. On naked genomic DNA, we found that Tn5 insertion is not uniformly distributed or random. To uncover drivers of these biases, we used a machine learning framework, which revealed that DNA shape cooperatively works with DNA motif to affect Tn5 insertion preference. These intrinsic insertion preferences can be modeled using nucleotide dependence information from DNA sequences, and we developed a computational pipeline to correct for these biases in ATAC-seq data. Using our pipeline, we show that bias correction improves the overall performance of ATAC-seq peak detection, recovering many potential false-negative peaks. Furthermore, we found that these peaks are bound by transcription factors, underscoring the biological relevance of capturing this additional information. These findings highlight the benefits of an improved understanding and precise correction of Tn5 insertion preference.
URI http://hdl.handle.net/20.500.11897/637107
DOI 10.1093/nargab/lqab094
Indexed ESCI
Appears in Collections: Academy for Advanced Interdisciplinary Studies

License: See PKU IR operational policies.