TY - JOUR
T1 - MACE
T2 - model based analysis of ChIP-exo
AU - Wang, Liguo
AU - Chen, Junsheng
AU - Wang, Chen
AU - Uusküla-Reimand, Liis
AU - Chen, Kaifu
AU - Medina-Rivera, Alejandra
AU - Young, Edwin J.
AU - Zimmermann, Michael T.
AU - Yan, Huihuang
AU - Sun, Zhifu
AU - Zhang, Yuji
AU - Wu, Stephen T.
AU - Huang, Haojie
AU - Wilson, Michael D.
AU - Kocher, Jean Pierre A
AU - Li, Wei
N1 - Publisher Copyright:
© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
PY - 2014/11/10
Y1 - 2014/11/10
N2 - Understanding the role of a given transcription factor (TF) in regulating gene expression requires precise mapping of its binding sites in the genome. Chromatin immunoprecipitation-exo, an emerging technique using λ exonuclease to digest TF unbound DNA after ChIP, is designed to reveal transcription factor binding site (TFBS) boundaries with near-single nucleotide resolution. Although ChIP-exo promises deeper insights into transcription regulation, no dedicated bioinformatics tool exists to leverage its advantages. Most ChIP-seq and ChIP-chip analytic methods are not tailored for ChIP-exo, and thus cannot take full advantage of high-resolution ChIP-exo data. Here we describe a novel analysis framework, termed MACE (model-based analysis of ChIP-exo) dedicated to ChIP-exo data analysis. The MACE workflow consists of four steps: (i) sequencing data normalization and bias correction; (ii) signal consolidation and noise reduction; (iii) single-nucleotide resolution border peak detection using the Chebyshev Inequality and (iv) border matching using the Gale-Shapley stable matching algorithm. When applied to published human CTCF, yeast Reb1 and our own mouse ONECUT1/HNF6 ChIP-exo data, MACE is able to define TFBSs with high sensitivity, specificity and spatial resolution, as evidenced by multiple criteria including motif enrichment, sequence conservation, direct sequence pileup, nucleosome positioning and open chromatin states. In addition, we show that the fundamental advance of MACE is the identification of two boundaries of a TFBS with high resolution, whereas other methods only report a single location of the same event. The two boundaries help elucidate the in vivo binding structure of a given TF, e.g. whether the TF may bind as dimers or in a complex with other co-factors.
AB - Understanding the role of a given transcription factor (TF) in regulating gene expression requires precise mapping of its binding sites in the genome. Chromatin immunoprecipitation-exo, an emerging technique using λ exonuclease to digest TF unbound DNA after ChIP, is designed to reveal transcription factor binding site (TFBS) boundaries with near-single nucleotide resolution. Although ChIP-exo promises deeper insights into transcription regulation, no dedicated bioinformatics tool exists to leverage its advantages. Most ChIP-seq and ChIP-chip analytic methods are not tailored for ChIP-exo, and thus cannot take full advantage of high-resolution ChIP-exo data. Here we describe a novel analysis framework, termed MACE (model-based analysis of ChIP-exo) dedicated to ChIP-exo data analysis. The MACE workflow consists of four steps: (i) sequencing data normalization and bias correction; (ii) signal consolidation and noise reduction; (iii) single-nucleotide resolution border peak detection using the Chebyshev Inequality and (iv) border matching using the Gale-Shapley stable matching algorithm. When applied to published human CTCF, yeast Reb1 and our own mouse ONECUT1/HNF6 ChIP-exo data, MACE is able to define TFBSs with high sensitivity, specificity and spatial resolution, as evidenced by multiple criteria including motif enrichment, sequence conservation, direct sequence pileup, nucleosome positioning and open chromatin states. In addition, we show that the fundamental advance of MACE is the identification of two boundaries of a TFBS with high resolution, whereas other methods only report a single location of the same event. The two boundaries help elucidate the in vivo binding structure of a given TF, e.g. whether the TF may bind as dimers or in a complex with other co-factors.
UR - http://www.scopus.com/inward/record.url?scp=84926099280&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84926099280&partnerID=8YFLogxK
U2 - 10.1093/nar/gku846
DO - 10.1093/nar/gku846
M3 - Article
C2 - 25249628
AN - SCOPUS:84926099280
SN - 0305-1048
VL - 42
SP - e156
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 20
ER -