GTRD Workflow
Ivan Yevshin
Revision as of 17:24, 1 July 2016
ChIP-seq experiment information was collected in a semi-automated way from the literature, GEO and ENCODE.
Raw ChIP-seq data in the form of FASTQ and SRA files were fetched from the ENCODE and SRA databases.
Sequenced reads were aligned using the Bowtie2 [1] aligner.
ChIP-seq peaks were called using four different methods: MACS [2], SISSRs [3], GEM [4] and PICS [5].
Bowtie2
Bowtie2 version 2.2.3 was used to align ChIP-seq reads to the reference genomes of human (GRCh38) and mouse (GRCm38).
Bowtie2 was run with the following parameters:
bowtie2 -x $genome -U $fastq_files -p 8 --mm --seed 0
The resulting alignments were converted to BAM files, then sorted and indexed using samtools version 1.0.
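The alignment and post-processing steps described above could be chained roughly as in the following sketch. It is an illustration under stated assumptions, not the script used for GTRD: the function name `align_cmds`, the output prefix argument and the intermediate file names are hypothetical, and the `samtools` options (in particular the `-T`/`-o` form of `samtools sort`) should be checked against the installed samtools version. The function only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Print (without executing) the commands that align one sample and
# produce a sorted, indexed BAM file.
# $1 = Bowtie2 index prefix, $2 = FASTQ file(s), $3 = output prefix (hypothetical name)
align_cmds() {
    genome=$1
    fastq_files=$2
    out=$3
    # Bowtie2 writes SAM to stdout; samtools view converts it to BAM.
    printf '%s\n' "bowtie2 -x $genome -U $fastq_files -p 8 --mm --seed 0 | samtools view -b -o $out.unsorted.bam -"
    # Coordinate-sort the alignments; -T sets the prefix for temporary files.
    printf '%s\n' "samtools sort -T $out.tmp -o $out.bam $out.unsorted.bam"
    # Index the sorted BAM file.
    printf '%s\n' "samtools index $out.bam"
}

# Example: show the commands for a hypothetical sample.
align_cmds hg38 reads.fastq sample1
```

Printing the commands instead of executing them keeps the sketch safe to try on a machine without Bowtie2 or samtools installed; piping the output to `sh` would run them.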
MACS
MACS version 1.4.2 was used for peak calling with the following parameters:
macs14 -f BAM -g $species -n $peaks -t $alignment_bam
or, if a control experiment was available:
macs14 -f BAM -g $species -n $peaks -t $alignment_bam -c $control_bam
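The two MACS variants differ only in the optional `-c` argument, so a small wrapper can build either one. The sketch below is illustrative: the function name `macs_cmd` and its argument order are assumptions, and the input format option is written as `-f BAM`, which is how MACS 1.4 spells it. The function prints the command rather than running it.

```shell
#!/bin/sh
# Build the macs14 command line; the control BAM is optional.
# $1 = genome size (e.g. hs or mm), $2 = output name,
# $3 = treatment BAM, $4 = control BAM (optional)
macs_cmd() {
    cmd="macs14 -f BAM -g $1 -n $2 -t $3"
    # Append the control only when a fourth argument was supplied.
    if [ -n "${4:-}" ]; then
        cmd="$cmd -c $4"
    fi
    printf '%s\n' "$cmd"
}

# Example with hypothetical file names, with and without a control.
macs_cmd hs peaks chip.bam
macs_cmd hs peaks chip.bam control.bam
```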
SISSRs
SISSRs requires alignments in BED format, so BAM files were converted to BED files using bedtools version 2:
bamToBed -i $input_bam > $output_bed
SISSRs version 1.4 was used for peak calling with the following parameters:
sissrs.pl -i $alignment_bed -s 3000000000 -o $peaks.sissrs
or, if a control experiment was available:
sissrs.pl -i $alignment_bed -s 3000000000 -o $peaks.sissrs -b $control_bed
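The conversion and peak-calling steps for SISSRs can be combined in one helper that also converts the control alignment when one is given. This is a sketch under assumptions: the function name `sissrs_cmds` and the rule for deriving BED file names from BAM file names are hypothetical, while the tool options are the ones listed above. The commands are printed, not executed.

```shell
#!/bin/sh
# Print the BAM-to-BED conversion and SISSRs commands for one experiment.
# $1 = treatment BAM, $2 = output name, $3 = control BAM (optional)
sissrs_cmds() {
    alignment_bam=$1
    peaks=$2
    control_bam=${3:-}
    # Derive the BED name by replacing the .bam suffix (naming is an assumption).
    alignment_bed=${alignment_bam%.bam}.bed
    printf '%s\n' "bamToBed -i $alignment_bam > $alignment_bed"
    cmd="sissrs.pl -i $alignment_bed -s 3000000000 -o $peaks.sissrs"
    if [ -n "$control_bam" ]; then
        # The control must also be converted before it can be passed with -b.
        control_bed=${control_bam%.bam}.bed
        printf '%s\n' "bamToBed -i $control_bam > $control_bed"
        cmd="$cmd -b $control_bed"
    fi
    printf '%s\n' "$cmd"
}

# Example with hypothetical file names.
sissrs_cmds chip.bam peaks control.bam
```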
GEM
GEM version 2.5 was used with the following parameters:
java -Xmx4G -XX:+UseSerialGC -jar /srv/local-main/tools/gem/gem.jar --d /srv/local-main/tools/gem/Read_Distribution_default.txt \
    --g /srv/local-main/tools/gem/$species.chrom.sizes --s 2000000000 --f SAM --t 1 --out $peaks --expt $bam
or, if a control experiment was available:
java -Xmx4G -XX:+UseSerialGC -jar /srv/local-main/tools/gem/gem.jar --d /srv/local-main/tools/gem/Read_Distribution_default.txt \
    --g /srv/local-main/tools/gem/$species.chrom.sizes --s 2000000000 --f SAM --t 1 --out $peaks --expt $bam --ctrl $control
For large datasets, the -Xmx24G parameter was set.
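Because the GEM invocations differ only in the Java heap size and the optional control, they can be generated from one template. The sketch below is an assumption-laden illustration: the function name `gem_cmd` and the `GEM_HEAP` environment variable are hypothetical, and no threshold for what counts as a large dataset is implied, so the caller decides when to raise the heap to 24G. The installation paths and GEM options are taken from the commands above; the function prints the command rather than running it.

```shell
#!/bin/sh
# Build the GEM command line.
# $1 = species (prefix of the chrom.sizes file), $2 = output name,
# $3 = treatment alignment, $4 = control alignment (optional)
# GEM_HEAP (hypothetical variable) overrides the default 4G Java heap.
gem_cmd() {
    species=$1
    peaks=$2
    bam=$3
    control=${4:-}
    heap=${GEM_HEAP:-4G}
    gem_dir=/srv/local-main/tools/gem
    cmd="java -Xmx$heap -XX:+UseSerialGC -jar $gem_dir/gem.jar"
    cmd="$cmd --d $gem_dir/Read_Distribution_default.txt"
    cmd="$cmd --g $gem_dir/$species.chrom.sizes --s 2000000000"
    cmd="$cmd --f SAM --t 1 --out $peaks --expt $bam"
    # Append the control only when one was supplied.
    if [ -n "$control" ]; then
        cmd="$cmd --ctrl $control"
    fi
    printf '%s\n' "$cmd"
}

# Default heap, then a larger heap for a dataset judged to be large.
gem_cmd hg38 peaks chip.bam
GEM_HEAP=24G gem_cmd hg38 peaks chip.bam control.bam
```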
PICS
For peak calling with the PICS method, we use R version 3.2.0 and PICS version 2.12.0 with the following custom R script:
References
1. Langmead B, Salzberg S. Fast gapped-read alignment with Bowtie 2. Nature Methods. 2012, 9:357-359.
2. Zhang et al. Model-based Analysis of ChIP-Seq (MACS). Genome Biology. 2008, 9(9):R137.
3. Narlikar L, Jothi R. ChIP-Seq data analysis: identification of protein-DNA binding sites with SISSRs peak-finder. Methods in Molecular Biology. 2012, 802:305-322.
4. Guo Y, Mahony S, Gifford DK. High Resolution Genome Wide Binding Event Finding and Motif Discovery Reveals Transcription Factor Spatial Binding Constraints. PLoS Computational Biology. 2012, 8(8):e1002638.
5. Zhang X, Robertson G, Krzywinski M, Ning K, Droit A, Jones S, Gottardo R. PICS: Probabilistic Inference for ChIP-seq. Biometrics. 2010, 66.