NanoCAGE-XL and CapFilter: an approach to genome wide identification of high confidence transcription start sites
Publication Online:
http://www.biomedcentral.com/1471-2164/16/597
About this study:
Identifying the transcription start sites (TSS) of genes is essential for characterizing promoter regions. Several protocols have been developed to capture the 5' end of transcripts via Cap Analysis of Gene Expression (CAGE) or linker-ligation strategies such as Paired-End Analysis of Transcription Start Sites (PEAT), but these often require large amounts of tissue. More recently, nanoCAGE was developed for sequencing on the Illumina GAIIx to overcome this limitation. Here we present the first publicly available adaptation of nanoCAGE for sequencing on recent ultra-high throughput platforms such as the Illumina HiSeq-2000, together with CapFilter, a computational pipeline that greatly increases confidence in TSS identification. We report excellent gene coverage, reproducibility, and precision in transcription start site discovery for samples from Arabidopsis thaliana roots. nanoCAGE-XL together with CapFilter allows for genome wide identification of high confidence transcription start sites in large eukaryotic genomes.
Protocol
Supplementary Materials
Software
Supplementary Figures
- nanoCAGE-XL_Additional_datafile_2.pdf: Figure S1: Effect of rRNA depletion on nanoCAGE library profile. Panel a: Library constructed with total RNA as template. Panel b: Library constructed with Ribo-Zero depleted RNA as template. Figure S2: Examples of nanoCAGE TSS peak distributions before and after G'-filtering for experiments 2 and 3. Figure S3: Examples of CAGE TSS peak distributions before and after G'-filtering.
Supplementary Tables
- nanoCAGE-XL_Additional_datafile_3.xlsx: Table S1: Percent of reads with mismatch at the 1st base (% X1) and 2nd base (% X2).
- nanoCAGE-XL_Additional_datafile_4.xlsx: Table S2: Summary of Peak Locations when applying G' Filtering at a 50% Cutoff.
- nanoCAGE-XL_Additional_datafile_5.xlsx: Table S3: Spearman's Rho for pairwise comparisons of peak RPMs.
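For readers unfamiliar with the statistic reported in Table S3, the following is a minimal sketch of how reads-per-million (RPM) values and pairwise Spearman's rho can be computed for a set of peaks. It is an illustration only and is not taken from CapFilter: the function names, the library names (rep1, rep2, rep3), and the read counts are all hypothetical, and the study's own normalization and peak matching may differ.

```python
from itertools import combinations

def reads_per_million(counts):
    """Scale raw peak read counts so that the library sums to one million."""
    total = sum(counts)
    if total == 0:
        raise ValueError("library has no reads")
    return [c * 1_000_000 / total for c in counts]

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical raw read counts for the same five peaks in three libraries.
libraries = {
    "rep1": [120, 30, 450, 5, 80],
    "rep2": [100, 45, 400, 10, 60],
    "rep3": [5, 80, 30, 450, 120],
}
rpm = {name: reads_per_million(c) for name, c in libraries.items()}

# One rho value per unordered pair of libraries.
rho = {(a, b): spearman_rho(rpm[a], rpm[b]) for a, b in combinations(rpm, 2)}
```

Because Spearman's rho depends only on ranks, scaling each library to RPM does not change the coefficient; the normalization matters for comparing peak heights across libraries, not for the rank correlation itself.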
Citation
If any of these tools or materials are used in work that results in a publication, we would appreciate citation of the following article:
Cumbie JS, Ivanchenko MG, and Megraw M. (2015). NanoCAGE-XL and CapFilter: an approach to genome wide identification of high confidence transcription start sites, BMC Genomics, 16:597.