| windows_pipeline_quantification {FLAMES} | R Documentation |
This is the final step in the three-step Windows FLAMES pipeline. Run it after
windows_pipeline_isoforms, once read realignment has been performed.
windows_pipeline_quantification(pipeline_vars)
pipeline_vars |
the list returned from |
windows_pipeline_quantification returns a SummarizedExperiment object, or a SingleCellExperiment object when
it is used in the FLAMES single cell pipeline. The returned object contains a count
matrix as an assay, gene annotations under metadata, and a list of the other
output files generated by the pipeline. The pipeline also writes the following
files to the given outdir directory:
transcript_count.csv.gz - a transcript count matrix (also contained in the SummarizedExperiment)
isoform_annotated.filtered.gff3 - isoforms in gff3 format (also contained in the SummarizedExperiment)
transcript_assembly.fa - transcript sequences of the isoforms
align2genome.bam - sorted BAM file with reads aligned to the genome
realign2transcript.bam - sorted BAM file with reads realigned using transcript_assembly.fa as the reference
tss_tes.bedgraph - TSS TES enrichment for all reads (for QC)
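The files written to outdir can also be inspected directly. The following is a minimal sketch in base R and is not part of FLAMES: it writes a toy stand-in for transcript_count.csv.gz and reads it back into a matrix. The outdir path, the toy data, and the column layout (one transcript identifier column followed by one count column per sample) are illustrative assumptions, so check the header of the real file before relying on them.

```r
# Hypothetical output directory; in practice use the outdir given to the pipeline.
outdir <- file.path(tempdir(), "flames_out_sketch")
dir.create(outdir, showWarnings = FALSE)
count_file <- file.path(outdir, "transcript_count.csv.gz")

# Toy stand-in for the real file. ASSUMED layout: an identifier column
# followed by one count column per sample.
toy <- data.frame(transcript_id = c("tx1", "tx2"),
                  sample1 = c(5L, 0L),
                  sample2 = c(3L, 7L))
con <- gzfile(count_file, "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# Read the gzip-compressed CSV back through a gzfile connection.
counts <- read.csv(gzfile(count_file), stringsAsFactors = FALSE)

# Move the identifier column into the row names to get a numeric matrix.
count_matrix <- as.matrix(counts[, -1])
rownames(count_matrix) <- counts$transcript_id
```

Keeping the identifiers as row names means the matrix can be indexed by transcript, for example count_matrix["tx2", ].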
## Example Windows pipeline for BULK data. See the vignette for single cell data.
# download the two fastq files, move them to a folder to be merged together
temp_path <- tempfile()
bfc <- BiocFileCache::BiocFileCache(temp_path, ask=FALSE)
file_url <-
"https://raw.githubusercontent.com/OliverVoogd/FLAMESData/master/data"
# download each fastq file into the cache and look up its local path
fastq1 <- bfc[[names(BiocFileCache::bfcadd(bfc, "Fastq1",
    paste(file_url, "fastq/sample1.fastq.gz", sep="/")))]]
fastq2 <- bfc[[names(BiocFileCache::bfcadd(bfc, "Fastq2",
    paste(file_url, "fastq/sample2.fastq.gz", sep="/")))]]
# the downloaded fastq files need to be in one directory to be merged together
fastq_dir <- paste(temp_path, "fastq_dir", sep="/")
dir.create(fastq_dir)
file.copy(c(fastq1, fastq2), fastq_dir)
unlink(c(fastq1, fastq2)) # the original files can be deleted
# run the FLAMES bulk pipeline setup
# pipeline_variables <- bulk_windows_pipeline_setup(
#     annot=system.file("extdata/SIRV_anno.gtf", package="FLAMES"),
#     fastq=fastq_dir,
#     outdir=tempdir(),
#     genome_fa=system.file("extdata/SIRV_genomefa.fasta", package="FLAMES"),
#     config_file=system.file("extdata/SIRV_config_default.json", package="FLAMES"))
# read alignment is handled externally (for this example, a pre-aligned BAM file and its index are downloaded instead)
# genome_bam <- paste0(temp_path, "/align2genome.bam")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Genome BAM", paste(file_url, "align2genome.bam", sep="/")))]], genome_bam)
#
# genome_index <- paste0(temp_path, "/align2genome.bam.bai")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Genome BAM Index", paste(file_url, "align2genome.bam.bai", sep="/")))]], genome_index)
# pipeline_variables$genome_bam <- genome_bam
#
# # run the FLAMES bulk pipeline find isoforms step
# pipeline_variables <- windows_pipeline_isoforms(pipeline_variables)
#
# # read realignment is handled externally (for this example, a realigned BAM file and its index are downloaded instead)
# realign_bam <- paste0(temp_path, "/realign2transcript.bam")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Realign BAM", paste(file_url, "realign2transcript.bam", sep="/")))]], realign_bam)
#
# realign_index <- paste0(temp_path, "/realign2transcript.bam.bai")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Realign BAM Index", paste(file_url, "realign2transcript.bam.bai", sep="/")))]], realign_index)
# pipeline_variables$realign_bam <- realign_bam
#
# # finally, quantification, which returns a SummarizedExperiment object
# se <- windows_pipeline_quantification(pipeline_variables)