| GDCprepare {TCGAbiolinks} | R Documentation |
Reads the downloaded data and prepares it as an R object
GDCprepare(query, save = FALSE, save.filename, directory = "GDCdata",
summarizedExperiment = TRUE, remove.files.prepared = FALSE,
add.gistic2.mut = NULL, mut.pipeline = "mutect2",
mutant_variant_classification = c("Frame_Shift_Del", "Frame_Shift_Ins",
"Missense_Mutation", "Nonsense_Mutation", "Splice_Site", "In_Frame_Del",
"In_Frame_Ins", "Translation_Start_Site", "Nonstop_Mutation"))
query |
A query object returned by the GDCquery function |
save |
Save result as RData object? |
save.filename |
Name of the file to be saved. If empty, a name will be generated automatically |
directory |
Directory/Folder where the data was downloaded. Default: GDCdata |
summarizedExperiment |
Create a SummarizedExperiment? Default: TRUE (if possible) |
remove.files.prepared |
Remove the files read? Default: FALSE. This argument is considered only if the save argument is set to TRUE |
add.gistic2.mut |
If a list of genes (gene symbols) is given, columns with GISTIC2 results from GDAC Firehose (hg19) and a column indicating whether that gene is mutated (hg38) will be added to the sample matrix in the SummarizedExperiment object. The mutation column is TRUE or FALSE; use the MAF file for more information. |
mut.pipeline |
Considered only if add.gistic2.mut is not NULL. Four separate variant calling pipelines are implemented for GDC data harmonization. Options: muse, varscan2, somaticsniper, mutect2. For more information: https://gdc-docs.nci.nih.gov/Data/Bioinformatics_Pipelines/DNA_Seq_Variant_Calling_Pipeline/ |
mutant_variant_classification |
Variant classifications used to decide whether a sample is considered mutant. Default: "Frame_Shift_Del", "Frame_Shift_Ins", "Missense_Mutation", "Nonsense_Mutation", "Splice_Site", "In_Frame_Del", "In_Frame_Ins", "Translation_Start_Site", "Nonstop_Mutation" |
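The interplay between add.gistic2.mut and mutant_variant_classification can be illustrated with a small base R sketch. It is not the package's internal code. It only shows one plausible way a per-sample TRUE/FALSE mutation column could be derived from a MAF-like table. The toy data, the helper `flag_mutants`, and the output column names (`mut_<gene>`) are hypothetical; only the MAF column names and the classification values come from the MAF convention and the defaults above.

```r
# Toy MAF-like table (hypothetical data; column names follow the MAF convention)
maf <- data.frame(
  Hugo_Symbol            = c("PTEN", "PTEN", "FOXJ1", "TP53"),
  Tumor_Sample_Barcode   = c("S1", "S2", "S1", "S3"),
  Variant_Classification = c("Missense_Mutation", "Silent",
                             "Frame_Shift_Del", "Nonsense_Mutation"),
  stringsAsFactors = FALSE
)

# Default classifications listed for mutant_variant_classification
mutant_classes <- c("Frame_Shift_Del", "Frame_Shift_Ins", "Missense_Mutation",
                    "Nonsense_Mutation", "Splice_Site", "In_Frame_Del",
                    "In_Frame_Ins", "Translation_Start_Site", "Nonstop_Mutation")

# Hypothetical helper: one logical column per gene, TRUE when the sample
# carries at least one variant of that gene in an accepted classification
flag_mutants <- function(maf, samples, genes, mutant_classes) {
  hits <- maf[maf$Variant_Classification %in% mutant_classes, ]
  out <- data.frame(sample = samples, stringsAsFactors = FALSE)
  for (g in genes) {
    mutated <- hits$Tumor_Sample_Barcode[hits$Hugo_Symbol == g]
    out[[paste0("mut_", g)]] <- samples %in% mutated
  }
  out
}

mut_flags <- flag_mutants(maf,
                          samples = c("S1", "S2", "S3"),
                          genes = c("PTEN", "FOXJ1"),
                          mutant_classes = mutant_classes)
print(mut_flags)
```

In this toy case sample S2 has a PTEN variant, but it is classified as "Silent", which is not in the list, so S2 is not flagged as mutant for PTEN.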
A SummarizedExperiment or a data.frame
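Because the return type depends on the data and on the summarizedExperiment argument, downstream code may want to branch on it. The following is a minimal sketch under that assumption. The helper `summarize_prepared` and the toy data frame are hypothetical. `SummarizedExperiment::assay` is only called when the object really is a SummarizedExperiment, so the sketch also runs without that package installed as long as a data.frame is passed.

```r
# Hypothetical helper: report what kind of object was returned and its size
summarize_prepared <- function(x) {
  if (methods::is(x, "SummarizedExperiment")) {
    # Features x samples matrix stored in the first assay
    list(kind = "SummarizedExperiment",
         dims = dim(SummarizedExperiment::assay(x)))
  } else if (is.data.frame(x)) {
    # For example, a MAF or copy number segment table
    list(kind = "data.frame", dims = dim(x))
  } else {
    stop("Unexpected object class: ", paste(class(x), collapse = ", "))
  }
}

# Toy stand-in for a prepared segment table (hypothetical values)
toy <- data.frame(Chromosome = c("1", "2"), Segment_Mean = c(0.1, -0.3))
info <- summarize_prepared(toy)
str(info)
```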
query <- GDCquery(project = "TCGA-KIRP",
data.category = "Simple Nucleotide Variation",
data.type = "Masked Somatic Mutation",
workflow.type = "MuSE Variant Aggregation and Masking")
GDCdownload(query, method = "api", directory = "maf")
maf <- GDCprepare(query, directory = "maf")
query <- GDCquery(project = "TCGA-ACC",
data.category = "Copy number variation",
legacy = TRUE,
file.type = "hg19.seg",
barcode = c("TCGA-OR-A5LR-01A-11D-A29H-01", "TCGA-OR-A5LJ-10A-01D-A29K-01"))
# data will be saved in GDCdata/TCGA-ACC/legacy/Copy_number_variation/Copy_number_segmentation
GDCdownload(query, method = "api")
acc.cnv <- GDCprepare(query)
## Not run:
query <- GDCquery(project = "TCGA-GBM",
legacy = TRUE,
data.category = "Gene expression",
data.type = "Gene expression quantification",
platform = "Illumina HiSeq",
file.type = "normalized_results",
experimental.strategy = "RNA-Seq")
GDCdownload(query, method = "api")
data <- GDCprepare(query,add.gistic2.mut = c("PTEN","FOXJ1"))
## End(Not run)