| remove_redundancy,tbl_df-method {tidybulk} | R Documentation |
remove_redundancy
## S4 method for signature 'tbl_df'
remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column = NULL,
  Dim_b_column = NULL
)
.data |
A 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> | |
.element |
The name of the element column (normally samples). |
.feature |
The name of the feature column (normally transcripts/genes). |
.abundance |
The name of the column containing the numerical values the clustering is based on (normally transcript abundance). |
method |
A character string. The cluster algorithm to use; at the moment, k-means is the only algorithm included. |
of_samples |
A boolean. If the input is a tidybulk object, indicates whether the element column is the sample column or the transcript column. |
correlation_threshold |
A real number between 0 and 1. Used by the correlation-based calculation. |
top |
An integer. How many top genes to select for the correlation-based method. |
log_transform |
A boolean. Whether the value should be log-transformed (e.g., TRUE for RNA sequencing data). |
Dim_a_column |
A character string. Used by the reduced_dimension-based calculation. The column of one principal component. |
Dim_b_column |
A character string. Used by the reduced_dimension-based calculation. The column of another principal component. |
A tbl object with redundant elements (e.g., samples) dropped.
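To illustrate what a correlation_threshold-style filter does, here is a minimal base R sketch. It is not tidybulk's implementation and does not call remove_redundancy; the helper name drop_redundant, the toy matrix counts, and the greedy keep-first rule are all assumptions made for illustration. It keeps a sample only if its correlation with every sample already kept does not exceed the threshold.

```r
# Illustrative sketch only: NOT tidybulk's implementation.
# `drop_redundant` and `counts` are hypothetical names.

set.seed(1)
n_genes <- 50
base_a <- rnorm(n_genes)
base_b <- rnorm(n_genes)

# Toy abundance matrix: rows are features (genes), columns are elements (samples).
counts <- cbind(
  s1 = base_a,
  s2 = base_a + rnorm(n_genes, sd = 0.01), # near-duplicate of s1
  s3 = base_b                              # generated independently of s1
)

drop_redundant <- function(mat, correlation_threshold = 0.9) {
  # Pairwise Pearson correlation between columns (samples).
  cors <- cor(mat)
  keep <- character(0)
  for (s in colnames(mat)) {
    # Keep `s` only if it is not too correlated with any sample kept so far.
    if (all(cors[s, keep] <= correlation_threshold)) {
      keep <- c(keep, s)
    }
  }
  mat[, keep, drop = FALSE]
}

filtered <- drop_redundant(counts, correlation_threshold = 0.9)
colnames(filtered)
```

With this toy input, s2 is dropped because it is almost perfectly correlated with s1, while s3 is retained. Lowering the threshold makes the filter stricter; a threshold of 1 keeps every sample here.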