Identify compounds that may target the phenotype associated with a user-provided differential expression profile by comparing that profile against a correlation matrix of gene expression and drug sensitivity.
predictTargetingDrugs(
  input,
  expressionDrugSensitivityCor,
  method = c("spearman", "pearson", "gsea"),
  geneSize = 150,
  isDrugActivityDirectlyProportionalToSensitivity = NULL,
  threads = 1,
  chunkGiB = 1,
  verbose = FALSE
)
input
Named numeric vector of differentially expressed genes whose names are gene identifiers and whose values are a statistic that represents the significance and magnitude of differential expression (e.g. t-statistics); or character vector of gene symbols composing a gene set that is tested for enrichment in reference data (only used if method includes gsea)
expressionDrugSensitivityCor
Matrix or character: correlation matrix of gene expression (rows) and drug sensitivity (columns) across cell lines, or path to a file containing such data; see loadExpressionDrugSensitivityAssociation().
method
Character: comparison method (spearman, pearson or gsea; multiple methods may be selected at once)
geneSize
Numeric: number of top up-/down-regulated genes to use as gene sets to test for enrichment in reference data; if a 2-length numeric vector, the first element is the number of top up-regulated genes and the second element is the number of top down-regulated genes used to create gene sets; only used if method includes gsea and if input is not a gene set
isDrugActivityDirectlyProportionalToSensitivity
Boolean: are the values used for drug activity directly proportional to drug sensitivity? If NULL, the argument expressionDrugSensitivityCor must have a non-NULL value for attribute isDrugActivityDirectlyProportionalToSensitivity.
threads
Integer: number of parallel threads
chunkGiB
Numeric: if expressionDrugSensitivityCor is a path to an HDF5 file (.h5 extension), that file is loaded and processed in chunks of the given size in gibibytes (GiB); lower values decrease peak RAM usage (see details below)
verbose
Boolean: print additional details?
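The geneSize argument can be illustrated with a minimal base R sketch. The helper topGeneSets() and the toy statistics below are hypothetical and only show how a named numeric vector might be split into top up- and down-regulated gene sets; this is not the package's internal code.

```r
# Toy differential expression statistics (hypothetical values, not real data)
stats <- c(GENE_A = 4.2, GENE_B = -3.1, GENE_C = 0.3, GENE_D = 2.5,
           GENE_E = -5.0, GENE_F = -0.7)

# Hypothetical helper: select top up- and down-regulated genes by statistic
topGeneSets <- function(stats, geneSize = 150) {
    # A single value is applied to both directions
    if (length(geneSize) == 1) geneSize <- rep(geneSize, 2)
    ordered <- sort(stats, decreasing = TRUE)
    nUp   <- min(geneSize[[1]], length(ordered))
    nDown <- min(geneSize[[2]], length(ordered))
    list(up   = names(head(ordered, nUp)),
         down = names(tail(ordered, nDown)))
}

# First element: number of up-regulated genes; second: down-regulated genes
sets <- topGeneSets(stats, geneSize = c(2, 1))
sets$up    # "GENE_A" "GENE_D"
sets$down  # "GENE_E"
```

In this sketch, the two sets can overlap when geneSize is large relative to the number of genes in the input.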
Data table with correlation and/or GSEA score results
If a file path to a valid HDF5 (.h5) file is provided instead of a data matrix, that file can be loaded and processed in chunks of size chunkGiB, resulting in decreased peak memory usage.
The default value of 1 GiB (1 GiB = 1024^3 bytes) allows loading chunks of ~10000 columns and 14000 rows (10000 * 14000 * 8 bytes / 1024^3 = 1.04 GiB).
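The chunk size arithmetic can be reproduced in base R. The helpers matrixGiB() and columnsPerChunk() are hypothetical names used for illustration only, and the sketch assumes dense double-precision values (8 bytes each); how the package actually splits the file may differ.

```r
# Size in GiB of a dense numeric (double) matrix: 8 bytes per value
matrixGiB <- function(nrow, ncol, bytesPerValue = 8) {
    nrow * ncol * bytesPerValue / 1024^3
}
round(matrixGiB(14000, 10000), 2)  # 1.04

# Hypothetical helper: number of whole columns that fit in one chunk
columnsPerChunk <- function(nrow, chunkGiB = 1, bytesPerValue = 8) {
    floor(chunkGiB * 1024^3 / (nrow * bytesPerValue))
}
columnsPerChunk(14000)                  # 9586
columnsPerChunk(14000, chunkGiB = 0.5)  # 4793
```

Halving chunkGiB roughly halves the number of columns held in memory at once, at the cost of more read operations.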
When method = "gsea", weighted connectivity scores (WTCS) are calculated (https://clue.io/connectopedia/cmap_algorithms).
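As a rough illustration of how a WTCS combines two enrichment scores, the sketch below follows the definition given in the linked Connectopedia page: the score is zero when the up- and down-regulated gene sets are enriched in the same direction, and half the difference of the two enrichment scores otherwise. The function wtcs() is a hypothetical helper, the enrichment scores are made-up inputs, and the weighted enrichment statistic itself is not computed here.

```r
# Hypothetical helper: combine enrichment scores of the up- and down-regulated
# gene sets into a weighted connectivity score
wtcs <- function(esUp, esDown) {
    # Same direction of enrichment for both sets: no connectivity
    if (sign(esUp) == sign(esDown)) return(0)
    (esUp - esDown) / 2
}

wtcs(0.6, -0.4)  # 0.5
wtcs(0.6, 0.2)   # 0
wtcs(-0.3, 0.5)  # -0.4
```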
Other functions related to the prediction of targeting drugs: as.table.referenceComparison(), listExpressionDrugSensitivityAssociation(), loadExpressionDrugSensitivityAssociation(), plot.referenceComparison(), plotTargetingDrugsVSsimilarPerturbations()
# Example of a differential expression profile
data("diffExprStat")
# Load expression and drug sensitivity association derived from GDSC data
gdsc <- loadExpressionDrugSensitivityAssociation("GDSC 7")
#> Loading data from expressionDrugSensitivityCorGDSC7.qs...
# Predict targeting drugs on a differential expression profile
predictTargetingDrugs(diffExprStat, gdsc)
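To give an intuition for the correlation-based comparison without downloading reference data, the sketch below uses a made-up reference matrix and input vector. All object names and values are hypothetical. It assumes the comparison is a per-drug rank correlation over shared genes, which is one plausible reading of the description above and not necessarily what the package does internally.

```r
set.seed(1)
genes <- paste0("GENE_", 1:20)

# Toy reference: correlations of gene expression (rows) with drug
# sensitivity (columns); values are random, not real data
ref <- matrix(runif(20 * 3, min = -1, max = 1), nrow = 20,
              dimnames = list(genes, c("drugA", "drugB", "drugC")))

# Toy input statistics constructed to track the drugB column
input <- setNames(ref[, "drugB"] * 5 + rnorm(20, sd = 0.1), genes)

# Compare the input against every drug using genes present in both
common   <- intersect(names(input), rownames(ref))
spearman <- apply(ref[common, , drop = FALSE], 2, cor,
                  y = input[common], method = "spearman")

# Rank drugs by correlation with the input profile
sort(spearman, decreasing = TRUE)
```

How a high or low correlation should be interpreted depends on whether the drug activity values are directly proportional to sensitivity, which is what the isDrugActivityDirectlyProportionalToSensitivity argument describes.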