Identify compounds that may target the phenotype associated with a user-provided differential expression profile by comparing that profile against a correlation matrix of gene expression and drug sensitivity.

predictTargetingDrugs(
  input,
  expressionDrugSensitivityCor,
  method = c("spearman", "pearson", "gsea"),
  geneSize = 150,
  isDrugActivityDirectlyProportionalToSensitivity = NULL,
  threads = 1,
  chunkGiB = 1,
  verbose = FALSE
)

Arguments

input

Named numeric vector of differentially expressed genes, whose names are gene identifiers and whose values are a statistic that represents both the significance and the magnitude of differential expression (e.g. t-statistics); or character vector of gene symbols composing a gene set that is tested for enrichment in reference data (only used if method includes gsea)

expressionDrugSensitivityCor

Matrix or character: correlation matrix of gene expression (rows) and drug sensitivity (columns) across cell lines, or path to a file containing such data; see loadExpressionDrugSensitivityAssociation().

method

Character: comparison method (spearman, pearson or gsea; multiple methods may be selected at once)

geneSize

Numeric: number of top up-/down-regulated genes to use as gene sets to test for enrichment in reference data; if a 2-length numeric vector, the first element is the number of top up-regulated genes and the second element is the number of top down-regulated genes used to create gene sets; only used if method includes gsea and if input is not a gene set

isDrugActivityDirectlyProportionalToSensitivity

Boolean: are the values used for drug activity directly proportional to drug sensitivity? If NULL, the argument expressionDrugSensitivityCor must have a non-NULL value for attribute isDrugActivityDirectlyProportionalToSensitivity.

threads

Integer: number of parallel threads

chunkGiB

Numeric: if expressionDrugSensitivityCor is a path to an HDF5 file (.h5 extension), that file is loaded and processed in chunks of the given size in gibibytes (GiB); lower values decrease peak RAM usage (see "Process data by chunks" below)

verbose

Boolean: print additional details?
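
To illustrate how geneSize could translate into gene sets, the sketch below splits a named vector of statistics into top up- and down-regulated genes. The helper topGeneSets() and the toy values are hypothetical and only mirror the argument description above; the package's internal implementation may differ.

```r
# Hypothetical helper (not part of the package): derive gene sets from a
# named vector of differential expression statistics
topGeneSets <- function(stat, geneSize = 150) {
    # A single value is applied to both up- and down-regulated genes
    if (length(geneSize) == 1) geneSize <- rep(geneSize, 2)
    # Sort from most up-regulated to most down-regulated (names are kept)
    ordered <- sort(stat, decreasing = TRUE)
    list(up   = names(head(ordered, geneSize[[1]])),
         down = names(tail(ordered, geneSize[[2]])))
}

# Toy statistics (made-up values for illustration only)
stat <- c(geneA = 5.2, geneB = -3.1, geneC = 0.4, geneD = 2.7, geneE = -4.8)

# Two top up-regulated genes and one top down-regulated gene
sets <- topGeneSets(stat, geneSize = c(2, 1))
print(sets)
```

This sketch does not guard against overlapping gene sets when the requested sizes exceed half the number of genes.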

Value

Data table with correlation and/or GSEA score results

Process data by chunks

If a file path to a valid HDF5 (.h5) file is provided instead of a data matrix, that file can be loaded and processed in chunks of size chunkGiB, resulting in decreased peak memory usage.

The default value of 1 GiB (1 GiB = 1024^3 bytes) allows loading chunks of ~10000 columns and 14000 rows (10000 * 14000 * 8 bytes / 1024^3 = 1.04 GiB).
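
The arithmetic above can be reproduced to estimate a suitable chunkGiB for other matrix sizes. This is a minimal sketch: chunkSizeGiB() and colsPerChunk() are hypothetical helpers, and they assume 8-byte double values and chunks that span all rows and a subset of columns.

```r
# Hypothetical helper: memory (in GiB) of a dense numeric matrix
chunkSizeGiB <- function(nrows, ncols, bytesPerValue = 8) {
    nrows * ncols * bytesPerValue / 1024^3
}

# Hypothetical helper: number of whole columns that fit in a chunk
# (assumes each chunk holds every row for a subset of columns)
colsPerChunk <- function(chunkGiB, nrows, bytesPerValue = 8) {
    floor(chunkGiB * 1024^3 / (nrows * bytesPerValue))
}

# Matrix of 14000 rows and 10000 columns, as in the text above
defaultChunk <- chunkSizeGiB(nrows = 14000, ncols = 10000)
print(round(defaultChunk, 2))

# Columns that fit within 1 GiB and 0.5 GiB for 14000 rows
oneGiB  <- colsPerChunk(1, nrows = 14000)
halfGiB <- colsPerChunk(0.5, nrows = 14000)
print(c(oneGiB, halfGiB))
```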

GSEA score

When method includes gsea, weighted connectivity scores (WTCS) are calculated (https://clue.io/connectopedia/cmap_algorithms).
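
As a rough illustration of how a WTCS combines the enrichment scores of the up- and down-regulated gene sets, the sketch below follows the definition given in the linked CMap documentation: the score is half the difference between the two enrichment scores when their signs differ, and zero otherwise. The function wtcs() is hypothetical, and the enrichment scores are made-up inputs rather than values computed by the package.

```r
# Hypothetical helper: weighted connectivity score from the enrichment
# scores (ES) of the up- and down-regulated gene sets
wtcs <- function(esUp, esDown) {
    # Scores with the same sign carry no consistent signal and yield 0
    ifelse(sign(esUp) != sign(esDown), (esUp - esDown) / 2, 0)
}

# Opposite signs: up set enriched at the top, down set at the bottom
opposite <- wtcs(esUp = 0.5, esDown = -0.25)

# Same sign: both sets enriched on the same side of the ranked list
sameSign <- wtcs(esUp = 0.3, esDown = 0.2)

print(c(opposite = opposite, sameSign = sameSign))
```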

Examples

# Example of a differential expression profile
data("diffExprStat")

# Load expression and drug sensitivity association derived from GDSC data
gdsc <- loadExpressionDrugSensitivityAssociation("GDSC 7")
#> Loading data from expressionDrugSensitivityCorGDSC7.qs...

# Predict targeting drugs on a differential expression profile
predictTargetingDrugs(diffExprStat, gdsc)
#> Subsetting data based on 11396 intersecting genes (85% of the 13451 input genes)...
#> Comparing against 266 GDSC 7 compounds (983 cell lines) using 'spearman, pearson, gsea' (gene size of 150)...
#> Comparison performed in 2.5 secs
#>      compound spearman_coef spearman_pvalue spearman_qvalue pearson_coef
#>        <char>         <num>           <num>           <num>        <num>
#>   1:     1047    0.14532331    7.908329e-55    2.103616e-52   0.13098933
#>   2:      207    0.10447948    4.950795e-29    2.633823e-27   0.09295384
#>   3:      157    0.12935630    1.020442e-43    1.357188e-41   0.11655136
#>   4:     1091    0.08481582    1.194387e-19    1.512890e-18   0.08245367
#>   5:      110    0.11857369    5.806289e-37    5.148243e-35   0.11346259
#>  ---                                                                    
#> 262:     1133   -0.02817757    2.627306e-03    3.529613e-03  -0.02761751
#> 263:     1149   -0.01812593    5.299930e-02    6.210491e-02  -0.02690099
#> 264:       35   -0.02099097    2.503708e-02    3.069061e-02  -0.03075034
#> 265:     1029   -0.04152497    9.237123e-06    1.565016e-05  -0.04525941
#> 266:     1031   -0.08489029    1.109885e-19    1.476147e-18  -0.08808410
#>      pearson_pvalue pearson_qvalue       GSEA spearman_rank pearson_rank
#>               <num>          <num>      <num>         <num>        <num>
#>   1:   8.577544e-45   2.281627e-42  0.0000000             1            1
#>   2:   2.693140e-23   1.023393e-21  0.2769544             5            7
#>   3:   9.209531e-36   1.224868e-33  0.0000000             2            2
#>   4:   1.184660e-18   2.100798e-17  0.3118949            20           14
#>   5:   5.716381e-34   5.068524e-32  0.0000000             3            3
#>  ---                                                                    
#> 262:   3.193570e-03   4.591836e-03 -0.2605952           263          261
#> 263:   4.079592e-03   5.772189e-03 -0.3444514           259          260
#> 264:   1.026803e-03   1.587963e-03 -0.3038956           261          263
#> 265:   1.341660e-06   2.949435e-06 -0.3298198           264          264
#> 266:   4.491968e-21   1.194863e-19 -0.3800912           266          266
#>      GSEA_rank rankProduct_rank
#>          <num>            <num>
#>   1:     130.5                1
#>   2:       9.0                2
#>   3:     130.5                3
#>   4:       3.0                4
#>   5:     130.5                5
#>  ---                           
#> 262:     257.0              262
#> 263:     265.0              263
#> 264:     262.0              264
#> 265:     264.0              265
#> 266:     266.0              266