R/generatNullGeneric.R
generateNull.Rd
This function generates a number of random gene sets that
have the same number of genes as the scored gene set. It scores each random
gene set and returns a matrix of scores for all samples.
The empirical scores are used to calculate the empirical p-values and plot
the null distribution. The implementation uses BiocParallel::bplapply()
for easy access to parallel backends. To generate a proper null distribution, pass the same values to the upSet, downSet, centerScore and knownDirection arguments as were passed to the simpleScore() function.
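The procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the singscore implementation: the rank matrix is simulated, and toyScore() is a hypothetical stand-in for the real scoring function (it simply averages scaled ranks).

```r
# Minimal sketch of a permutation null (NOT the singscore implementation).
set.seed(1)

# Simulated rank matrix: 100 genes (rows) x 4 samples (columns),
# each column holding a permutation of the ranks 1..100.
nGenes <- 100
rankMat <- replicate(4, sample(nGenes))
rownames(rankMat) <- paste0("gene", seq_len(nGenes))
colnames(rankMat) <- paste0("sample", 1:4)

# Hypothetical scoring function: mean rank of the set, scaled to (0, 1].
toyScore <- function(geneSet, rankMat) {
  colMeans(rankMat[geneSet, , drop = FALSE]) / nrow(rankMat)
}

geneSet <- paste0("gene", 1:10)  # the gene set that was scored
B <- 50                          # number of random gene sets

# Draw B random gene sets of the same size as geneSet and score each one.
nullScores <- t(vapply(seq_len(B), function(i) {
  randomSet <- sample(rownames(rankMat), length(geneSet))
  toyScore(randomSet, rankMat)
}, numeric(ncol(rankMat))))

dim(nullScores)  # one row per random gene set, one column per sample
```

Keeping the random sets the same size as the scored set matters because the spread of a mean-rank score depends on how many genes are averaged.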
generateNull(
upSet,
downSet = NULL,
rankData,
subSamples = NULL,
centerScore = TRUE,
knownDirection = TRUE,
B = 1000,
ncores = 1,
seed = sample.int(1e+06, 1),
useBPPARAM = NULL
)
# S4 method for vector,ANY
generateNull(
upSet,
downSet = NULL,
rankData,
subSamples = NULL,
centerScore = TRUE,
knownDirection = TRUE,
B = 1000,
ncores = 1,
seed = sample.int(1e+06, 1),
useBPPARAM = NULL
)
# S4 method for GeneSet,ANY
generateNull(
upSet,
downSet = NULL,
rankData,
subSamples = NULL,
centerScore = TRUE,
knownDirection = TRUE,
B = 1000,
ncores = 1,
seed = sample.int(1e+06, 1),
useBPPARAM = NULL
)
# S4 method for vector,vector
generateNull(
upSet,
downSet = NULL,
rankData,
subSamples = NULL,
centerScore = TRUE,
knownDirection = TRUE,
B = 1000,
ncores = 1,
seed = sample.int(1e+06, 1),
useBPPARAM = NULL
)
# S4 method for GeneSet,GeneSet
generateNull(
upSet,
downSet = NULL,
rankData,
subSamples = NULL,
centerScore = TRUE,
knownDirection = TRUE,
B = 1000,
ncores = 1,
seed = sample.int(1e+06, 1),
useBPPARAM = NULL
)
upSet: A GeneSet object or character vector of gene IDs of the up-regulated gene set, or of a gene set where the direction of the genes is not known.
downSet: A GeneSet object or character vector of gene IDs of the down-regulated gene set, or NULL when only a single gene set is provided.
rankData: A matrix object, the ranked gene expression matrix generated using the rankGenes() function (make sure this matrix is not modified, see details).
subSamples: A vector of sample labels/indices used to subset the rankData matrix. All samples are scored if not provided.
centerScore: A Boolean, specifying whether scores should be centered around 0; defaults to TRUE. Note: scores are never centered if knownDirection = FALSE.
knownDirection: A Boolean, determining whether the gene set should be considered directional. A gene set is directional if the type of genes in it is known, i.e. up- or down-regulated. Set this to FALSE if the gene set is composed of both up- AND down-regulated genes. Defaults to TRUE. This parameter becomes irrelevant when both upSet and downSet are provided.
B: integer, the number of permutation repeats, i.e. the number of random gene sets to be generated; defaults to 1000.
ncores: integer, the number of CPU cores the function can use.
seed: integer, the seed for randomisation.
useBPPARAM: the backend the function uses. If NULL is provided, the function uses the default parallel backend, which is the first on the list returned by BiocParallel::registered(), i.e. BiocParallel::registered()[[1]] for your machine. It can be changed explicitly by passing a BPPARAM.
A matrix of empirical scores for all samples
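As the description notes, the empirical scores are used to calculate empirical p-values. A minimal base R sketch of that step follows. The null matrix and observed scores are simulated, the layout (one row per random gene set, one column per sample) is an assumption, and the formula is one common convention rather than necessarily what singscore computes. In the package itself, getPvals() performs this step.

```r
# Sketch: empirical p-values from a matrix of null scores (simulated data).
set.seed(42)
B <- 1000
samples <- c("s1", "s2", "s3")

# Assumed layout: one row per random gene set, one column per sample.
nullScores <- matrix(rnorm(B * length(samples)), nrow = B,
                     dimnames = list(NULL, samples))

# Hypothetical observed scores for the same samples.
observed <- c(s1 = 0.1, s2 = 2.5, s3 = 5)

# Per sample, count the null scores at least as large as the observed score.
exceed <- colSums(sweep(nullScores, 2, observed[colnames(nullScores)], ">="))

# Adding 1 to numerator and denominator keeps every p-value above zero.
pvals <- (exceed + 1) / (B + 1)
pvals
```

With B random gene sets, the smallest p-value this convention can report is 1 / (B + 1), so B bounds the attainable resolution.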
Post about BiocParallel
browseVignettes("BiocParallel")
ranked <- rankGenes(toy_expr_se)
scoredf <- simpleScore(ranked, upSet = toy_gs_up, downSet = toy_gs_dn)
# find out what backends can be registered on your machine
BiocParallel::registered()
#> $MulticoreParam
#> class: MulticoreParam
#> bpisup: FALSE; bpnworkers: 4; bptasks: 0; bpjobname: BPJOB
#> bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
#> bpRNGseed: ; bptimeout: NA; bpprogressbar: FALSE
#> bpexportglobals: TRUE; bpexportvariables: FALSE; bpforceGC: FALSE
#> bpfallback: TRUE
#> bplogdir: NA
#> bpresultdir: NA
#> cluster type: FORK
#>
#> $SnowParam
#> class: SnowParam
#> bpisup: FALSE; bpnworkers: 4; bptasks: 0; bpjobname: BPJOB
#> bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
#> bpRNGseed: ; bptimeout: NA; bpprogressbar: FALSE
#> bpexportglobals: TRUE; bpexportvariables: TRUE; bpforceGC: FALSE
#> bpfallback: TRUE
#> bplogdir: NA
#> bpresultdir: NA
#> cluster type: SOCK
#>
#> $SerialParam
#> class: SerialParam
#> bpisup: FALSE; bpnworkers: 1; bptasks: 0; bpjobname: BPJOB
#> bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
#> bpRNGseed: ; bptimeout: NA; bpprogressbar: FALSE
#> bpexportglobals: FALSE; bpexportvariables: FALSE; bpforceGC: FALSE
#> bpfallback: FALSE
#> bplogdir: NA
#> bpresultdir: NA
#>
# the first one is the default backend
# to use more cores, e.g.: ncores <- parallel::detectCores() - 2
permuteResult <- generateNull(upSet = toy_gs_up, downSet = toy_gs_dn,
  rankData = ranked, centerScore = TRUE, B = 10, seed = 1, ncores = 1)
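The useBPPARAM argument accepts an explicit backend instead of the default one. A short sketch, assuming the singscore and BiocParallel packages are installed and reusing the toy data from the example above; SerialParam() is chosen here only because it runs everywhere, and the name nullSerial is illustrative.

```r
library(singscore)

ranked <- rankGenes(toy_expr_se)

# Run the permutations serially by passing an explicit BiocParallel backend
# rather than relying on BiocParallel::registered()[[1]].
nullSerial <- generateNull(
  upSet = toy_gs_up,
  downSet = toy_gs_dn,
  rankData = ranked,
  centerScore = TRUE,
  B = 10,
  seed = 1,
  useBPPARAM = BiocParallel::SerialParam()
)
```

Any other BiocParallel backend object listed by BiocParallel::registered() could be passed in the same way.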