R/genesetGroups.R
findMsigClusters.Rd
This function identifies gene-set clusters from a gene-set overlap network
produced using vissE. Various graph clustering algorithms from the igraph
package can be used for clustering. The identified clusters are then
sorted by their size and, optionally, by a given statistic of interest
(clusters whose gene-sets have larger absolute statistics are ranked higher).
findMsigClusters(
ig,
genesetStat = NULL,
minSize = 2,
alg = igraph::cluster_walktrap,
algparams = list()
)
ig: an igraph object, containing a network of gene-set overlaps computed using computeMsigNetwork().
genesetStat: a named numeric, containing statistics for each gene-set that are to be used in cluster prioritisation. If NULL, clusters are prioritised based on their size (number of gene-sets in them).
minSize: a numeric, stating the minimum size a cluster can be (default is 2).
alg: a function from the igraph package that should be used to perform graph clustering (default is igraph::cluster_walktrap). The function should produce a communities object.
algparams: a list, specifying additional parameters that are to be passed to the graph clustering algorithm.
a list, containing gene-sets that belong to each cluster. Items in the list are organised based on prioritisation.
Gene-set clusters are identified using graph clustering and are prioritised based on a combination of cluster size and, optionally, a statistic of interest (e.g., enrichment scores). A product-of-ranks approach is used to prioritise clusters when gene-set statistics are available. In this approach, clusters are ranked by their size (largest to smallest) and by the median absolute statistic of the gene-sets within them (largest to smallest). The product of these two ranks is computed for each cluster, and clusters are then ordered by this product-of-ranks statistic (smallest to largest).
When prioritising using cluster size and gene-set statistics, if statistics for some gene-sets in the network are missing, only the size is used in cluster prioritisation.
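The product-of-ranks prioritisation described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name prioritise_clusters, the toy clusters and the statistics are all hypothetical, and tie handling in the package may differ.

```r
# Toy clusters (hypothetical): names are cluster IDs, values are gene-set names.
clusters <- list(
  A = c("gs1", "gs2", "gs3", "gs4"),
  B = c("gs5", "gs6"),
  C = c("gs7", "gs8", "gs9")
)

# Hypothetical per-gene-set statistics (e.g., enrichment scores).
stat <- c(gs1 = 0.2, gs2 = -0.1, gs3 = 0.3, gs4 = 0.1,
          gs5 = -1.2, gs6 = 0.8,
          gs7 = 2.0, gs8 = -2.5, gs9 = 1.5)

# Hypothetical helper: order clusters by the product of their size rank and
# their median-absolute-statistic rank (smallest product first).
prioritise_clusters <- function(clusters, stat = NULL) {
  size_rank <- rank(-lengths(clusters))  # largest cluster gets rank 1
  all_sets <- unlist(clusters, use.names = FALSE)

  # Fall back to size only if statistics are absent or incomplete.
  if (is.null(stat) || !all(all_sets %in% names(stat)) || anyNA(stat[all_sets])) {
    score <- size_rank
  } else {
    med_abs <- vapply(clusters, function(x) median(abs(stat[x])), numeric(1))
    score <- size_rank * rank(-med_abs)  # largest median |stat| gets rank 1
  }
  clusters[order(score)]
}

names(prioritise_clusters(clusters, stat))      # "C" "A" "B"
names(prioritise_clusters(clusters))            # "A" "C" "B" (size only)
names(prioritise_clusters(clusters, stat[-1]))  # "A" "C" "B" (gs1 missing)
```

Here cluster A is the largest but has the weakest statistics (rank product 1 x 3 = 3), while cluster C is second in size and first in median absolute statistic (2 x 1 = 2), so C is ranked ahead of A.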
data(hgsc)
ovlap <- computeMsigOverlap(hgsc, thresh = 0.25)
ig <- computeMsigNetwork(ovlap, hgsc)
findMsigClusters(ig)
#> $`1`
#> [1] "HALLMARK_BILE_ACID_METABOLISM" "HALLMARK_PEROXISOME"
#>
#> $`2`
#> [1] "HALLMARK_ESTROGEN_RESPONSE_EARLY" "HALLMARK_ESTROGEN_RESPONSE_LATE"
#>
#> $`3`
#> [1] "HALLMARK_COAGULATION" "HALLMARK_COMPLEMENT"
#>
#> $`4`
#> [1] "HALLMARK_GLYCOLYSIS" "HALLMARK_HYPOXIA"
#>
#> $`5`
#> [1] "HALLMARK_E2F_TARGETS" "HALLMARK_G2M_CHECKPOINT"
#>
#> $`6`
#> [1] "HALLMARK_INTERFERON_ALPHA_RESPONSE" "HALLMARK_INTERFERON_GAMMA_RESPONSE"
#>
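The call above relies on the defaults. A variation that supplies gene-set statistics and forwards an extra parameter to the clustering algorithm might look like the following sketch. The statistics are simulated purely for illustration, and the code assumes that the vertex names of the network are the gene-set names; steps is an argument of igraph::cluster_walktrap.

```r
library(vissE)
library(igraph)

data(hgsc)
ovlap <- computeMsigOverlap(hgsc, thresh = 0.25)
ig <- computeMsigNetwork(ovlap, hgsc)

# Simulated statistics, for illustration only (not real enrichment scores).
# Assumption: vertex names of `ig` are the gene-set names.
set.seed(1)
gs_names <- V(ig)$name
fake_stat <- setNames(rnorm(length(gs_names)), gs_names)

clusters <- findMsigClusters(
  ig,
  genesetStat = fake_stat,         # used together with cluster size
  minSize = 2,                     # minimum cluster size, as documented
  alg = igraph::cluster_walktrap,
  algparams = list(steps = 6)      # forwarded to the clustering algorithm
)
```

Because statistics are provided for every gene-set in the network, clusters would be ordered by the product of ranks rather than by size alone.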