R/top_markers.R
top_markers_glm.Rd
Calculate group mean scores using glm and order genes based on score differences
top_markers_glm(
data,
label,
n = 10,
family = gaussian(),
batch = NULL,
scale = TRUE,
use.mgm = TRUE,
pooled.sd = FALSE,
softmax = TRUE,
tau = 1
)
data: matrix, features in rows and samples in columns
label: a vector of group labels
n: integer, number of top genes returned for each group
family: family for glm, details in stats::glm()
batch: a vector of batch labels, default NULL
scale: logical, whether to scale data by row
use.mgm: logical, whether to scale data using scale_mgm()
pooled.sd: logical, whether to use pooled SD for scaling
softmax: logical, whether to apply softmax transformation to the output
tau: numeric, hyperparameter for softmax
a tibble with feature names, group labels and ordered processed scores
When family is gaussian() with the identity link (the default) and
the design matrix is full-rank, top_markers_glm() computes all
per-gene label coefficients in a single closed-form least-squares solve
via Matrix::solve(crossprod(X), crossprod(X, t(data))), avoiding
the per-gene glm() loop. For any other family, or a rank-deficient
design, the function automatically falls back to the legacy
apply(data, 1, glm(...)) path, so results are unchanged for users
who pass e.g. family = Gamma() or family = poisson().
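The equivalence between the closed-form solve and the per-gene loop can be illustrated with a minimal sketch. It is not the package's internal code: the simulated data, the no-intercept design `~ 0 + label`, and the names `coef_fast` and `coef_loop` are assumptions made for illustration, and base `solve()` stands in for `Matrix::solve()` so the snippet has no dependencies.

```r
set.seed(1)  # illustrative data only

n_genes <- 5
n_samples <- 12
data <- matrix(rnorm(n_genes * n_samples), n_genes,
               dimnames = list(paste0("g", seq_len(n_genes)), NULL))
label <- factor(rep(c("A", "B", "C"), each = 4))

# Assumed design: one indicator column per group and no intercept,
# so each coefficient is that group's mean for the gene.
X <- model.matrix(~ 0 + label)

# Closed-form least squares for all genes at once:
# solve (X'X) B = X'Y, where Y = t(data) is samples x genes.
coef_fast <- solve(crossprod(X), crossprod(X, t(data)))

# Per-gene path: one gaussian glm() fit per row of data.
coef_loop <- apply(data, 1, function(y) {
  coef(glm(y ~ 0 + label, family = gaussian()))
})

# Both are groups x genes matrices and agree up to numerical tolerance.
all.equal(unname(coef_fast), unname(coef_loop))
```

With a gaussian family and identity link, the glm() fit reduces to ordinary least squares, which is why a single solve over all genes can replace the loop. Other families need iterative fitting per gene, so no such shortcut applies to them.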
data <- matrix(rgamma(100, 2), 10, dimnames = list(1:10))
top_markers_glm(data, label = rep(c("A", "B"), 5))
#> # A tibble: 20 × 3
#> # Groups: .dot [2]
#> .dot Genes Scores
#> <chr> <chr> <dbl>
#> 1 A 6 0.252
#> 2 A 5 0.137
#> 3 A 2 0.116
#> 4 A 10 0.111
#> 5 A 8 0.105
#> 6 A 1 0.0939
#> 7 A 7 0.0666
#> 8 A 3 0.0453
#> 9 A 9 0.0382
#> 10 A 4 0.0353
#> 11 B 4 0.201
#> 12 B 9 0.186
#> 13 B 3 0.157
#> 14 B 7 0.107
#> 15 B 1 0.0757
#> 16 B 8 0.0675
#> 17 B 10 0.0643
#> 18 B 2 0.0610
#> 19 B 5 0.0520
#> 20 B 6 0.0283
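In the output above, the scores within each group sum to approximately 1, which is what a softmax over each group's scores would produce. A minimal sketch of a softmax with a `tau` parameter follows. The helper `softmax_tau()` is hypothetical, and dividing the scores by `tau` before exponentiating is an assumption about how `tau` is used, not a statement of the package's implementation.

```r
# Hypothetical helper: softmax with a temperature-like parameter tau.
softmax_tau <- function(x, tau = 1) {
  z <- x / tau        # assumed role of tau: rescale scores before exp()
  z <- z - max(z)     # subtract the maximum for numerical stability
  exp(z) / sum(exp(z))
}

scores <- c(1.2, 0.4, -0.3)   # made-up scores for illustration

p_default <- softmax_tau(scores, tau = 1)
p_sharp   <- softmax_tau(scores, tau = 0.5)  # smaller tau concentrates weight on the top score
p_flat    <- softmax_tau(scores, tau = 5)    # larger tau moves the weights toward uniform

sum(p_default)  # softmax output sums to 1
```

Under this assumption, lowering `tau` makes the top-ranked gene stand out more in each group, while raising it compresses the differences between genes.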