Labeled inverse cell frequency: probability-based
idf_prob(expr, features = NULL, label, multi = TRUE, thres = 0)
Returns a matrix of inverse cell frequency scores.
$$\mathbf{IDF}_{i,j} = \log\left(1 + \frac{\frac{n_{i,j\in D}}{n_{j\in D}}}{\max\left(\frac{n_{i,j\in \hat D}}{n_{j\in \hat D}}\right) + 10^{-8}} \cdot \frac{n_{i,j\in D}}{n_{j\in D}}\right)$$
where \(n_{i,j\in D}\) is the number of cells in class \(D\) that contain feature \(i\), \(n_{j\in D}\) is the total number of cells in class \(D\), and \(\hat D\) denotes the classes other than \(D\), over which the maximum is taken.
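The formula can be sketched in base R as follows. This is an illustrative reimplementation, not the package's source. The helper name `idf_prob_sketch` is hypothetical, and the sketch rests on several assumptions: a cell "contains" a feature when its value exceeds `thres`, the small constant in the denominator is exposed as an `eps` argument defaulting to `1e-8`, and the `features` and `multi` arguments are ignored so that only the per-class maximum written in the formula is covered.

```r
# Hypothetical helper: a base-R sketch of the formula above, not smartid's code.
# expr:  features x cells numeric matrix
# label: class label for each cell (length ncol(expr))
# thres: assumed detection threshold; a cell "contains" a feature if expr > thres
# eps:   assumed small constant guarding against division by zero
idf_prob_sketch <- function(expr, label, thres = 0, eps = 1e-8) {
  label <- as.character(label)
  classes <- unique(label)
  stopifnot(length(label) == ncol(expr), length(classes) >= 2)

  detected <- expr > thres

  # frac[i, D]: fraction of cells in class D that contain feature i
  frac <- vapply(
    classes,
    function(d) rowMeans(detected[, label == d, drop = FALSE]),
    numeric(nrow(expr))
  )
  frac <- matrix(frac, nrow = nrow(expr),
                 dimnames = list(rownames(expr), classes))

  # score[i, D] = log(1 + frac_D / (max over the other classes + eps) * frac_D)
  score <- vapply(
    seq_along(classes),
    function(k) {
      max_other <- apply(frac[, -k, drop = FALSE], 1, max)
      log1p(frac[, k] / (max_other + eps) * frac[, k])
    },
    numeric(nrow(expr))
  )
  score <- matrix(score, nrow = nrow(expr),
                  dimnames = list(rownames(expr), classes))

  # Expand to one column per cell, so each cell carries its own class's score
  score[, label, drop = FALSE]
}

# Toy input: 2 features, 4 cells, two cells per class
toy <- matrix(c(1, 1, 0, 1,
                0, 1, 1, 1),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("g1", "g2"), NULL))
res <- idf_prob_sketch(toy, label = c("A", "A", "B", "B"))
print(res)
```

In the toy input, `g1` is present in every class-A cell and in half of the class-B cells, so its class-A score is close to `log(3)` and its class-B score is close to `log(1.25)`. Cells that share a label receive identical columns, which is the same pattern seen in the example output of `idf_prob`.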
data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::idf_prob(data, label = sample(c("A", "B"), 10, replace = TRUE))
#> A A A A A A A
#> 1 0.9287132 0.9287132 0.9287132 0.9287132 0.9287132 0.9287132 0.9287132
#> 2 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472
#> 3 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472
#> 4 0.5685047 0.5685047 0.5685047 0.5685047 0.5685047 0.5685047 0.5685047
#> 5 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472
#> 6 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472
#> 7 0.4462871 0.4462871 0.4462871 0.4462871 0.4462871 0.4462871 0.4462871
#> 8 0.5685047 0.5685047 0.5685047 0.5685047 0.5685047 0.5685047 0.5685047
#> 9 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472
#> 10 0.5773154 0.5773154 0.5773154 0.5773154 0.5773154 0.5773154 0.5773154
#> A B B
#> 1 0.9287132 0.2513144 0.2513144
#> 2 0.6931472 0.6931472 0.6931472
#> 3 0.6931472 0.6931472 0.6931472
#> 4 0.5685047 0.7621400 0.7621400
#> 5 0.6931472 0.6931472 0.6931472
#> 6 0.6931472 0.6931472 0.6931472
#> 7 0.4462871 0.8472979 0.8472979
#> 8 0.5685047 0.7621400 0.7621400
#> 9 0.6931472 0.6931472 0.6931472
#> 10 0.5773154 0.3364722 0.3364722
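As a worked check of the formula against the first row of this output, the arithmetic can be reproduced by hand. The counts used here are inferred from the printed values rather than stated anywhere in the example (the labels are sampled at random): eight cells labeled A and two labeled B, with feature 1 present in 7 of the 8 A cells and 1 of the 2 B cells. The constant `eps = 1e-8` is likewise an assumption.

```r
# Counts inferred from the printed output, not given in the source:
# feature 1 is present in 7 of 8 class-A cells and 1 of 2 class-B cells.
p_A <- 7 / 8
p_B <- 1 / 2
eps <- 1e-8  # assumed small constant in the denominator

# Each class is scored against the other class's detection fraction
score_A <- log1p(p_A / (p_B + eps) * p_A)
score_B <- log1p(p_B / (p_A + eps) * p_B)

print(c(A = score_A, B = score_B))
```

Under these assumptions the two scores agree with the printed 0.9287132 (class A) and 0.2513144 (class B) to about six decimal places. Rows where both classes show 0.6931472, which is `log(2)`, correspond to features with equal detection fractions in both classes.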