Labeled inverse cell frequency: probability based
idf_prob(expr, features = NULL, label, multi = TRUE, thres = 0)
A matrix of inverse cell frequency scores.
$$\mathbf{IDF_{i,j}} = \log\left(1+\frac{\frac{n_{i,j\in D}}{n_{j\in D}}}{\max\left(\frac{n_{i,j\in \hat D}}{n_{j\in \hat D}}\right)+ e^{-8}}\cdot\frac{n_{i,j\in D}}{n_{j\in D}}\right)$$ where \(n_{i,j\in D}\) is the number of cells in class \(D\) that contain feature \(i\), \(n_{j\in D}\) is the total number of cells in class \(D\), and \(\hat D\) denotes the classes other than \(D\); the maximum is taken over those other classes.
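To make the formula concrete, here is a minimal base-R sketch that reads it literally. It is not the package's implementation: `idf_prob_sketch` and its `eps` argument are hypothetical names introduced for illustration, `eps = exp(-8)` simply follows the formula as written (the constant used internally by the package may differ), the `features` and `multi` arguments are ignored, and at least two classes are assumed.

```r
# Hypothetical helper, NOT part of smartid: a literal base-R reading of the formula.
# Assumes `expr` is a features x cells matrix and `label` has one entry per cell.
idf_prob_sketch <- function(expr, label, thres = 0, eps = exp(-8)) {
  present <- expr > thres            # TRUE where feature i is detected in cell j
  classes <- unique(as.character(label))

  # Fraction of cells in each class that contain each feature (features x classes).
  frac <- do.call(cbind, lapply(classes, function(cl) {
    rowMeans(present[, label == cl, drop = FALSE])
  }))
  colnames(frac) <- classes

  # Score of every feature for every class D.
  score <- do.call(cbind, lapply(seq_along(classes), function(k) {
    # Largest detection fraction among the classes other than D.
    others <- apply(frac[, -k, drop = FALSE], 1, max)
    log1p(frac[, k] / (others + eps) * frac[, k])
  }))
  colnames(score) <- classes

  # Expand to one column per cell, so cells of the same class share a column.
  score[, as.character(label), drop = FALSE]
}

# Toy input: 2 features, 4 cells, 2 classes.
toy <- matrix(c(1, 1, 0, 0,
                1, 0, 1, 1),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("f1", "f2"), paste0("c", 1:4)))
res <- idf_prob_sketch(toy, label = c("A", "A", "B", "B"))
```

In the toy input, feature `f1` is detected in every cell of class A and in no cell of class B, so its score is large for the A columns and exactly zero for the B columns. This is the behaviour the formula is designed to produce: a feature scores highly for a class only when it is common in that class and rare in every other class.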
data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::idf_prob(data, label = sample(c("A", "B"), 10, replace = TRUE))
#> B B B B B A B
#> 1 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.69314718 0.6931472
#> 2 0.5273549 0.5273549 0.5273549 0.5273549 0.5273549 0.78845735 0.5273549
#> 3 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.69314718 0.6931472
#> 4 0.5273549 0.5273549 0.5273549 0.5273549 0.5273549 0.78845735 0.5273549
#> 5 0.6554068 0.6554068 0.6554068 0.6554068 0.6554068 0.51581316 0.6554068
#> 6 0.5273549 0.5273549 0.5273549 0.5273549 0.5273549 0.78845735 0.5273549
#> 7 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.69314718 0.6931472
#> 8 1.3291359 1.3291359 1.3291359 1.3291359 1.3291359 0.07232066 1.3291359
#> 9 1.0986123 1.0986123 1.0986123 1.0986123 1.0986123 0.22314355 1.0986123
#> 10 0.6931472 0.6931472 0.6931472 0.6931472 0.6931472 0.69314718 0.6931472
#> A A A
#> 1 0.69314718 0.69314718 0.69314718
#> 2 0.78845735 0.78845735 0.78845735
#> 3 0.69314718 0.69314718 0.69314718
#> 4 0.78845735 0.78845735 0.78845735
#> 5 0.51581316 0.51581316 0.51581316
#> 6 0.78845735 0.78845735 0.78845735
#> 7 0.69314718 0.69314718 0.69314718
#> 8 0.07232066 0.07232066 0.07232066
#> 9 0.22314355 0.22314355 0.22314355
#> 10 0.69314718 0.69314718 0.69314718