Standard inverse cell frequency

idf(expr, features = NULL, thres = 0)

Arguments

expr

a matrix with features in rows and cells in columns

features

vector of feature names or row indices to compute scores for; the default NULL uses all features

thres

numeric, a cell is counted for a feature only when expr > thres, default 0

Value

a numeric vector of inverse cell frequency scores, one per feature

Details

$$\mathbf{IDF_i} = \log\left(1+\frac{n}{n_i+1}\right)$$ where \(n\) is the total number of cells and \(n_i\) is the number of cells containing feature \(i\), that is, cells where expr exceeds thres.
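
The formula can be sketched in base R as follows. This is an illustrative re-implementation under stated assumptions, not the package source: the name idf_sketch and the toy matrix are hypothetical, the logarithm is assumed to be natural (which agrees with the example output on this page, where 0.6931472 = log(2) corresponds to \(n_i = 9\) of \(n = 10\) cells), and features = NULL is assumed to mean all rows.

```r
# Hypothetical sketch of the formula above; NOT the smartid implementation.
idf_sketch <- function(expr, features = NULL, thres = 0) {
  # Assumption: NULL means "score every feature (row)".
  if (is.null(features)) features <- seq_len(nrow(expr))
  sub <- expr[features, , drop = FALSE]

  n <- ncol(sub)               # total number of cells
  n_i <- rowSums(sub > thres)  # cells where each feature exceeds thres

  # The +1 in the denominator keeps the score finite when n_i is 0.
  log(1 + n / (n_i + 1))       # natural logarithm (assumed)
}

# Made-up toy data: 3 features x 4 cells.
toy <- matrix(c(0, 1, 2, 0,
                3, 3, 3, 3,
                0, 0, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("f1", "f2", "f3"), NULL))

idf_sketch(toy)                   # all features
idf_sketch(toy, features = "f1")  # a single feature, selected by name
idf_sketch(toy, thres = 1)        # only values above 1 count as present
```

In this sketch a feature present in every cell (f2) receives the lowest score and a feature present in no cell (f3) receives the highest, which is the intended behaviour of an inverse frequency weight.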

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::idf(data)
#>         1         2         3         4         5         6         7         8 
#> 0.7472144 0.6466272 0.8109302 0.6466272 0.7472144 0.6931472 0.6931472 0.8873032 
#>         9        10 
#> 0.6466272 0.7472144