Estimates and regularizes the gene (or feature) dispersion parameter of the decal negative binomial model, using the strategy developed by Hafemeister & Satija (2019).

estimate_dispersion(count, n = 2000, min_mu = 0.05)

Arguments

count

UMI count matrix with cells as columns and genes (or features) as rows.

n

number of genes sampled for the preliminary estimation.

min_mu

minimum overall average expression (mu) required.

Value

a numeric vector of estimated dispersions, one for each row of count.

Details

First, for a subset of genes, it fits a Poisson regression offset by log(depth) and estimates a crude theta by maximum likelihood from the observed counts and the regression results. Next, it regularizes the theta estimates and extends them to all genes by kernel smoothing as a function of average count (mu).
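
The two steps can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name sketch_dispersion, the intercept-only Poisson model, the use of min_mu as a filter on the sampled genes, the log-scale smoothing, and the fixed bandwidth are all assumptions made for illustration.

```r
# Hypothetical helper, for illustration only (not part of decal).
sketch_dispersion <- function(count, n = 2000, min_mu = 0.05, bandwidth = 1) {
  depth <- colSums(count)   # total count per cell
  mu <- rowMeans(count)     # average count per gene

  # Assumption: only genes with mu >= min_mu enter the preliminary fit.
  keep <- which(mu >= min_mu)
  sampled <- keep[sample.int(length(keep), min(n, length(keep)))]

  # Step 1: Poisson regression with a log(depth) offset, then a crude
  # maximum likelihood theta given the fitted means.
  crude_theta <- vapply(sampled, function(i) {
    y <- count[i, ]
    fit <- glm(y ~ 1, family = poisson(), offset = log(depth))
    m <- fitted(fit)
    # Negative log-likelihood of the negative binomial, in log(theta).
    nll <- function(log_theta) {
      -sum(dnbinom(y, size = exp(log_theta), mu = m, log = TRUE))
    }
    exp(optimize(nll, interval = c(-5, 10))$minimum)
  }, numeric(1))

  # Step 2: kernel smoothing of log(theta) against log(mu), evaluated for
  # every gene. Genes outside the sampled range are clamped to its limits.
  x <- log(mu[sampled])
  x_all <- pmin(pmax(log(mu), min(x)), max(x))
  o <- order(x_all)         # ksmooth returns values sorted by x.points
  fit <- ksmooth(x, log(crude_theta), kernel = "normal",
                 bandwidth = bandwidth, x.points = x_all[o])
  theta <- numeric(length(mu))
  theta[o] <- exp(fit$y)
  theta
}

# Simulated UMI counts: genes as rows, cells as columns.
set.seed(1)
n_genes <- 200
n_cells <- 300
gene_mean <- exp(rnorm(n_genes))
depth_factor <- runif(n_cells, 0.5, 2)
count <- matrix(
  rnbinom(n_genes * n_cells, size = 5, mu = outer(gene_mean, depth_factor)),
  nrow = n_genes
)
theta <- sketch_dispersion(count)
```

The sketch returns one smoothed value per row of count, including rows that were not sampled in the first step, which mirrors the shape of the value described above.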