Welcome to statOT’s documentation!

statOT package

Core implementation

statot.inference.compute_conditional_mfpt(P, j, sink_idx)

Compute conditional mean first passage time MFPT(x_i -> x_j). Based on implementation in PBA.

Parameters
  • P – transition matrix

  • j – index j of cell x_j

  • sink_idx – boolean array of length N, set to True for sinks and False otherwise.

Returns

vector t_i containing MFPT(x_i -> x_j)
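As a rough illustration of what a conditional MFPT is, the sketch below uses the textbook construction for absorbing Markov chains: compute the probability b_i of being absorbed at x_j, reweight the transient block by b, and solve for the expected number of steps. This is an assumption about the quantity being computed, not the PBA-based code in statOT; the name conditional_mfpt_sketch is hypothetical, and times are in units of chain steps.

```python
import numpy as np

def conditional_mfpt_sketch(P, j, sink_idx):
    """Expected steps to reach sink j, conditional on being absorbed there.

    Illustrative only; states that cannot reach j are assigned np.inf.
    """
    P = np.asarray(P, dtype=float)
    sink_idx = np.asarray(sink_idx, dtype=bool)
    trans = np.flatnonzero(~sink_idx)
    Q = P[np.ix_(trans, trans)]  # transient -> transient block
    # b[i]: probability that transient state i is eventually absorbed at j
    b = np.linalg.solve(np.eye(len(trans)) - Q, P[trans, j])
    t = np.full(P.shape[0], np.inf)
    ok = b > 0
    # Conditioned (Doob-transformed) transient block: Q_ik * b_k / b_i
    Qc = Q[np.ix_(ok, ok)] * b[ok][None, :] / b[ok][:, None]
    n_ok = int(ok.sum())
    t[trans[ok]] = np.linalg.solve(np.eye(n_ok) - Qc, np.ones(n_ok))
    t[j] = 0.0
    return t

# Toy chain: 0 -> 1 -> sink 2, with 0 also leaking to sink 3.
P = np.array([
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])
sink_idx = np.array([False, False, True, True])
t = conditional_mfpt_sketch(P, j=2, sink_idx=sink_idx)
```

In this toy chain the only route from state 0 to sink 2 passes through state 1, so the conditional passage time from state 0 is two steps.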

statot.inference.compute_fate_probs(P, sink_idx)

Compute fate probabilities by individual sink cell

Parameters
  • P – transition matrix

  • sink_idx – boolean array of length N, set to True for sinks and False otherwise.

Returns

matrix with dimensions (N, S) where S is the number of sink cells present.
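The (N, S) output can be understood through the standard absorption-probability formula for absorbing Markov chains, B = (I - Q)^{-1} R, where Q and R are the transient-to-transient and transient-to-sink blocks of P. The following is a minimal dense NumPy sketch of that formula, not statOT's implementation; the name fate_probs_sketch and the choice to give sink rows a one-hot value are assumptions.

```python
import numpy as np

def fate_probs_sketch(P, sink_idx):
    """Absorption probabilities per sink cell (illustrative only)."""
    P = np.asarray(P, dtype=float)
    sink_idx = np.asarray(sink_idx, dtype=bool)
    n_sinks = int(sink_idx.sum())
    Q = P[np.ix_(~sink_idx, ~sink_idx)]  # transient -> transient
    R = P[np.ix_(~sink_idx, sink_idx)]   # transient -> sink
    B = np.zeros((P.shape[0], n_sinks))
    # Solve (I - Q) B = R for the transient rows
    B[~sink_idx, :] = np.linalg.solve(np.eye(Q.shape[0]) - Q, R)
    # Assumption: each sink is absorbed into itself with probability 1
    B[sink_idx, :] = np.eye(n_sinks)
    return B

P = np.array([
    [0.5, 0.25, 0.25],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
sink_idx = np.array([False, True, True])
B = fate_probs_sketch(P, sink_idx)
```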

statot.inference.compute_fate_probs_lineages(P, sink_idx, labels)

Compute fate probabilities by lineage

Parameters
  • P – transition matrix

  • sink_idx – boolean array of length N, set to True for sinks and False otherwise.

  • labels – string array of length N containing lineage names. Only those entries corresponding to sinks will be used.

Returns

matrix with dimensions (N, L) where L is the number of lineages with sinks.
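One plausible reading of the (N, L) output is that per-sink fate probabilities are summed over all sinks sharing a lineage label. The sketch below shows that aggregation on a hand-written (N, S) matrix; aggregate_by_lineage is a hypothetical helper, and the sorted column order is an assumption rather than documented behaviour.

```python
import numpy as np

def aggregate_by_lineage(B, sink_idx, labels):
    """Sum per-sink fate probabilities over sinks with the same label."""
    sink_idx = np.asarray(sink_idx, dtype=bool)
    sink_labels = np.asarray(labels)[sink_idx]  # labels of sink cells only
    lineages = np.unique(sink_labels)           # sorted unique lineage names
    out = np.stack(
        [B[:, sink_labels == name].sum(axis=1) for name in lineages], axis=1
    )
    return out, lineages

# Four cells; cells 1-3 are sinks, two of which belong to lineage "A".
B = np.array([
    [0.2, 0.3, 0.5],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
sink_idx = np.array([False, True, True, True])
labels = np.array(["", "A", "A", "B"])
B_lin, lineages = aggregate_by_lineage(B, sink_idx, labels)
```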

statot.inference.gaussian_tr(C, h)

Form Gaussian (discrete heat flow) transition matrix of bandwidth h

Parameters
  • C – matrix of pairwise squared distances

  • h – bandwidth
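A Gaussian transition matrix of this kind can be sketched by exponentiating the negated squared distances and normalising each row. The kernel scaling exp(-C / (2 h^2)) used below is an assumption, since the exact convention is not stated here, and gaussian_tr_sketch is a hypothetical name.

```python
import numpy as np

def gaussian_tr_sketch(C, h):
    """Row-stochastic Gaussian kernel matrix (illustrative only)."""
    C = np.asarray(C, dtype=float)
    K = np.exp(-C / (2.0 * h ** 2))  # assumed bandwidth convention
    return K / K.sum(axis=1, keepdims=True)

pts = np.array([[0.0], [1.0], [3.0]])
C = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
P_gauss = gaussian_tr_sketch(C, h=1.0)
```

Each row of P_gauss sums to one, and nearer points receive larger transition probabilities.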

statot.inference.row_normalise(gamma, sink_idx=None)

Enforce sink condition and row normalise coupling to produce transition matrix

Parameters
  • gamma – coupling produced by statot()

  • sink_idx – boolean array of length N, set to True for sinks and False otherwise. If provided, sets the transition distributions for all sinks to be the identity.

Returns

transition matrix obtained by row-normalising the input gamma.
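The two steps described above (enforce the sink condition, then row-normalise) can be sketched in a few lines of NumPy. This is an illustrative reimplementation under the assumption that "identity" means each sink row becomes a one-hot vector on the diagonal; row_normalise_sketch is a hypothetical name.

```python
import numpy as np

def row_normalise_sketch(gamma, sink_idx=None):
    """Turn a coupling into a row-stochastic matrix (illustrative only)."""
    gamma = np.array(gamma, dtype=float)  # work on a copy
    if sink_idx is not None:
        idx = np.flatnonzero(np.asarray(sink_idx, dtype=bool))
        gamma[idx, :] = 0.0    # sinks send no mass elsewhere
        gamma[idx, idx] = 1.0  # sinks transition to themselves
    return gamma / gamma.sum(axis=1, keepdims=True)

gamma = np.array([
    [1.0, 1.0, 2.0],
    [2.0, 1.0, 1.0],
    [3.0, 0.0, 1.0],
])
P_norm = row_normalise_sketch(gamma, sink_idx=[False, False, True])
```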

statot.inference.statot(x, C=None, eps=None, method='ent', g=None, dt=None, maxiter=5000, tol=1e-09, verbose=False)

Fit statOT model

Parameters
  • x – input data – N points of M dimensions in the form of a matrix with dimensions (N, M)

  • C – cost matrix for optimal transport problem

  • eps – regularisation parameter

  • method – choice of regularisation – either “ent” (entropy) or “quad” (L2). “unbal” for unbalanced transport is not yet implemented. If “marginals”, return just mu and nu.

  • g – numeric array of length N, containing the relative growth rates for cells.

  • flow_rate – used only in the growth-free case (flow only)

  • dt – choice of the time step over which to fit the model

  • maxiter – max number of iterations for OT solver

  • tol – relative tolerance for OT solver convergence

  • verbose – detailed output on convergence of OT solver.

Returns

gamma (optimal transport coupling), mu (source measure), nu (target measure)

statot.inference.velocity_from_transition_matrix(P, x, deltat)

Estimate velocity field from transition matrix (i.e. compute expected displacements)

Parameters
  • P – transition matrix

  • x – input data – N points of M dimensions in the form of a matrix with dimensions (N, M)

  • deltat – timestep for which P was calculated.
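The "expected displacement" can be written as v_i = (sum_j P_ij x_j - x_i) / deltat. The sketch below implements that formula directly; it assumes P is row-stochastic and that the displacement is divided by deltat, and velocity_sketch is a hypothetical name rather than the library function.

```python
import numpy as np

def velocity_sketch(P, x, deltat):
    """Expected displacement per unit time (illustrative only)."""
    P = np.asarray(P, dtype=float)
    x = np.asarray(x, dtype=float)
    # P @ x is the expected position after one step from each cell
    return (P @ x - x) / deltat

P = np.array([[0.0, 1.0], [0.0, 1.0]])
x = np.array([[0.0, 0.0], [1.0, 2.0]])
v = velocity_sketch(P, x, deltat=0.5)
```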

CellRank wrapper

class statot.cr.OTKernel(adata, g, compute_cond_num=False)

Bases: cellrank.tl.kernels._base_kernel.Kernel

Kernel class allowing the statOT method to be used from CellRank. Call set_terminal_states first to specify which cells to use as sinks.

Parameters
  • adata – AnnData object containing N cells. We can use any embedding for statOT, selected when calling OTKernel.compute_transition_matrix().

  • g – string specifying the key in adata.obs to a numeric array of length N, containing the relative growth rates for cells, or the array itself.

  • compute_cond_num – set to True to compute the condition number of the transition matrix.

compute_transition_matrix(eps, dt, expr_key='X_pca', cost_norm_method=None, method='ent', tol=1e-09, thresh=0, maxiter=5000, C=None, verbose=False)

Compute transition matrix using StationaryOT.

Parameters
  • eps – regularisation parameter

  • dt – choice of the time step over which to fit the model

  • expr_key – key to embedding to use in adata.obsm.

  • cost_norm_method – cost normalisation method to use. Use “mean” to ensure mean(C) = 1, or refer to ot.utils.cost_normalization in Python OT.

  • thresh – threshold for output transition probabilities (no thresholding by default)

  • maxiter – max number of iterations for OT solver

  • C – cost matrix for optimal transport problem

  • verbose – detailed output on convergence of OT solver.

copy() → statot.cr.OTKernel

Return a copy of itself. Note that the underlying adata object is not copied.

statot.cr.set_terminal_states(adata, sink_idx, labels, terminal_colors)

Set user-specified terminal states for CellRank API functions and OTKernel.

Parameters
  • adata – AnnData object containing N cells.

  • sink_idx – string specifying the key in adata.uns to a boolean array of length N, set to True for sinks and False otherwise, or the array itself.

  • labels – string array of length N containing lineage names. Only those entries corresponding to sinks will be used.

  • terminal_colors – colors corresponding to terminal state labels.

pyKeOps-numpy implementation

statot.keops.compute_fate_probs(Q, R)

Compute fate probabilities from Q (LazyTensor) and R (np.ndarray)

Parameters
  • Q – transient part of transition matrix from get_QR_submat, as LazyTensor

  • R – absorbing part of transition matrix. Its columns should be aggregated across fates, since the solver cannot handle multiple right-hand sides at once.

statot.keops.form_cost(mu_spt, nu_spt, norm_factor=None, keops=True)

Form cost matrix (matrix of squared Euclidean distances)

Parameters
  • mu_spt – support of source measure

  • nu_spt – support of target measure

  • norm_factor – normalisation factor as a float, None or “mean”

  • keops – whether to return a LazyTensor or np.array
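For the dense (keops=False) case, a cost matrix of squared Euclidean distances can be sketched with NumPy broadcasting. This is not the library code: form_cost_sketch is a hypothetical name, and both the division by norm_factor and the returned tuple are assumptions about how the normalisation is applied.

```python
import numpy as np

def form_cost_sketch(mu_spt, nu_spt, norm_factor=None):
    """Dense squared-Euclidean cost matrix (illustrative only)."""
    mu_spt = np.asarray(mu_spt, dtype=float)
    nu_spt = np.asarray(nu_spt, dtype=float)
    diff = mu_spt[:, None, :] - nu_spt[None, :, :]
    C = (diff ** 2).sum(axis=-1)
    if isinstance(norm_factor, str) and norm_factor == "mean":
        norm_factor = C.mean()  # rescale so that mean(C) == 1
    if norm_factor is not None:
        C = C / norm_factor     # assumed meaning of the factor
    return C, norm_factor

pts = np.array([[0.0, 0.0], [3.0, 4.0]])
C_raw, _ = form_cost_sketch(pts, pts)
C_norm, factor = form_cost_sketch(pts, pts, norm_factor="mean")
```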

statot.keops.get_QR_submat_ent(u, K, v, X, sink_idx, eps, cost_norm_factor)

Compute Q (as LazyTensor) and R (as np.ndarray) matrices for entropy-regularised OT dual potentials (u, v)

Parameters
  • u – dual potential for source distribution

  • K – Gibbs kernel as LazyTensor

  • v – dual potential for target distribution

  • X – coordinates as np.ndarray

  • sink_idx – boolean array of length N, set to True for sinks and False otherwise.

  • eps – value of eps used for solving with sinkhorn

  • cost_norm_factor – normalisation factor used in form_cost

statot.keops.get_QR_submat_quad(u, C, v, X, sink_idx, eps, cost_norm_factor)

Compute Q (as LazyTensor) and R (as np.ndarray) matrices for quadratically regularised OT dual potentials (u, v)

Parameters
  • u – dual potential for source distribution

  • C – cost matrix as LazyTensor

  • v – dual potential for target distribution

  • X – coordinates as np.ndarray

  • sink_idx – boolean array of length N, set to True for sinks and False otherwise.

  • eps – value of eps used for solving with quad_ot_semismooth_newton

  • cost_norm_factor – normalisation factor used in form_cost

statot.keops.quad_ot_semismooth_newton(mu, nu, C, eps, max_iter=50, theta=0.1, kappa=0.5, tol=0.001, eta=1e-05, cg_max_iter=500, verbose=False)

Semismooth Newton algorithm for solving quadratically regularised optimal transport, compatible with the KeOps LazyTensor framework. Uses the method from Algorithm 2 of Lorenz, D.A., Manns, P. and Meyer, C., 2019. Quadratically regularized optimal transport. Applied Mathematics & Optimization, pp.1-31.

Parameters
  • mu – source distribution

  • nu – target distribution

  • C – cost matrix as LazyTensor

  • eps – regularisation parameter

  • max_iter – maximum number of Newton steps

  • theta – Armijo control parameter (choose in \((0, 1)\))

  • kappa – Armijo step scaling parameter (choose in \((0, 1)\))

  • tol – tolerance (inf-norm on marginals)

  • eta – conjugate gradient regularisation parameter

  • cg_max_iter – maximum number of conjugate gradient iterations

  • verbose – flag for verbose output

statot.keops.set_dtype(d)

Set dtype to use in statot.keops

statot.keops.sinkhorn(mu, nu, K, max_iter=5000, err_check=10, tol=1e-09, verbose=False)

Sinkhorn algorithm for solving entropy-regularised optimal transport, compatible with the KeOps LazyTensor framework.

Parameters
  • mu – source distribution

  • nu – target distribution

  • K – Gibbs kernel as LazyTensor

  • max_iter – maximum number of iterations

  • err_check – interval for checking marginal error

  • tol – tolerance (inf-norm on marginals)

  • verbose – flag for verbose output
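To show how the parameters above interact, here is a minimal dense NumPy version of the Sinkhorn iteration: alternate scaling updates, with the inf-norm marginal error checked every err_check iterations. It is a sketch, not statot.keops.sinkhorn: it uses plain arrays instead of LazyTensors, returns scaling vectors (the library's return convention is not documented here), and sinkhorn_sketch and the value of eps are hypothetical.

```python
import numpy as np

def sinkhorn_sketch(mu, nu, K, max_iter=5000, err_check=10, tol=1e-9):
    """Dense Sinkhorn scaling iterations (illustrative only)."""
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for it in range(max_iter):
        u = mu / (K @ v)    # match row marginals
        v = nu / (K.T @ u)  # match column marginals
        if it % err_check == 0:
            gamma = u[:, None] * K * v[None, :]
            err = max(np.abs(gamma.sum(axis=1) - mu).max(),
                      np.abs(gamma.sum(axis=0) - nu).max())
            if err < tol:
                break
    return u, v

pts = np.linspace(0.0, 1.0, 5)
C = (pts[:, None] - pts[None, :]) ** 2
eps = 1.0  # hypothetical regularisation strength
K = np.exp(-C / eps)  # Gibbs kernel
mu = np.full(5, 0.2)
nu = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
u, v = sinkhorn_sketch(mu, nu, K)
gamma = u[:, None] * K * v[None, :]  # recovered coupling
```

The coupling is recovered as diag(u) K diag(v), and its row and column sums match mu and nu up to the tolerance.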
