E_step
Module Contents
-
E_step.
get_leaf_Normalizing_Factors
(leaves_idx: npt.NDArray[np.uintp], MSD: npt.NDArray[np.float64], EL: npt.NDArray[np.float64]) → npt.NDArray[np.float64]
Normalizing factor (NF) matrix and base case at the leaves.
Each element of this N by 1 matrix is the normalizing factor used in the beta calculation for one node. This normalizing factor is the marginal observation distribution for that node.
This function gets the normalizing factor for the upward recursion only for the leaves. We first calculate the joint probability using the definition of conditional probability:
\(P(x_n = x | z_n = k) * P(z_n = k) = P(x_n = x , z_n = k)\), where n are the leaf nodes.
We can then sum this joint probability over k, the possible states of z_n, and by the law of total probability obtain the marginal observation distribution \(P(x_n = x) = \sum_k P(x_n = x , z_n = k)\).
- Parameters
EL – The emissions likelihood
MSD – The marginal state distribution P(z_n = k)
- Returns
normalizing factor. The marginal observation distribution P(x_n = x)
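The leaf computation described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the module's code: the function name `leaf_normalizing_factors` is hypothetical, the result is returned as a length-N vector rather than an N by 1 matrix, and leaving non-leaf entries at zero is an assumption.

```python
import numpy as np


def leaf_normalizing_factors(leaves_idx, MSD, EL):
    """Hypothetical helper: NF at the leaves only (illustration, not the library code)."""
    # Joint probability P(x_n = x, z_n = k) for every leaf n and state k.
    joint = EL[leaves_idx, :] * MSD[leaves_idx, :]
    # Assumption: entries for non-leaf cells stay at zero until a later pass fills them.
    NF = np.zeros(MSD.shape[0])
    # Law of total probability: sum the joint over the states k.
    NF[leaves_idx] = joint.sum(axis=1)
    return NF


# Toy lineage with 3 cells and 2 states; cells 1 and 2 are the leaves.
MSD = np.array([[0.5, 0.5], [0.6, 0.4], [0.3, 0.7]])
EL = np.array([[0.2, 0.9], [0.1, 0.5], [0.4, 0.2]])
NF = leaf_normalizing_factors(np.array([1, 2]), MSD, EL)
```

For cell 1 this gives 0.1 * 0.6 + 0.5 * 0.4 = 0.26, the marginal likelihood of that cell's observation.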
-
E_step.
get_MSD
(cell_to_parent: np.ndarray, pi: npt.NDArray[np.float64], T: npt.NDArray[np.float64]) → npt.NDArray[np.float64]
Marginal State Distribution (MSD) matrix, computed by recursion from parent to daughter starting at the root. This is the probability that a hidden state variable \(z_n\) is in state k; that is, each value in the N by K MSD array for each lineage is the probability
\(P(z_n = k)\),
for every \(z_n\) in the hidden state tree and every state k among the K discrete states. The array has one row per cell and one column per state, and each lineage has its own MSD array.
Every element of the MSD matrix is the sum over all transitions from any state j in the parent to state k in the daughter:
\(P(z_u = k) = \sum_j \left( Transition(j \to k) * P(z_{parent(u)} = j) \right)\)
- Parameters
pi – Initial probabilities vector
T – State transitions matrix
- Returns
The marginal state distribution
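A minimal sketch of this recursion follows. It assumes that cell 0 is the root, that every parent index is smaller than its daughters' indices, that the root's entry in `cell_to_parent` is a placeholder (here -1), and that `T[j, k]` is the probability of a transition from state j to state k. None of these conventions are stated in the documentation above, and the function name is hypothetical.

```python
import numpy as np


def marginal_state_distribution(cell_to_parent, pi, T):
    """Hypothetical re-implementation of the MSD recursion (illustration only)."""
    N, K = cell_to_parent.shape[0], pi.shape[0]
    MSD = np.zeros((N, K))
    # Base case: the root follows the initial probabilities.
    MSD[0, :] = pi
    for u in range(1, N):
        # P(z_u = k) = sum_j P(z_parent(u) = j) * T[j, k]
        MSD[u, :] = MSD[cell_to_parent[u], :] @ T
    return MSD


pi = np.array([0.8, 0.2])
T = np.array([[0.9, 0.1], [0.3, 0.7]])
cell_to_parent = np.array([-1, 0, 0])  # -1 marks the root (assumed encoding)
MSD = marginal_state_distribution(cell_to_parent, pi, T)
```

Because each row of `T` sums to one, every row of the resulting MSD array is itself a probability distribution over the K states.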
-
E_step.
np_apply_along_axis
(func1d, axis, arr)
-
E_step.
get_beta
(leaves_idx: npt.NDArray[np.uintp], cell_to_daughters: npt.NDArray[np.intp], T: npt.NDArray[np.float64], MSD: npt.NDArray[np.float64], EL: npt.NDArray[np.float64], NF: npt.NDArray[np.float64]) → npt.NDArray[np.float64]
Beta matrix and base case at the leaves.
Each element in this N by K matrix is the beta value for each cell and at each state. In particular, this value is derived from the Marginal State Distributions (MSD), the Emission Likelihoods (EL), and the Normalizing Factors (NF). Each beta value for the leaves is exactly the probability
\(beta[n,k] = P(z_n = k | x_n = x)\).
Using Bayes Theorem, we see that the above equals
numerator = \(P(x_n = x | z_n = k) * P(z_n = k)\)
denominator = \(P(x_n = x)\)
\(beta[n,k] = numerator / denominator\)
For the leaf cells, the first factor in the numerator is the Emission Likelihood, the second factor is the Marginal State Distribution, and the denominator is the Normalizing Factor.
Traverses upward through each tree and calculates the beta value for each non-leaf cell. The normalizing factors (NFs) are also calculated as an intermediate for determining each beta term. Helper functions are called to determine one of the terms in the NF equation. This term is also used in the calculation of the betas.
- Parameters
leaves_idx – Indices of the leaf cells
cell_to_daughters – Daughter cell indices for each cell
T – State transitions matrix
MSD – The marginal state distribution P(z_n = k)
EL – The emissions likelihood
NF – normalizing factor. The marginal observation distribution P(x_n = x)
- Returns
beta values. The conditional probability of states, given observations of the sub-tree rooted in cell_n
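The leaf base case is a direct application of Bayes' theorem and can be sketched as follows. The function name `leaf_betas` is hypothetical, only the leaf rows are filled, and the upward pass over non-leaf cells is deliberately omitted because its exact form is not given in the text above.

```python
import numpy as np


def leaf_betas(leaves_idx, MSD, EL, NF):
    """Hypothetical helper: beta[n, k] = P(z_n = k | x_n = x) at the leaves only."""
    beta = np.zeros_like(MSD)
    # Numerator: P(x_n = x | z_n = k) * P(z_n = k)
    numerator = EL[leaves_idx, :] * MSD[leaves_idx, :]
    # Denominator: P(x_n = x), broadcast across the K states of each leaf.
    beta[leaves_idx, :] = numerator / NF[leaves_idx][:, np.newaxis]
    return beta


# Toy lineage with 3 cells and 2 states; cells 1 and 2 are the leaves.
MSD = np.array([[0.5, 0.5], [0.6, 0.4], [0.3, 0.7]])
EL = np.array([[0.2, 0.9], [0.1, 0.5], [0.4, 0.2]])
leaves = np.array([1, 2])
NF = np.zeros(3)
NF[leaves] = (EL[leaves] * MSD[leaves]).sum(axis=1)
beta = leaf_betas(leaves, MSD, EL, NF)
```

Since the normalizing factor is the sum of the numerator over k, each leaf row of `beta` sums to one.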
-
E_step.
get_gamma
(cell_to_daughters: npt.NDArray[np.uintp], T: npt.NDArray[np.float64], MSD: npt.NDArray[np.float64], beta: npt.NDArray[np.float64]) → npt.NDArray[np.float64]
Get the gammas using downward recursion from the root nodes. Each gamma is the conditional probability of a state given the observations of the whole tree, \(P(z_n = k | \bar{X} = \bar{x})\), where \(\bar{x}\) denotes the observations for the whole tree:
\(\gamma_1(k) = P(z_1 = k | \bar{X} = \bar{x})\)
\(\gamma_n(k) = P(z_n = k | \bar{X} = \bar{x})\)
- Parameters
MSD – The marginal state distribution P(z_n = k)
beta – beta values. The conditional probability of states, given observations of the sub-tree rooted in cell_n
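The documentation does not spell out the downward step, so the sketch below uses the standard smoothing recursion for hidden Markov trees as an assumption; the library's own implementation may differ. For brevity it takes a hypothetical `cell_to_parent` array instead of `cell_to_daughters`, and it assumes that cell 0 is the root, that parents precede daughters in index order, and that `T[j, k]` is the transition probability from state j to state k. At the root the sub-tree is the whole tree, so gamma equals beta there.

```python
import numpy as np


def downward_gammas(cell_to_parent, T, MSD, beta):
    """Hypothetical sketch of a downward smoothing pass (illustration only)."""
    N, K = beta.shape
    gamma = np.zeros((N, K))
    # Base case: the sub-tree rooted at the root is the whole tree.
    gamma[0, :] = beta[0, :]
    for d in range(1, N):
        p = cell_to_parent[d]
        # ratio[k] = beta_d(k) / P(z_d = k)
        ratio = beta[d, :] / MSD[d, :]
        # link[j] = sum_l T[j, l] * ratio[l], the daughter's contribution given parent state j
        link = T @ ratio
        # gamma_d(k) = ratio[k] * sum_j T[j, k] * gamma_p(j) / link[j]
        gamma[d, :] = ratio * ((gamma[p, :] / link) @ T)
    return gamma


T = np.array([[0.9, 0.1], [0.3, 0.7]])
MSD = np.array([[0.8, 0.2], [0.78, 0.22], [0.78, 0.22]])
beta = np.array([[0.6, 0.4], [0.3, 0.7], [0.9, 0.1]])  # made-up values for illustration
cell_to_parent = np.array([-1, 0, 0])  # -1 marks the root (assumed encoding)
gamma = downward_gammas(cell_to_parent, T, MSD, beta)
```

Under this recursion each row of `gamma` sums to one whenever the parent's row does, which is a quick sanity check on any implementation.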