High-level description
This code defines abetatree class that simulates a beta coalescent tree, a model used in population genetics to represent the ancestral relationships of a sample of individuals. The class allows for the generation of trees under different coalescent models, including the Kingman coalescent (α=2) and the Bolthausen-Sznitman coalescent (α=1).
Code Structure
Thebetatree class contains methods for initializing the tree structure, simulating coalescence events, calculating waiting times between mergers, determining the number of lineages involved in each merger, and cleaning up the tree structure for visualization. The __main__ section provides examples of how to use the class to generate and visualize trees under different coalescent models.
Symbols
betatree
Description
This class simulates a beta coalescent tree. It provides methods for initializing the tree, simulating coalescence events, and visualizing the resulting tree.Inputs
| Name | Type | Description |
|---|---|---|
| sample_size | int | The number of leaves in the tree, representing the sample size. |
| alpha | float | The parameter of the beta coalescent model. Defaults to 2 (Kingman coalescent). |
Outputs
The class does not directly return any values. It modifies its internal state to represent the simulated tree.Internal Logic
Thecoalesce method drives the simulation process:
- Initialization: Creates a list of
Cladeobjects, each representing a single lineage. - Coalescence loop: Iteratively performs coalescence events until a single root lineage remains.
waiting_time: Calculates the time until the next coalescence event based on the current number of lineages and thealphaparameter.whichp: Determines the number of lineages involved in the next merger event.coalescence_event: Selects lineages to merge, updates branch lengths, and creates a new parentClade.
- Cleanup: Adjusts branch lengths to reflect coalescence times and calculates the number of descendants for each node.
Side Effects
Thecoalesce method modifies the internal state of the betatree object, constructing a tree structure represented by interconnected Clade objects.
init_tree
Description
Initializes the tree structure by creating a list ofClade objects, each representing a single leaf node.
Inputs
NoneOutputs
NoneInternal Logic
Creates a list ofPhylo.BaseTree.Clade objects, each with a unique name and an initial branch length of 0.
coalescence_event
Description
Simulates a single coalescence event, merging a random subset of lineages.Inputs
NoneOutputs
NoneInternal Logic
whichp: Determines the number of lineages to merge.waiting_time: Calculates the time elapsed until this event.- Updates the branch lengths of all existing lineages by the waiting time.
- Randomly selects lineages to merge.
merge_clades: Creates a new parentCladerepresenting the merged lineages.
merge_clades
Description
Merges a given set of lineages into a new parentClade.
Inputs
| Name | Type | Description |
|---|---|---|
| merging_blocks | list | A list of indices representing the lineages to merge. |
Outputs
NoneInternal Logic
- Creates a new
Phylo.BaseTree.Cladeobject. - Assigns the selected lineages as children of the new
Clade. - Sets the branch length of the new
Cladeto the current branch length of its children. - Removes the merged lineages from the list of active lineages.
- Adds the new parent
Cladeto the list of active lineages.
clean_up_subtree
Description
Recursively traverses the tree structure to calculate branch lengths and the number of descendants for each node.Inputs
| Name | Type | Description |
|---|---|---|
| clade | Phylo.BaseTree.Clade | The current Clade being processed. |
Outputs
NoneInternal Logic
- If the
Cladeis a leaf node, sets its weight to 1 and returns. - If the
Cladeis an internal node:- Subtracts the branch length of its first child from its own branch length to avoid double-counting.
- Recursively calls
clean_up_subtreefor each child node. - Calculates its own weight as the sum of its children’s weights.
waiting_time
Description
Calculates the waiting time until the next coalescence event based on the current number of lineages and thealpha parameter.
Inputs
NoneOutputs
| Name | Type | Description |
|---|---|---|
| dt | float | The waiting time until the next coalescence event. |
Internal Logic
The waiting time is drawn from an exponential distribution with a rate parameter determined by the coalescent model:- Kingman coalescent (α=2): Rate = 2 / (b * (b - 1)), where b is the number of lineages.
- Bolthausen-Sznitman coalescent (α=1): Rate = 1 / (b - 1).
- General beta coalescent (1 < α < 2): Rate is calculated using gamma functions and the
normalizerprecomputed during initialization.
whichp
Description
Determines the number of lineages involved in the next merger event based on the coalescent model.Inputs
| Name | Type | Description |
|---|---|---|
| b | int | The current number of lineages. |
Outputs
| Name | Type | Description |
|---|---|---|
| int | The number of lineages to merge in the next event. |
Internal Logic
- Kingman coalescent (α=2): Always merges 2 lineages.
- Bolthausen-Sznitman coalescent (α=1): Samples the merger size from a distribution proportional to 1 / (k * (k - 1)).
- General beta coalescent (1 < α < 2): Samples the merger size from a distribution calculated using gamma functions and precomputed values.
Dependencies
| Dependency | Purpose |
|---|---|
| random | Used for random number generation. |
| numpy | Used for numerical operations and array manipulation. |
| scipy.special | Used for the gamma function. |
| Bio.Phylo | Used for creating and manipulating phylogenetic trees. |
