The following statements were made describing the properties of a UPGMA tree (Unweighted Pair Group Method with Arithmetic Mean):
A. It describes species relationships and is therefore the best method to describe a new species.
B. It is a method of hierarchical clustering.
C. The raw data is a similarity matrix and the initial tree is rooted.
D. It permits lineages with largely different branch lengths and corrections for multiple substitutions.
Which one of the following options represents the correct properties?
(1) A and B (2) B and C
(3) A and D (4) C and D

Understanding UPGMA Trees: Key Properties and Applications in Phylogenetics
The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) is a foundational algorithm in phylogenetics used to construct evolutionary trees from genetic or phenotypic data. UPGMA is favored for its simplicity and computational efficiency, but it comes with specific assumptions and properties that influence its application and interpretation. Let’s clarify which statements about UPGMA are correct and why.
Key Properties of UPGMA Trees
1. Hierarchical Clustering Method
UPGMA is fundamentally a hierarchical (agglomerative) clustering method. It works by iteratively merging the two closest clusters (or taxa) in a similarity or distance matrix, then recalculating the distance from the merged cluster to every other cluster as the arithmetic mean of the pairwise distances between their members. This process continues until all taxa are grouped into a single, rooted tree.
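The merge-and-average loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `upgma`, the nested-tuple tree, and the choice to store distances in a dict keyed by `frozenset` pairs are all assumptions made for the example, and the four-taxon distances are invented toy values.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa with UPGMA.

    names: list of taxon labels (hypothetical example input)
    dist:  dict mapping frozenset({a, b}) -> distance between a and b
    Returns (tree, root_height), where tree is a nested tuple.
    """
    # Each active cluster maps to (number of taxa, node height above the tips).
    clusters = {name: (1, 0.0) for name in names}
    d = dict(dist)  # working copy so the caller's matrix is not modified

    while len(clusters) > 1:
        # 1. Find the closest pair of active clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        d_ab = d[frozenset((a, b))]
        n_a, n_b = clusters[a][0], clusters[b][0]
        merged = (a, b)

        # 2. Distance from the merged cluster to every other cluster is the
        #    size-weighted mean, i.e. the arithmetic mean over all member pairs.
        for c in clusters:
            if c not in (a, b):
                d[frozenset((merged, c))] = (
                    n_a * d[frozenset((a, c))] + n_b * d[frozenset((b, c))]
                ) / (n_a + n_b)

        # 3. Replace the pair by the merged cluster; its node sits at half
        #    the distance between the two clusters it joins.
        del clusters[a], clusters[b]
        clusters[merged] = (n_a + n_b, d_ab / 2)

    tree, (_, height) = next(iter(clusters.items()))
    return tree, height


# Toy clock-like distances (invented for illustration).
names = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 2, frozenset(("A", "C")): 6,
    frozenset(("B", "C")): 6, frozenset(("A", "D")): 10,
    frozenset(("B", "D")): 10, frozenset(("C", "D")): 10,
}
tree, height = upgma(names, dist)
print(tree, height)  # ('D', ('C', ('A', 'B'))) 5.0
```

In this toy run, A and B join first at height 1, C joins them at height 3, and D joins last at the root (height 5), so every tip is the same distance from the root.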
2. Similarity Matrix and Rooted Trees
UPGMA starts with a similarity or distance matrix as raw input data. The algorithm constructs a rooted tree (dendrogram), with the root representing the most recent common ancestor of all taxa. The tree is built stepwise, with each new cluster representing a higher-level ancestor.
3. Ultrametric Trees and Molecular Clock Assumption
A defining feature of UPGMA is that it produces ultrametric trees: all terminal nodes (species) are equidistant from the root. This reflects the assumption of a constant rate of evolution (molecular clock) across all lineages, an assumption that many real datasets violate.
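One way to see whether a distance matrix is compatible with this assumption is the three-point condition: in an ultrametric matrix, for every triple of taxa the two largest of the three pairwise distances are equal. The sketch below checks that condition; the function name `is_ultrametric`, the `frozenset`-keyed dict, and the tolerance value are assumptions chosen for the example, and the distances are invented.

```python
from itertools import combinations


def is_ultrametric(names, dist, tol=1e-9):
    """Return True if every triple satisfies the three-point condition.

    dist maps frozenset({a, b}) -> distance (hypothetical representation).
    """
    for a, b, c in combinations(names, 3):
        # Sort the three pairwise distances; the two largest must tie.
        _, mid, largest = sorted(
            (dist[frozenset((a, b))],
             dist[frozenset((a, c))],
             dist[frozenset((b, c))])
        )
        if abs(largest - mid) > tol:
            return False
    return True


# Clock-like toy data: A and B are close, C is equally far from both.
clock_like = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 6,
    frozenset(("B", "C")): 6,
}
# Same taxa, but lineage B has evolved faster, so B is farther from C.
rate_varied = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 6,
    frozenset(("B", "C")): 9,
}
print(is_ultrametric(["A", "B", "C"], clock_like))   # True
print(is_ultrametric(["A", "B", "C"], rate_varied))  # False
```

When the check fails, UPGMA still returns an ultrametric tree, but the tree can no longer reproduce the input distances exactly.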
4. Limitations
UPGMA does not permit lineages with largely different branch lengths, and the clustering procedure itself applies no correction for multiple substitutions. Its strict molecular clock assumption means it can generate inaccurate trees if evolutionary rates vary among lineages.
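To make "correction for multiple substitutions" concrete, the sketch below applies the Jukes-Cantor formula, d = -(3/4) ln(1 - 4p/3), which converts an observed proportion of differing sites p into an estimated number of substitutions per site. The Jukes-Cantor model is not named in the text above; it is used here only as one familiar example of such a correction, and the function name `jc69_distance` is hypothetical.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed difference proportion p.

    p is the fraction of aligned sites that differ between two sequences.
    The formula is undefined for p >= 0.75 (sequences look saturated).
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - (4 * p) / 3)


# The corrected distance exceeds the raw proportion, and the gap grows as
# sequences diverge, because repeated hits at the same site are hidden.
for p in (0.05, 0.30, 0.60):
    print(p, round(jc69_distance(p), 3))
```

UPGMA works only with whatever distances it is handed and performs no step of this kind itself.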
Evaluating the Statements
Let’s assess the provided statements:
A. UPGMA describes species relationships and is therefore the best method to describe a new species.
Incorrect. UPGMA reflects phenotypic or genetic similarity, not necessarily true evolutionary relationships, and is not always the best method for species description, especially if the molecular clock assumption is violated.
B. It is a method of hierarchical clustering.
Correct. UPGMA is a classic example of hierarchical clustering in phylogenetics.
C. The raw data is a similarity matrix and the initial tree is rooted.
Correct. UPGMA starts with a similarity (or distance) matrix and produces a rooted tree.
D. It permits lineages with largely different branch lengths and corrections for multiple substitutions.
Incorrect. UPGMA assumes equal rates (molecular clock), so it does not accommodate lineages with largely different branch lengths, and it does not correct for multiple substitutions.
Conclusion
The correct properties of a UPGMA tree are statements B and C.
Correct answer: (2) B and C


