Consensus Sequence Determination - CSIR NET LIFE SCIENCE COACHING | NTA NET LIFE SCIENCE

admin
January 16, 2026

23. The transcription factor X binds a 10 base pair DNA stretch. In the DNA of an organism, X was found to bind at 20 distinct sites. An analysis of these 20 binding sites showed the following distribution:

Base	1	2	3	4	5	6	7	8	9	10
A	11	0	0	0	16	2	4	0	4	3
T	3	0	19	0	1	3	4	20	2	4
G	4	20	0	0	2	4	6	0	12	2
C	2	0	1	20	1	11	6	0	2	11

What is the consensus sequence for the binding site of X?

(a) NGTCNNNTNN
(b) AGTCACNTGC
(c) CACCTANCTG

(d) ANNNACGNGC

Introduction

Consensus sequences are widely used in bioinformatics and molecular biology to summarize conserved DNA motifs recognized by proteins such as transcription factors. This solved example explains how to extract a consensus sequence from a binding site table, interpret base frequencies, and analyze multiple-choice answers to select the correct DNA motif.

Question Recap

A transcription factor X binds a 10-bp DNA stretch.
Twenty binding sites were observed, and base frequencies at each position are given.

We identify the most frequent nucleotide at every position to derive the consensus.

Step-by-Step Consensus Derivation

Position	A	T	G	C	Consensus
1	11	3	4	2	A
2	0	0	20	0	G
3	0	19	0	1	T
4	0	0	0	20	C
5	16	1	2	1	A
6	2	3	4	11	C
7	4	4	6	6	Ambiguous: G and C tie (N, or S in IUPAC notation)
8	0	20	0	0	T
9	4	2	12	2	G
10	3	4	2	11	C

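At position 7, no base dominates: G and C tie at 6 each, with A and T at 4 each.
The position is therefore ambiguous.
Strict IUPAC notation would write S (G or C) for this tie, but most exams assign N wherever no single nucleotide clearly dominates.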
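
The majority rule used above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original solution: the name `consensus` and the rule "write N whenever the top count is tied" are assumptions chosen to mirror the exam convention, and the counts are entered so that every column sums to the 20 binding sites.

```python
# Counts at positions 1-10 for each base; every column sums to 20 sites.
COUNTS = {
    "A": [11, 0, 0, 0, 16, 2, 4, 0, 4, 3],
    "T": [3, 0, 19, 0, 1, 3, 4, 20, 2, 4],
    "G": [4, 20, 0, 0, 2, 4, 6, 0, 12, 2],
    "C": [2, 0, 1, 20, 1, 11, 6, 0, 2, 11],
}


def consensus(counts):
    """Return the most frequent base per position, or 'N' if the top count is tied."""
    length = len(next(iter(counts.values())))
    result = []
    for i in range(length):
        column = {base: row[i] for base, row in counts.items()}
        top = max(column.values())
        winners = [base for base, n in column.items() if n == top]
        # Assumed convention: a tie for first place is written as N.
        result.append(winners[0] if len(winners) == 1 else "N")
    return "".join(result)


print(consensus(COUNTS))  # AGTCACNTGC
```

Only position 7 produces a tie (G = 6, C = 6), so it is the only N in the output.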

Final Consensus

AGTCACNTGC

Correct Answer

✔ (b) AGTCACNTGC

Option-by-Option Analysis

(a) NGTCNNNTNN

❌ Incorrect

Position 1 should be A (11 of 20 sites), not N
Too many Ns: positions 5, 6, 9 and 10 also have a clear majority base (A = 16, C = 11, G = 12, C = 11)

(b) AGTCACNTGC

✔ Correct

Matches the majority base at 9 positions
Uses N at position 7 where no base strongly dominates
Exactly reflects the binding frequency table

(c) CACCTANCTG

❌ Incorrect

Position 1 starts with C instead of A
Contradicts even the invariant positions: A at position 2 (G = 20 of 20) and C at position 8 (T = 20 of 20)
Only positions 4 and 7 agree with the frequency table

(d) ANNNACGNGC

❌ Incorrect

Puts N at positions 2, 3, 4 and 8, the most conserved positions in the table (G = 20, T = 19, C = 20, T = 20)
Assigns G at position 7, where G and C are tied
The data support a definite base everywhere except position 7
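
The option-by-option check can also be reproduced mechanically. The sketch below is a hypothetical helper, not an established tool: `disagreements` and the tie-to-N convention are assumptions made for illustration. It lists the positions at which each option departs from the majority base.

```python
# Counts at positions 1-10 for each base; every column sums to 20 sites.
COUNTS = {
    "A": [11, 0, 0, 0, 16, 2, 4, 0, 4, 3],
    "T": [3, 0, 19, 0, 1, 3, 4, 20, 2, 4],
    "G": [4, 20, 0, 0, 2, 4, 6, 0, 12, 2],
    "C": [2, 0, 1, 20, 1, 11, 6, 0, 2, 11],
}

OPTIONS = {
    "a": "NGTCNNNTNN",
    "b": "AGTCACNTGC",
    "c": "CACCTANCTG",
    "d": "ANNNACGNGC",
}


def disagreements(option, counts):
    """Return the 1-based positions where an option differs from the majority base."""
    bad = []
    for i, letter in enumerate(option):
        column = {base: row[i] for base, row in counts.items()}
        top = max(column.values())
        winners = [base for base, n in column.items() if n == top]
        # Assumed convention: a tie for first place is expected to be N.
        expected = winners[0] if len(winners) == 1 else "N"
        if letter != expected:
            bad.append(i + 1)
    return bad


for label, seq in OPTIONS.items():
    print(label, seq, disagreements(seq, COUNTS))
```

Under these assumptions only option (b) returns an empty list; the other options each differ at five or more positions.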

Conclusion

Interpreting nucleotide frequency tables is a fundamental bioinformatics skill. By selecting the most frequent base at each position and using N only for ambiguous positions, we derive the correct consensus sequence:

⭐ AGTCACNTGC

This approach helps identify regulatory motifs in DNA, predict transcription factor binding sites, and analyze genetic control networks.
