The Genetic Code
Statement of the Problem - Cracking the Genetic Code
From the double-helix model of DNA it was possible to visualize a mechanism for the replication of DNA, but it was not easy to visualize the synthesis of proteins. When Jacob, Monod and Lwoff proposed that RNA was an intermediate, it became easy to visualize the transcription of RNA from DNA. Translation was a much larger step.
Translation involves the transfer of information from a four-letter alphabet to a twenty-letter alphabet. Until the code was fully solved, we did not know the exact number of amino acids in proteins, because of the great amount of posttranslational modification.
The coding capacity of some systems is shown in the following table:
| # of letters | # of letters/word | formula | # of words |
| --- | --- | --- | --- |
| 4 | 1 | 4**1 | 4 |
| 4 | 2 | 4**2 | 16 |
| 4 | 3 | 4**3 | 64 |
| 26 | 1 | 26**1 | 26 |
| 26 | 2 | 26**2 | 676 |
| 26 | 3 | 26**3 | 17,576 |
| 26 (a) | 4 | 26**4 | 456,976 |
| 26 | 5 | 26**5 | 11,881,376 |
| 20 (b) | 100 | 20**100 | 1.27 x 10**130 |
a Since the "American Heritage Dictionary" contains only 200,000 words, fewer than the 456,976 possible four-letter words, we can conclude that there is no need for five-letter words.
b If we take this analogy to the next level and define a protein as a word of 100 amino acids, then there would be 20**100 (about 1.27 x 10**130) possible proteins.
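The entries in the table can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name coding_capacity and the cutoff for switching to scientific notation are illustrative choices, not part of the original notes.

```python
def coding_capacity(alphabet_size: int, word_length: int) -> int:
    """Number of distinct words of a fixed length over an alphabet."""
    return alphabet_size ** word_length


# (alphabet size, word length) pairs taken from the table above
rows = [(4, 1), (4, 2), (4, 3),
        (26, 1), (26, 2), (26, 3), (26, 4), (26, 5),
        (20, 100)]

for letters, length in rows:
    words = coding_capacity(letters, length)
    # Switch to scientific notation once the count becomes unwieldy
    shown = f"{words:,}" if words < 10**12 else f"{words:.2e}"
    print(f"{letters}**{length} = {shown}")
```

Python integers have arbitrary precision, so 20**100 is computed exactly and only rounded when it is printed.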
Conclusion: If the code were a doublet, there would not be sufficient information to code for the 20 amino acids (4**2 = 16). If the code were a triplet, there would be more code words than amino acids (4**3 = 64), a redundancy that is called degeneracy. Francis Crick, using genetic analysis, and H. Gobind Khorana, using molecular biology, independently proved that the code was triplet.
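The same reasoning can be written as a small search for the shortest word length that covers a given number of meanings. This is a minimal sketch: the name min_word_length is hypothetical, and the surplus is counted simply as code words beyond one per amino acid, without accounting for any other signals the code may carry.

```python
def min_word_length(alphabet_size: int, n_meanings: int) -> int:
    """Smallest word length whose capacity covers n_meanings.

    Assumes alphabet_size >= 2, otherwise the loop would never end.
    """
    length = 1
    while alphabet_size ** length < n_meanings:
        length += 1
    return length


NUCLEOTIDES = 4    # letters in the nucleic acid alphabet
AMINO_ACIDS = 20   # letters in the protein alphabet

codon_length = min_word_length(NUCLEOTIDES, AMINO_ACIDS)
codons = NUCLEOTIDES ** codon_length

print(codon_length)          # shortest word length that suffices
print(codons - AMINO_ACIDS)  # code words beyond one per amino acid
```

Applied to the dictionary analogy, min_word_length(26, 200000) gives the shortest word length that could hold every entry in a 200,000-word dictionary.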
Text: iGenetics by Peter J. Russell
This web site is provided for instruction in Botany and Zoology 342
by Kenneth G. Wilson,
Professor of Botany
Miami University
wilsonkg@muohio.edu