Statement of the Problem - Cracking the Genetic Code
From the double-helix model of DNA it was possible
to visualize a model for replication of the DNA, but it was not easy to visualize
the synthesis of proteins. When Jacob, Monod and Lwoff proposed that RNA was
an intermediate, it became easy to visualize the transcription of RNA from DNA.
Translation was a much larger step.
Translation involves the transfer of information
from a four-letter alphabet to a twenty-letter alphabet. Until the code was fully
solved, we did not know the exact number of amino acids in proteins, because of the
great amount of post-translational modification.
The coding capacity of some systems is shown
in the following table:
# of letters | # of letters/word | Formula | # of words
-------------|-------------------|---------|----------------
4            | 1                 | 4**1    | 4
4            | 2                 | 4**2    | 16
4            | 3                 | 4**3    | 64
26           | 1                 | 26**1   | 26
26           | 2                 | 26**2   | 676
26           | 3                 | 26**3   | 17,576
26 (a)       | 4                 | 26**4   | 456,976
26           | 5                 | 26**5   | 11,881,376
20 (b)       | 100               | 20**100 | 1.27 x 10**130
(a) Since the "American Heritage
Dictionary" contains only 200,000 words, the 456,976 possible four-letter words
would be more than enough, and we can conclude that there is no need
for five-letter words.
(b) If we take this line of
analogy to the next level and define the protein as a word of 100 amino acids,
then there would be 20**100 (or 1.27 x 10**130) possible proteins.
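
The arithmetic behind the table can be checked with a short Python sketch. The function name coding_capacity is introduced here for illustration only; it simply raises the alphabet size to the word length, as in the "formula" column.

```python
def coding_capacity(alphabet_size: int, word_length: int) -> int:
    """Number of distinct words of a fixed length over an alphabet.

    Hypothetical helper (not from the original notes): each of the
    word_length positions can hold any of alphabet_size letters.
    """
    return alphabet_size ** word_length


# (alphabet size, word length) pairs taken from the table above.
rows = [(4, 1), (4, 2), (4, 3), (26, 1), (26, 2), (26, 3), (26, 4), (26, 5)]
for letters, length in rows:
    print(f"{letters}**{length} = {coding_capacity(letters, length):,}")

# A "word" of 100 amino acids drawn from a 20-letter alphabet.
proteins = coding_capacity(20, 100)
print(f"20**100 = {proteins:.2e}")
```

Python integers have arbitrary precision, so 20**100 is computed exactly and only rounded when it is formatted for display.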
Conclusion: If the code were a doublet, there
would be only 16 possible words, which is not enough to code for the 20 amino acids. If the
code were a triplet, there would be 64 possible words, more than are needed; this redundancy is called degeneracy.
Francis Crick, using genetic analysis, and H. Gobind Khorana, using molecular
biology, independently proved that the code is a triplet.
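
The counting argument in the conclusion can be sketched as a search for the shortest word length that gives at least as many words as there are items to encode. The helper min_word_length is a hypothetical name used only for this illustration, and the sketch counts only the 20 amino acids mentioned in the text.

```python
def min_word_length(alphabet_size: int, meanings: int) -> int:
    """Smallest fixed word length whose word count covers `meanings`.

    Hypothetical helper for illustration; requires an alphabet of at
    least two letters so that the word count grows with length.
    """
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")
    length = 1
    while alphabet_size ** length < meanings:
        length += 1
    return length


AMINO_ACIDS = 20  # number of amino acids stated in the text
NUCLEOTIDES = 4   # size of the nucleic acid alphabet

codon_length = min_word_length(NUCLEOTIDES, AMINO_ACIDS)
surplus = NUCLEOTIDES ** codon_length - AMINO_ACIDS

print(f"minimum word length: {codon_length}")
print(f"words beyond the {AMINO_ACIDS} needed: {surplus}")
```

The same function applied to a 26-letter alphabet and a 200,000-word dictionary gives the four-letter result from footnote (a).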