code hypothesis

The Genetic Code

Statement of the Problem - Cracking the Genetic Code

From the double-helix model of DNA it was possible to visualize a model for the replication of DNA, but it was not easy to visualize the synthesis of proteins. When Jacob, Monod, and Lwoff proposed that RNA was an intermediate, it became easy to visualize the transcription of RNA from DNA. Translation was a much larger step.

Translation involves the transfer of information from a four-letter alphabet (the nucleotides) to a twenty-letter alphabet (the amino acids). Until the code was solved, we did not know the exact number of amino acids in proteins, because of the great amount of post-translational modification.

The coding capacity of some systems is shown in the following table:

Letters in alphabet	Letters per word	Formula	Number of words
4	1	4¹	4
4	2	4²	16
4	3	4³	64
26	1	26¹	26
26	2	26²	676
26	3	26³	17,576
26a	4	26⁴	456,976
26	5	26⁵	11,881,376
20b	100	20¹⁰⁰	1.27 × 10¹³⁰

a Since the "American Heritage Dictionary" contains only 200,000 words, fewer than the 456,976 possible four-letter words, we can conclude that there is no need for five-letter words.

b If we take this analogy to the next level and define a protein as a word of 100 amino acids, then there would be 20¹⁰⁰ (about 1.27 × 10¹³⁰) possible proteins.
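The counts in the table all follow from one rule: an alphabet of n letters gives n^k distinct words of length k. A minimal Python sketch of that arithmetic is given here; the function name coding_capacity and the list of rows are illustrative choices, not part of the original course material.

```python
def coding_capacity(letters: int, word_length: int) -> int:
    """Number of distinct words of a given length over an alphabet."""
    return letters ** word_length


# (alphabet size, word length) pairs taken from the table above
rows = [(4, 1), (4, 2), (4, 3),
        (26, 1), (26, 2), (26, 3), (26, 4), (26, 5),
        (20, 100)]

for letters, word_length in rows:
    words = coding_capacity(letters, word_length)
    # Print very large counts in scientific notation, the rest with commas
    shown = f"{words:,}" if words < 10**12 else f"{words:.2e}"
    print(f"{letters}\t{word_length}\t{letters}^{word_length}\t{shown}")
```

Python integers have arbitrary precision, so 20^100 is computed exactly before it is rounded for display.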

Conclusion: If the code were a doublet, there would be only 16 words, which is not sufficient to code for the 20 amino acids. If the code were a triplet, there would be 64 words, more than are needed; this redundancy is called degeneracy. Francis Crick, using genetic analysis, and H. Gobind Khorana, using molecular biology, independently proved that the code is a triplet.
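The counting argument in the conclusion can be checked with a short Python sketch. The name min_word_length is hypothetical, and the sketch only counts amino acids; it sets aside any other signals the code might have to carry, which this page does not discuss.

```python
def min_word_length(letters: int, meanings: int) -> int:
    """Smallest word length whose word count covers all meanings."""
    length = 1
    while letters ** length < meanings:
        length += 1
    return length


NUCLEOTIDES = 4    # letters in the nucleic acid alphabet
AMINO_ACIDS = 20   # meanings the code must distinguish

codon_length = min_word_length(NUCLEOTIDES, AMINO_ACIDS)
codons = NUCLEOTIDES ** codon_length
spare = codons - AMINO_ACIDS  # words beyond the minimum: the degeneracy

print(f"doublet words: {NUCLEOTIDES ** 2} (fewer than {AMINO_ACIDS})")
print(f"minimum word length: {codon_length}")
print(f"triplet words: {codons}, spare words: {spare}")
```

Under these assumptions the minimum word length is 3, and the triplet code leaves 44 words beyond the 20 that are strictly required.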

Text: iGenetics by Peter J. Russell

This web site is provided for instruction in Botany and Zoology 342

by Kenneth G. Wilson,
Professor of Botany
Miami University
wilsonkg@muohio.edu
