DNA Packaging in Eukaryotes
Eukaryotic cells are structurally more complex than prokaryotic cells, and they generally carry more genetic material, organized in a more elaborate way. How is all this genetic material packaged into the tiny nucleus of the eukaryotic cell? What makes eukaryotic genomes much larger than prokaryotic genomes? What is a genome anyway?
Genomes
The genome is the total sum of an organism's genetic information. It is organized as double-stranded DNA, except in viruses, which may have single-stranded DNA, single-stranded RNA or double-stranded RNA genomes.
In many viruses and prokaryotes, the genome is a single linear or circular molecule.
In eukaryotes, the nuclear genome consists of linear chromosomes (usually as a diploid set) and the mitochondrial and chloroplast (in plants) genomes are small circular DNA molecules.
Ever since the first general surveys of nuclear DNA content were carried out in the early 1950s, it has been apparent that eukaryotic genome sizes vary enormously and that this is unrelated to intuitive ideas of morphological complexity. This discrepancy between genome size and complexity remains clear more than half a century later, with genome sizes now available for nearly 9,000 species of animals and plants. In prokaryotes, genome size and gene number are strongly correlated, but in eukaryotes the vast majority of nuclear DNA is non-coding.
Compactness of the yeast, fruit-fly and human genomes
Feature | Yeast | Fruit fly | Human |
---|---|---|---|
Gene density (average number per Mb) | 479 | 76 | 11 |
Introns per gene (average) | 0.04 | 3 | 9 |
Amount of the genome that is taken up by genome-wide repeats | 3.4% | 12% | 44% |
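The gene densities in the table can be turned into a more intuitive number: the average stretch of genome that holds one gene. The following is a minimal sketch of that arithmetic; the function name `mean_bp_per_gene` is made up for illustration, and the densities are copied from the table.

```python
# Gene densities (average number of genes per Mb) copied from the table above.
GENE_DENSITY_PER_MB = {"Yeast": 479, "Fruit fly": 76, "Human": 11}

def mean_bp_per_gene(genes_per_mb: float) -> float:
    """Average length of genome (in bp) that contains one gene."""
    return 1_000_000 / genes_per_mb

for organism, density in GENE_DENSITY_PER_MB.items():
    print(f"{organism}: about {mean_bp_per_gene(density):,.0f} bp per gene")
```

By this rough measure, a yeast gene occupies on average about 2,000 bp of genome, while a human gene sits in roughly 90,000 bp, most of which is introns, repeats and other non-coding DNA.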
Extensive variation in genome size within and among the main groups of life.
As further clarification, when scientists talk about the eukaryotic genome, they are usually referring to the haploid genome: the complete set of DNA in a single haploid nucleus, such as that of a sperm or egg. So, saying that the human genome is approximately 3 billion base pairs (bp) long is the same as saying that each set of chromosomes is 3 billion bp long. Each of our diploid cells in fact contains twice that number of base pairs. Moreover, scientists are usually referring only to the DNA in a cell's nucleus, unless they state otherwise. Nearly all eukaryotic cells, however, also have mitochondrial genomes, and many additionally contain chloroplast genomes. In humans, the mitochondrial genome has only about 16,500 base pairs, a tiny fraction of the length of the 3 billion bp nuclear genome.
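The relationships between these numbers can be checked with a few lines of arithmetic. This is only an illustrative sketch using the rounded figures quoted in the text; the variable names are made up for the example.

```python
HAPLOID_NUCLEAR_BP = 3_000_000_000  # approximate haploid human genome (from the text)
MITOCHONDRIAL_BP = 16_500           # approximate human mitochondrial genome (from the text)

# A diploid cell carries two haploid sets of chromosomes.
diploid_nuclear_bp = 2 * HAPLOID_NUCLEAR_BP

# Size of the mitochondrial genome relative to one haploid nuclear genome.
mito_fraction = MITOCHONDRIAL_BP / HAPLOID_NUCLEAR_BP

print(f"Diploid nuclear DNA: {diploid_nuclear_bp:,} bp")
print(f"Mitochondrial / haploid nuclear: {mito_fraction:.1e}")
```

With these rounded inputs, the mitochondrial genome comes to only a few millionths of the length of the haploid nuclear genome.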
If you were to take all the DNA from the nucleus of a single human cell and stretch it out end to end, it would be approximately two meters long. So it is truly remarkable that so much DNA can be compressed into the microscopic space of the nucleus of a cell.
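The two-meter figure can be reproduced with a back-of-the-envelope calculation over the whole diploid DNA content of a cell. The sketch below assumes the standard value of about 0.34 nm of length per base pair for B-form DNA, which is not stated in the text; the function name `dna_length_m` is made up for illustration.

```python
DIPLOID_BP = 6_000_000_000  # two copies of the ~3 billion bp haploid genome
RISE_PER_BP_NM = 0.34       # assumption: length added per base pair in B-form DNA

def dna_length_m(base_pairs: int, rise_nm: float = RISE_PER_BP_NM) -> float:
    """Contour length in meters of DNA with the given number of base pairs."""
    return base_pairs * rise_nm * 1e-9  # convert nanometers to meters

print(f"{dna_length_m(DIPLOID_BP):.2f} m")
```

Under these assumptions the result is close to two meters, spread across all the chromosomes of the cell rather than contained in any single DNA molecule.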
We will start from the very beginning of the packaging: the actual DNA molecule. It is first wrapped nearly twice (about 1.65 turns, or roughly 147 bp) around a cluster of eight protein molecules called histones. This structure, a cluster of histones with the DNA wound around it, is called a nucleosome. But this packing is not nearly enough to squeeze the enormous length of DNA into the nucleus. The nucleosomes are subsequently coiled together, and then this coil is arranged in tightly packed loops. This incredibly dense mass of loops and coils is the condensed chromatin that you would see in the nucleus of a cell.
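To get a feel for the scale of this first packaging level, one can estimate how many nucleosomes a single nucleus needs. The sketch below is a rough estimate only: it assumes an average spacing of about 200 bp of DNA per nucleosome (the wrapped DNA plus the linker to the next nucleosome), a commonly quoted textbook value that varies between cell types and is not given in the text.

```python
DIPLOID_BP = 6_000_000_000  # approximate DNA content of a diploid human nucleus
BP_PER_NUCLEOSOME = 200     # assumption: wrapped DNA plus linker, averaged

def estimate_nucleosomes(total_bp: int, bp_per_nucleosome: int = BP_PER_NUCLEOSOME) -> int:
    """Rough number of nucleosomes needed to package total_bp of DNA."""
    return total_bp // bp_per_nucleosome

print(f"about {estimate_nucleosomes(DIPLOID_BP):,} nucleosomes per diploid nucleus")
```

Under these assumptions the estimate is on the order of tens of millions of nucleosomes per nucleus.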
Chromatin has a highly complex structure with several levels of organization.
Large eukaryotic genomes
The large size of eukaryotic genomes is due in large part to repeated DNA. The repeated DNA falls into two categories: 1) tandemly repeated DNA and 2) interspersed repeated DNA.
Tandemly repeated DNA (10-15% of mammalian genomes) is made up of rows of many copies of the same sequence. The repeated unit ranges from 1 to 2000 base pairs (bp) in length. Often the repeated unit is shorter than 10 bp, in which case it is referred to as simple-sequence repeated DNA or satellite DNA (because it separates into distinct "satellite" bands during density-gradient centrifugation). These repeats may give special physical properties to some stretches of the chromosome. Centromeres and telomeres are rich in simple-sequence repeated DNA. At a given site, the amount of simple-sequence repeated DNA may vary greatly.
In DNA minisatellites, the repeat arrays may vary between 100 and 100,000 bp in length. DNA fingerprinting distinguishes individuals by analyzing microsatellites (repeats of 1 to 4 bp), whose arrays often differ in length by 10 to 100 bp. A number of human diseases are caused by triplet repeat amplification. In Huntington's disease, for example, 11 to 34 repeats of CAG in the Huntington's disease gene is normal, but ~50 to 100 repeats results in the disease.
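Counting the copies of a repeated unit in a tandem array, such as CAG, is easy to sketch in code. The example below is a toy illustration, not a diagnostic or fingerprinting method: the function name `longest_tandem_run` and the two test sequences are made up, and real repeat sizing works on sequencing or fragment-length data rather than clean strings.

```python
import re

def longest_tandem_run(sequence: str, unit: str = "CAG") -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    unit = unit.upper()
    # One or more consecutive copies of the unit, matched greedily.
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)

# Hypothetical toy sequences: a short array and an expanded array of CAG.
toy_short = "ATG" + "CAG" * 20 + "CCGCCA"
toy_long = "ATG" + "CAG" * 60 + "CCGCCA"

print(longest_tandem_run(toy_short), longest_tandem_run(toy_long))
```

The same function works for any repeat unit, for example `longest_tandem_run(seq, "CA")` for a two-base microsatellite.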
Interspersed repeated DNA makes up 25 to 40% of most mammalian genomes. The repeated units are hundreds to thousands of bp long and are scattered throughout the genome rather than arranged in rows. Many interspersed repeated DNA sequences are transposable elements.