Why do all cells have the complete genome? [Pharyngula]


Ophelia has summarized a series of science questions Richard Dawkins asked on Twitter. Hey, I thought, I have answers to lots of these — he probably does, too — so I thought I’d address one of them. Maybe I can take a stab at some of the others another time.


I like this one, anyway:



Why do cells have the complete genome instead of just the part that’s needed for their function? Liver cells have muscle-making genes etc.



My short answer: because excising bits of the genome has a high cost and little benefit, and because essentially all of the key exaptations for multicellularity evolved in single-celled organisms, where modifying the DNA archive would have serious consequences for all the daughter cells.


This is an interesting issue, though: different kinds of cells in the same organism express genes that are qualitatively and quantitatively different. Here’s a set of nice graphs in which the relative fraction of the transcripts in each cell type that come from different classes of genes was measured. Notice in the list of biological processes that a lot of them, such as the genes involved in transcription and translation and metabolism, are going to be used in all cells, but some, such as neuron-specific or testis-specific genes, are only going to be expressed in some cells.


(A) Pie graphs show estimated fraction of cellular transcripts deriving from genes belonging to a set of top-level Gene Ontology Biological Process categories for 7 human tissues and 1 cell line. Fractions were estimated from read density (RPKM) of Ensembl transcripts for each gene. Names of categories, distribution of transcriptome fraction across the samples (each line is a sample), and the coefficients of variation are shown at right. Biological processes with significantly higher or lower densities in individual tissues and cell lines are denoted by arrows. (B) FRACT analysis of sub-categories of the top-level ‘Development’ category in brain and testes.
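For anyone wondering how a pie slice like that gets computed from "read density (RPKM)", here is a minimal sketch in Python. The RPKM formula is the standard one (reads per kilobase of transcript per million mapped reads). Everything else is an assumption for illustration: the gene names, read counts, and category labels are made up, and treating a category's share of summed RPKM as its share of cellular transcripts is a simplification, not necessarily the exact procedure the paper used.

```python
from collections import defaultdict


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def category_fractions(gene_rpkm, gene_category):
    """Share of total RPKM contributed by each category of genes."""
    totals = defaultdict(float)
    for gene, value in gene_rpkm.items():
        totals[gene_category[gene]] += value
    grand_total = sum(totals.values())
    return {category: value / grand_total for category, value in totals.items()}


# Hypothetical toy data: gene -> (mapped reads, transcript length in bp).
TOTAL_MAPPED_READS = 10_000_000
toy_counts = {"geneA": (500, 2000), "geneB": (1000, 2000), "geneC": (250, 1000)}
toy_category = {
    "geneA": "metabolism",
    "geneB": "metabolism",
    "geneC": "neuron-specific",
}

toy_rpkm = {
    gene: rpkm(reads, length, TOTAL_MAPPED_READS)
    for gene, (reads, length) in toy_counts.items()
}
fractions = category_fractions(toy_rpkm, toy_category)
print(fractions)
```

Dividing by transcript length matters because a long transcript collects more reads than a short one at the same abundance. Without that correction the pie would overstate the categories that happen to contain long genes.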




It also gets complicated because some genes are found in very different forms: there is a kind of universal myosin, myosin I, for instance, that is expressed in all cells as part of the intracellular transport machinery, and then there are the muscle forms of another variant, myosin II, that are expressed only as a part of the contractile machinery in muscle. So you might think that it would be more efficient for a skin cell to simply cut out and throw away muscle myosin II, since it’ll never use it, and keep myosin I.


But how does the cell determine which genes it will never use? Where does it draw the line? Take all those testis development genes, for example: I never used many of them until I hit puberty. Wouldn’t it have been terrible if my young toddler testicles threw out a set of unused genes, and then a dozen years later discovered that they had a use, after all? There are a great many genes regulated by timing and signals, and as can be seen in that figure above, every cell has a different expression profile. There are a variety of cells in my skin that are busy replicating and making keratin proteins as a matter of course, but they only switch on cellular repair mechanisms if I cut myself. There are also many genes that get reused in complicated ways: the gene even-skipped is first switched on as part of the segment forming process in flies, is switched on again later in making neuroblasts, and later still is expressed in axons during pathfinding. Cells would rather recycle genes than throw them away.


These properties are not unique to us mammals, either. Bacteria regulate which genes are turned off and on, too — they change their biochemical behavior in response to signals in their environment. The ability to switch on and switch off genes, without eliminating the DNA, is a solved problem. Life figured that one out a few billion years ago. Key molecules required for multicellular patterns of gene expression first evolved in bacteria — they worked out how to have a cell with the same genetic material behave differently in different circumstances. We came about ready made with a toolkit equipped to have one set of genes turned on in livers, and a different set turned on in muscles, easy.


But, you might think, wouldn’t it be so much more cost-efficient if cells in multicellular organisms just got rid of genes they’d never turn on in their lifetime, once they’ve committed to a certain tissue type? Muscle cells will never make sperm recognition proteins, and liver cells won’t ever have to lift weights, and you could probably cut the amount of DNA in differentiated cells in half with no effect on function.


But that’s penny-wise accounting. In bacteria, only about 2% of the cell’s energy budget is invested in replication, so removing a bit of DNA here and there is only going to shave a tiny amount off the cost of cell division. On the other hand, an amazing 75% is spent on transcribing and translating genes, so efficient mechanisms for simply turning off unused genes reap huge savings for the cell. Evolving a complex process to pare away unused DNA in terminally differentiated cells simply does not make sense energetically, while simply taking advantage of an already fully implemented and refined process for regulating gene expression…heck, that’s what evolution does best, reusing what’s already there.
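To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 2% and 75% figures are the bacterial ones quoted above, and the "cut the DNA in half" scenario is the one from the earlier paragraph. The function name is made up, and the assumption that replication cost scales linearly with the amount of DNA is mine, so treat the result as an upper bound on the idea rather than a measurement.

```python
# Shares of the cell's energy budget, in percent (bacterial figures from the text).
REPLICATION_PCT = 2.0   # spent on replicating DNA
EXPRESSION_PCT = 75.0   # spent on transcribing and translating genes


def max_saving_from_trimming(replication_pct, dna_fraction_removed):
    """Upper bound on the budget saved by deleting DNA.

    Assumes replication cost is proportional to genome length,
    and ignores the cost of the excision machinery itself.
    """
    return replication_pct * dna_fraction_removed


# Throwing away half the genome saves at most half of the replication share.
trim_saving_pct = max_saving_from_trimming(REPLICATION_PCT, 0.5)

# How much bigger the expression budget is than the replication budget.
expression_to_replication = EXPRESSION_PCT / REPLICATION_PCT

print(f"Best case from deleting half the DNA: {trim_saving_pct}% of the budget")
print(f"Expression costs {expression_to_replication}x as much as replication")
```

Under these assumptions, discarding half the genome buys back at most 1% of the energy budget, while the pool that gene regulation works on is 37.5 times larger than the entire replication cost.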


By the way, not all cells carry the complete genome: there are also a few cases where the DNA of an organism is modified — the CRISPR system in bacteria, and the somatic recombination system used in vertebrates to generate diverse immunoglobulins. In both of those cases, though, it’s not a mechanism to cut away unused DNA. It’s a specialized process to create variation during an organism’s lifetime to cope with environmental challenges.




Lane N, Martin W. (2010) The energetics of genome complexity. Nature 467(7318):929-34.


Ramsköld D, Wang ET, Burge CB, Sandberg R. (2009) An abundance of ubiquitously expressed genes revealed by tissue transcriptome sequence data. PLoS Comput Biol 5(12):e1000598. doi: 10.1371/journal.pcbi.1000598.


