Evolution has sometimes behaved more like a weekend inventor quickly assembling devices from old parts than a high-tech engineer painstakingly designing new machines from scratch.
Molecular biologists have found dramatic evidence that at least some of the genes that govern human cells arose through an evolutionary process that duplicated segments of pre-existing genes and reshuffled them into new combinations that perform entirely different functions.
In other words, the extraordinary molecular complexity needed to make the full set of human genes could have been achieved through a vastly simpler, shorter series of events than many scientists had supposed.
Some estimate that evolution by module shuffling could have proceeded a million to 100 million times faster than by the previously supposed process of accumulating thousands of small mutations.
The findings involve human genes, but biologists say the same thing probably happened in the evolution of many species, if not most.
The discovery of modular genes, reported in the May 17 issue of Science, released today, confirms a speculation made in the late 1970s by Walter Gilbert, then of Harvard University. Gilbert, who won a Nobel Prize in 1980, offered segment shuffling as an explanation for the then newly discovered nonsense segments, called introns, that interrupt a gene's linear sequence of sense-making segments, called exons.
The principal components of every living cell are thousands of different kinds of protein molecules, some acting as structural members, others as enzymes, hormones or other agents of metabolism. The protein molecules also have distinct components -- amino acids, each with its own shape and chemical properties -- and their sequence determines the molecule's final form.
The creation of every protein molecule is determined by a gene residing in the nucleus of a cell. A gene is, in effect, a string of instructions made up of thousands of genetic "bases" -- the equivalent of letters, which the cell reads in three-letter words.
Gilbert suggested that the gene's blocks of gibberish serve to separate blocks of useful code, and that each of those useful segments or modules governs one component of the protein molecule.
The gibberish segments, researchers found then, are edited out of the copy of the gene's instructions that controls the cell's working machinery. The gene's linear instructions then are obeyed, one word at a time, linking the specified amino acids into a long chain to form a new molecule.
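The editing and reading steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene, its segment boundaries and its sequences are made up, DNA letters are used throughout for simplicity, and the codon table lists only the five standard-code "words" the example needs.

```python
# Partial codon table: only the codons used below (standard genetic code).
CODON_TABLE = {
    "ATG": "Met", "GCT": "Ala", "AAA": "Lys", "GGT": "Gly", "TGG": "Trp",
}

def splice(segments):
    """Join the sense-making segments in order, dropping every gibberish one."""
    return "".join(seq for kind, seq in segments if kind == "exon")

def translate(coding_sequence):
    """Read the edited code three bases at a time and look up each amino acid."""
    codons = [coding_sequence[i:i + 3]
              for i in range(0, len(coding_sequence) - 2, 3)]
    return [CODON_TABLE[codon] for codon in codons]

# Hypothetical gene: two short exons interrupted by one longer intron.
gene = [
    ("exon", "ATGGCT"),
    ("intron", "GTAAGTCCTTAACCTTTCAG"),
    ("exon", "AAAGGTTGG"),
]

message = splice(gene)        # edited copy with the intron removed
protein = translate(message)  # chain of amino acids, one per three-base word
print(message)
print(protein)
```

The intron never reaches the translation step, so its content has no effect on the resulting chain of amino acids.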
Gilbert noted that the gibberish segments are much longer than the adjoining sense-making segments. He proposed that when genes accidentally break apart -- as happens naturally -- the break is therefore much more likely to occur in the gibberish, leaving functional modules on either side.
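Gilbert's length argument amounts to simple arithmetic, sketched here in Python. The segment lengths are hypothetical, chosen only for illustration, and the sketch assumes a break is equally likely at any base along the gene, which the article does not state.

```python
def intron_break_probability(segments):
    """Chance that one uniformly random break lands inside an intron."""
    total = sum(length for _, length in segments)
    intron_total = sum(length for kind, length in segments if kind == "intron")
    return intron_total / total

# Hypothetical lengths in bases: each intron is nine times as long as an exon.
gene_lengths = [
    ("exon", 150),
    ("intron", 1350),
    ("exon", 150),
    ("intron", 1350),
]

p = intron_break_probability(gene_lengths)
print(p)
```

Under these assumed lengths, nine breaks in ten would fall in gibberish and leave the functional modules on either side intact.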
When the parts recombine into a new gene -- which also happens naturally -- the gene's product will be a new form of protein molecule made up of tried-and-true components from other proteins.
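A toy model of this break-and-recombine step, again in Python, shows how a new gene can be assembled from intact modules. The module names A1, A2, B1 and B2 are placeholders, and the functions are hypothetical helpers written for this sketch, not a description of the cell's actual enzymes.

```python
INTRON = "intron"

def break_at_intron(gene, index):
    """Break a gene at the intron found at position `index`; return both pieces."""
    if gene[index] != INTRON:
        raise ValueError("this sketch only breaks genes inside an intron")
    return gene[:index], gene[index + 1:]

def recombine(left_piece, right_piece):
    """Join two pieces, with an intron marking the new junction."""
    return left_piece + [INTRON] + right_piece

def modules(gene):
    """List the sense-making modules of a gene in order."""
    return [segment for segment in gene if segment != INTRON]

# Hypothetical genes, each with two modules separated by an intron.
gene_a = ["A1", INTRON, "A2"]
gene_b = ["B1", INTRON, "B2"]

a_left, _ = break_at_intron(gene_a, 1)
_, b_right = break_at_intron(gene_b, 1)
new_gene = recombine(a_left, b_right)
print(modules(new_gene))
```

Because both breaks fall in introns, the new gene pairs module A1 with module B2 without damaging either one.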
Gilbert, who left Harvard to lead a Swiss-based biotechnology firm, Biogen, and is now on its board, called the Dallas finding "a dramatic example" confirming his early speculation.
Today's report is by four molecular biologists at the University of Texas Health Science Center and the Southwestern Medical School, both in Dallas. They are Thomas C. Südhof, Joseph L. Goldstein, Michael S. Brown and David W. Russell.
The four reported their discovery that several functionally unrelated human genes contain similar sense-making segments. The similarities are not total but are close enough that the conclusion of a common origin is inescapable, the scientists say. It appears that once the gene segments were duplicated and shuffled, each underwent minor mutations that did not alter the structure it encodes.
The scientists described one gene as a mosaic of segments originally derived from several other genes. The mosaic gene provides the code for a protein molecule that lines pits on the outer surfaces of cells. The protein is a receptor for another protein, called low-density lipoprotein or LDL, that carries cholesterol in the blood.
The Dallas researchers worked out the genetic code for the LDL receptor -- a string of some 45,000 genetic bases or "letters" -- and found that it consisted of 18 sense-making segments.
A computer compared the sequences with those known for other human genes and found that 13 of the segments had close matches in other genes. Five were close to segments in an immune system protein called C9 complement. The other eight corresponded to segments in a protein that is a precursor to epidermal growth factor, and three of these eight also were shared by proteins in blood that promote clotting.
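The article does not say how the computer comparison was done. As a rough sketch of the general idea, the Python below scores a segment against a small library of known sequences by the fraction of matching positions and reports the closest entry above a cutoff. The sequences, the library names and the 0.8 threshold are all made up for illustration, and the sketch ignores insertions and deletions, which a real comparison would have to handle.

```python
def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("this sketch compares equal-length sequences only")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def best_match(segment, library, threshold=0.8):
    """Return (name, score) of the closest library entry, or None if below threshold."""
    name, score = max(
        ((n, identity(segment, s)) for n, s in library.items()),
        key=lambda pair: pair[1],
    )
    return (name, score) if score >= threshold else None

# Made-up sequences; the names are placeholders, not real gene data.
library = {
    "known_gene_X": "ATGGCTAAAG",
    "known_gene_Y": "TTCCGGAATC",
}

close_hit = best_match("ATGGCTTAAG", library)  # differs from X at one position
no_hit = best_match("CACACACACA", library)     # resembles neither entry
print(close_hit)
print(no_hit)
```

A segment that differs from a known sequence at only a few positions, as in the first query, is the kind of "close but not total" similarity the researchers describe.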
The four scientists suggested the genes arose when the ancestral gene was duplicated -- something genes do readily -- and broken up. Because cell nuclei, in which genes reside, contain enzymes that splice gene pieces, the itinerant fragments easily could have recombined in new patterns.