Scientists from three countries announced yesterday that they had jointly determined the piece-by-piece order of virtually all of the 34 million chemical "letters" that spell out the genetic code of an entire human chromosome.
The achievement provides the first complete molecular script for a single human chromosome, 23 pairs of which carry the estimated 80,000 genes that provide the instructions for constructing a human body.
Researchers around the world praised the advance as a technical tour de force, a triumph of international scientific collaboration, and evidence that the publicly funded Human Genome Project is on track to sequence, by 2002, all of the 3.2 billion units of genetic code that together describe how to make a person.
Scientists hope that by noting "spelling errors" in the chromosomes of people with various diseases, they will be able to understand the molecular underpinnings of those ailments and develop new therapies. But first they must determine the normal sequences for each chromosome, as they now have for chromosome 22, the second smallest of the human chromosomes.
"For the hundreds of genes that are on this chromosome, we now have their anatomy," said Francis Collins, chief of the National Human Genome Research Institute, which funded most of the U.S. contribution to the project. "And for all the medical conditions [that are related to genes on that chromosome], we'll be that much further along understanding them."
About three dozen diseases are known to be caused by faulty genes on chromosome 22. Most are rare, such as DiGeorge syndrome, which involves abnormal development of the immune system. There is also indirect evidence that a gene on chromosome 22 may contribute to the risk of schizophrenia. But with scientists expecting to find a total of 700 or 800 genes embedded in the chromosome's code, scores of other diseases may eventually be shown to have their roots there.
"I can tell you it is no fun chasing down a disease gene when you're in territory no one has ever been in before," Collins said at a Washington news conference yesterday. For scientists hunting for important genes on chromosome 22, he said, having the complete sequence in hand is "a dream come true."
Chromosomal sequence analysis also may provide insights into human evolution, because large chunks of molecular text in human chromosomes are very similar to certain DNA chunks in animal chromosomes. A comparison of those molecular scripts could reveal how and when various organisms branched off from one another and diverged over millions of years.
"It's like seeing the surface or the landscape of a new planet for the first time," said Mark Patterson, an editor at Nature, the scientific journal that is publishing the chromosome report today. "It's allowing us to say something about its geography, and it's also allowing us to say something about its history, . . . how the chromosome evolved."
A chromosome, like a skein of yarn, is a single long strand of DNA folded and refolded into a plump, cinch-waisted bundle inside a cell. The DNA strand is made of four chemical subunits--the four "letters" of the genetic alphabet--strung together by the millions. Anywhere from 1,000 to a half-million of those letters typically make up a gene, and the order of the letters within the gene determines what the gene does in the body.
The exact meaning of chromosome 22's molecular sequence remains mostly a mystery. In some ways, it's like a long run-on sentence in a foreign language for which only a modest number of words have been translated into English. Scientists understand what the translated words mean, and they now know the remaining letters, but most of the text is still untranslated, so its meaning is unknown. Moreover, only about 3 percent of the script is believed to encode working genes; most of the rest is "junk" DNA, the function of which has yet to be determined.
But with chromosome 22's complete script in hand--and with similar texts already being translated in other organisms, such as the mouse and the fruit fly--scientists said it won't be long before they become fluent in the language of human genes and begin to make profound discoveries about embryo development, genetic diseases and aging.
The scientific report lists more than 200 scientists as co-authors, an astonishing number that Nature editors said was probably a record. Four institutions cooperated in the venture: the Sanger Center in Cambridge, England; Keio University School of Medicine in Tokyo; the University of Oklahoma in Norman; and Washington University in St. Louis.
Technically, the sequence is not 100 percent complete. Small gaps, constituting a total of about 3 percent of the chromosome's "letters," have proven impossible to decode, in some cases because they contain confusing strings of repeating motifs. And one specialized end of the chromosome, the DNA of which helps cells make another kind of genetic material called RNA, is not included in the new analysis.
But according to a 1996 international convention, a chromosome that is at least 95 percent sequenced and whose remaining gaps are small and well delineated can be deemed "complete." Various analyses showed that the completed sequence contains fewer than one error for every 50,000 letters.
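The arithmetic behind those figures can be sketched in a few lines of Python. The numbers are the rounded values reported in this article (about 34 million letters, gaps of about 3 percent, a 95 percent threshold, fewer than one error per 50,000 letters). The function and constant names are illustrative assumptions, and the convention's second test, that the remaining gaps be small and well delineated, is not modeled here.

```python
# Illustrative arithmetic only. Names are hypothetical; figures are the
# article's rounded values, not exact measurements.
TOTAL_LETTERS = 34_000_000    # approximate length of chromosome 22
GAP_FRACTION = 0.03           # share of letters that could not be decoded
COMPLETE_THRESHOLD = 0.95     # minimum sequenced share under the 1996 convention
LETTERS_PER_ERROR = 50_000    # reported rate: fewer than one error per this many letters


def sequenced_fraction(gap_fraction):
    """Share of the chromosome that has actually been decoded."""
    return 1.0 - gap_fraction


def meets_coverage_threshold(gap_fraction, threshold=COMPLETE_THRESHOLD):
    """Check only the coverage part of the 'complete' definition."""
    return sequenced_fraction(gap_fraction) >= threshold


def max_expected_errors(total_letters, gap_fraction,
                        letters_per_error=LETTERS_PER_ERROR):
    """Upper bound on errors implied by the reported error rate."""
    sequenced_letters = total_letters * sequenced_fraction(gap_fraction)
    return sequenced_letters / letters_per_error


if __name__ == "__main__":
    print(meets_coverage_threshold(GAP_FRACTION))
    print(round(max_expected_errors(TOTAL_LETTERS, GAP_FRACTION)))
```

With these rounded inputs, about 97 percent of the chromosome is sequenced, which clears the 95 percent bar, and the reported error rate implies an upper bound of roughly 660 wrong letters across the decoded portion.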
Researchers on the chromosome 22 project emphasized yesterday that the results of their work were made freely available to other scientists on the World Wide Web every day as they accumulated. And in stark contrast to some privately financed efforts--most notably, that of controversial geneticist and entrepreneur J. Craig Venter of Celera Genomics Corp. in Rockville--they said no one on the team had filed for patents on the genes they had found.
The Genetic Alphabet
Every cell of every living organism contains genetic material made up of the compounds adenine, thymine, cytosine and guanine, represented as A, T, C and G. The precise order of these compounds regulates every aspect of an organism's life, from conception to death.
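For readers who think in code, a stretch of DNA can be pictured as a string written in that four-letter alphabet. The sketch below is a minimal illustration, not the sequencing teams' software: the function name and the sample string are made up for the example, and real sequence files often use extra symbols (such as N for an undetermined letter) that this sketch deliberately rejects.

```python
from collections import Counter

# The four "letters" of the genetic alphabet named in the sidebar.
GENETIC_ALPHABET = "ATCG"


def count_letters(sequence):
    """Count how often each of A, T, C and G appears in a DNA string.

    Hypothetical helper for illustration; raises ValueError if the
    string contains anything outside the four-letter alphabet.
    """
    sequence = sequence.upper()
    unexpected = set(sequence) - set(GENETIC_ALPHABET)
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    counts = Counter(sequence)
    return {letter: counts[letter] for letter in GENETIC_ALPHABET}


if __name__ == "__main__":
    # A made-up seven-letter fragment, not real chromosome 22 data.
    print(count_letters("GATTACA"))
```

A real gene would be the same kind of string, only far longer: typically anywhere from 1,000 to a half-million letters, according to the figures above.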