The first genomes to be sequenced revealed something surprising: On a genetic level, we’re not that different from other species—even some very distantly related ones. What makes us human and them not? Biologist Greg Wray is learning that it’s not the genes that matter—it’s the way they are used.
Probably the last place you should look for Greg Wray is in his office. You might find him sitting in another professor’s guest chair, talking about sea urchins, baboons, or maybe lichen. He could be teaching a class about dinosaurs. Or perhaps he’s somewhere around the genome-sequencing facility he directs, a suite of high-powered equipment in the Biological Sciences Building that can read through the entire genetic code of an organism in less than a day. His assistant talks to him mostly through cell-phone texts.
What looks like distracted, omnivorous behavior is in fact a single-minded pursuit. Wray Ph.D. ’87, a fifty-one-year-old professor of biology at Duke, is after a big question surrounding the origin of species, one that has led him to collaborate with dozens of scientists while studying organisms as diverse as great white sharks and fire ants.
“This is the greatest time to be a biologist,” Wray begins during a rare moment in his own office on the fourth floor of the French Family Science Center. Like a lot of professors’ warrens, it’s lined floor-to-ceiling with books. But there is also a large collection of toy dinosaurs and two skulls, one a heavy-browed pre-human, the other a small, crocodile-like caiman. At the level of biology Wray cares most about, all of these creatures are basically the same. “When you drill down to the molecules and the genetics, it’s a lot of the same stuff,” he says. “We use urchins on one hand and primates on the other because they allow us to see different aspects of the same problem.”
The “problem” is how a relatively small set of genes (fewer than 25,000 in both humans and chimpanzees) can produce such complex and dramatically different organisms. For that matter, why do humans, with our spectacular brains and versatile digestive abilities, have only a handful more genes than a brainless, one-millimeter-long worm that eats nothing but bacteria?
The answer that Wray and a growing number of evolutionary biologists are pursuing is that our 25,000 genes are merely sheet music: It’s how that music is played that makes us different. Wray was among the first to argue, and then to document, that evolution acts on those “players”—the sections of DNA that regulate how genes are expressed—making them key drivers of the diversity of life.
All he had to do was find them.
In 1975, when Greg Wray was still in high school, Berkeley biologists Mary-Claire King and Allan Wilson compared a sampling of proteins and short portions of DNA from humans and chimpanzees (the best they could do with contemporary techniques) and published the remarkable finding that humans and chimps appeared to be 99 percent the same on the genetic level. This immediately raised an enormous question of how we could be so alike in our nuclei, and yet so different in behavior, diet, intellect, physiology, and body hair. King and Wilson proposed that the expression of genes, their on and off patterns, would account for much of the difference. But the tools just weren’t ready to address that question.
At the beginning of this century, computerized and robotic lab technology enabled the reading of complete genomes, first for the model species like yeast, nematodes, and fruit flies, and then for pathogens, crops, humans, and chimpanzees. A curious pattern emerged: Huge sections of the DNA didn’t describe proteins, the structural elements and chemical actors that sustain life. In fact, among the 3 billion letters of the human genome, only 2 percent of the DNA was found to code for proteins. Was the rest of the genome simply rough drafts and broken bits left over from the chaos of mutation and natural selection? Surely there had to be more to the system, but it wasn’t easy to see.
Yet a big part of the answer to the puzzle of this so-called “junk DNA” was already somewhat known. Every cell that has a nucleus carries a complete copy of the genome: all of the DNA required to build and operate an organism from fertilized egg to senescence. But all of those genes can’t be active all of the time in every cell; if they were, you’d have fingernails on the palm of your hand and hair on your gums. So it was clear that genes are carefully coordinated to work only when and where they are needed (and to stay quiet and out of the way where they aren’t), a process called expression. Gene expression makes an embryo into a fetus and then an adolescent using just one set of genes.
What wasn’t quite so obvious—until it became possible to look at millions of letters of DNA pretty much all at once—was that gene expression also helped create the mosaic of species around us. Wray was one of the first biologists to argue that natural selection could shape not just the genes themselves, but also the regulatory regions that orchestrate turning genes on and off. He reasoned that evolutionary changes in these “switches” could allow similar genomes to take on a radically different appearance from one individual to the next or one species to the next.
With a million or so of these switches controlling the expression of genes in interconnected circuits and feedback loops, biology has an exquisite tool for fine-tuning the organism, Wray says. An invading army of bacteria swarming through a cut in the skin triggers a chemical signal that causes millions of the host’s cells to swing into action, cranking out legions of bug-fighting white blood cells and raising the body’s temperature. When the infection is defeated, the signal stops, the temperature drops, and the white blood cells dissipate. “The regulatory region of the DNA,” Wray says, “is like a scaffold on which different proteins come and sit down. Different combinations of these proteins turn on or off specific genes. They act as switches that regulate under which conditions a gene will be on or not.”
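Wray’s scaffold image can be put into a few lines of code. The sketch below is a toy model, not anything from his lab: it assumes a gene is on only when every one of its activator proteins is sitting on the regulatory region and no repressor is. The class name, the gene name, and the protein names are all hypothetical, chosen to echo the infection example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegulatoryRegion:
    """Toy model of a regulatory 'scaffold' (hypothetical, for illustration)."""
    gene: str
    activators: frozenset = frozenset()  # proteins that must all be bound
    repressors: frozenset = frozenset()  # proteins that block expression

    def is_on(self, bound_proteins):
        """Return True if this combination of bound proteins turns the gene on."""
        bound = set(bound_proteins)
        all_activators_present = self.activators <= bound
        any_repressor_present = bool(self.repressors & bound)
        return all_activators_present and not any_repressor_present


# Hypothetical gene that responds to an infection signal and shuts off
# again once an "all clear" protein binds.
fever_switch = RegulatoryRegion(
    gene="fever_response",
    activators=frozenset({"infection_signal"}),
    repressors=frozenset({"all_clear"}),
)

print(fever_switch.is_on({"infection_signal"}))               # infection under way
print(fever_switch.is_on({"infection_signal", "all_clear"}))  # infection defeated
```

Real regulatory regions are far less tidy than this all-or-nothing rule, since expression comes in degrees rather than a simple on or off, but the sketch shows how one stretch of DNA can give different answers under different conditions.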
To see how natural selection might bring such a system to pass, imagine a tiger’s stripes, which are produced by gene expression turning on black pigment in some hair follicles and orange or white in others. If this system of expression didn’t work, the tiger would be without her camouflage and would probably capture fewer prey as a result. That in turn would mean fewer, weaker offspring, whereupon natural selection eventually would take the stripeless tigers out of the gene pool. But this is what biologists call a “just-so” story—a narrative that seems plausible but lacks real data. Wray’s colleagues demanded proof. “The challenge was getting the right kind of data together to convince the community,” he says.
In 2007, Wray’s team finally had some. Through a broad-brush comparison of the genomes of humans, chimpanzees, and macaques, they showed for the first time that more than 500 gene promoters, switches that turn on expression, were dramatically different among the three species. The alterations were well beyond the random-chance change one would find in areas of DNA that didn’t seem to matter, and they were particularly prevalent in genes that affect key differences among these primates, such as the brain and digestion. In other words, selection could be seen in the patterns of expression.
“It’s more about the control of the sequence, not the coding,” says Courtney Babbitt, who joined Wray’s lab as a postdoctoral researcher five years ago to study sea urchins and has gone on to dissect human brains and baboon ovaries in search of gene-expression differences. “The traits that we think are more interesting seem to be selected on non-coding regions.”
The study was a statistical tour de force, completed with the help of postdoctoral fellow Ralph Haygood, a physicist and engineer who has turned his considerable math talents to biology. Haygood has since gone off to start his own company to do this kind of analysis, but Wray and collaborator Olivier Fédrigo, associate director of Duke’s genome sequencing facility, are continuing to mine the data for human-to-chimp comparisons. In October, they published a paper on a single regulatory difference that may explain why our brain is so much larger than a chimpanzee’s, while our muscles are so much weaker. The key may be glucose transporters, molecules that ferry sugar to provide energy wherever it’s needed in the body. Both chimps and humans have the same assortment of glucose-transporting proteins in our brains and muscles, but because of a difference in gene regulation, we make three times more of one transporter in our brains, and chimps make more of another transporter in their muscles. That translates to more fuel for our hungry brains and more for their hungry muscles.
Wray’s lab was also part of a 2007 paper that traced human lactose tolerance to a change in gene regulation. A single-letter alteration to a regulatory region confers the ability to make lactase, an enzyme for breaking the sugars in milk, into adulthood. “Every mammal can make it, because every mammal nurses,” Wray says. “But we’re the only mammal where individuals seem to be able to do it as an adult.” This is also a very recent evolutionary development for our species, having occurred at least four different times among our ancestors in Northern Europe, East Africa, and the Middle East as they settled down with milk-giving domestic animals.
Fine-tuning expression may be a much better way for natural selection to operate, Wray argues. For example, a well-known protein mutation gives some humans resistance to malaria, but it also produces sickle-shaped red blood cells throughout the body, a debilitating anemia. “When the coding gene is mutated, you get those side effects everywhere and at all times that the protein is produced,” Wray says. On the other hand, a different malaria-resistance mutation occurs in a regulatory region, depriving the malaria parasite of the red blood cell protein it uses as a docking site. The protein is only missing in the blood cells; everywhere else it is needed, the body produces it normally. “If you do this through regulation, you can limit those side effects to a very specific set of circumstances. And all the rest of the time, everything’s cool. This mutation is just good, good, good.”
But finding more of these brilliant innovations in the DNA won’t be easy. Regulatory regions aren’t in any predictable spots on the long ladder of DNA. They don’t have the recognizable “start” and “stop” sequences that help scientists home in on coding genes. They number “a million-ish,” Wray says, and may come in a dizzying array of shapes and sizes.
For now, regulatory regions are defined operationally, by breaking DNA into millions of pieces to see how expression patterns change. “People have spent their entire careers studying just a couple of these regulatory regions of DNA and figuring out what they’re doing. Now we’ve got a million of them,” says Greg Crawford, an assistant professor in pediatrics and Duke’s Institute for Genome Sciences & Policy who is working with Wray on pinpointing regulatory regions.
The sequencing core that Wray directs in Biological Sciences is a warm, fifteen-by-thirty-foot room filled with machines the size of dorm-room refrigerators that have names like exotic sports cars and cost more than a house in Chapel Hill. A typical “promoter bashing” experiment, which chops up DNA and tests the effect on gene expression, might generate 30 million to 50 million data points in a matter of hours. “Your laptop is not going to be able to handle this,” says Fédrigo, a compact Parisian who runs the day-to-day operations of the core. “The first thing I ask people is, ‘Do you know what you’re going to do with your data?’ ”
These next-generation sequencing machines are the children of the massive Human Genome Project, which required thirteen years and $3 billion—a dollar for each letter of DNA—to complete, but produced an array of new tools for biological research. The level of human ingenuity and brute force computing applied to this quest in the intervening decade is an evolutionary tale in itself. One of the machines in the Duke core uses a camera adapted from astronomy to pick out color variations among millions of infinitesimal spots of light on a glass slide that indicate whether a particular letter of DNA might be a C or a G. One stretch of DNA may be sampled ten or twenty times and then statistically rectified to reach a conclusion. It’s expensive and difficult, but trivial in comparison to the Human Genome Project. Fédrigo punches some numbers into his pricing spreadsheet and says that given two weeks and about $2,500, he could sequence a mouse genome, with roughly the same number of genes as humans, ten times over. “And I’m going to give you thirty to forty gigabytes of data.”
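The idea of sampling one stretch of DNA ten or twenty times and then settling on an answer can be sketched as a simple majority vote. This is only an illustration under stated assumptions: the function names and the 60 percent agreement threshold are invented here, the reads are assumed to be already lined up letter for letter, and real sequencing software weighs each call by a quality score rather than just counting votes.

```python
from collections import Counter


def consensus_base(calls, min_fraction=0.6):
    """Pick the most common letter among repeated calls at one position.

    Returns 'N' (unknown) when there are no calls or when no letter reaches
    min_fraction of them. The threshold is an assumption of this sketch.
    """
    if not calls:
        return "N"
    base, count = Counter(calls).most_common(1)[0]
    return base if count / len(calls) >= min_fraction else "N"


def consensus_sequence(reads):
    """Combine equal-length, already-aligned reads into one consensus string."""
    # zip(*reads) walks the reads column by column, one DNA position at a time.
    return "".join(consensus_base(column) for column in zip(*reads))


# Ten hypothetical calls at a single position: nine say C, one says G.
print(consensus_base("CCCCCCCCGC"))

# Three hypothetical reads of the same four-letter stretch.
print(consensus_sequence(["ACGT", "ACGT", "ACCT"]))
```

Deeper coverage makes the vote more trustworthy, which is one reason a run produces tens of gigabytes of data.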
Biologists of an earlier era built their careers on a single model species—the fruit fly, the mouse, the E. coli bacterium, the nematode worm—painstakingly dismantling one creature over and over to figure out what each piece might mean. But the speed and power of next-generation genome sequencing has “opened the door to being able to work with pretty much any organism you wanted,” Wray says. “The time needed to jump in, spin it up, and ask usable questions went from decades to months.”
This also has created a need for a new kind of biologist, one creative enough to cross not only species boundaries but technological ones as well. “We’re in a transition period right now, where the older professors don’t know enough about this, but they can train their students,” Fédrigo says. For example, after joining Wray’s lab as a postdoctoral fellow, he had to learn how to write UNIX computer code to make the machines work. “Greg Wray is on the good side. He knows enough. He has never run an analyzer, but he knows the idea of it. He can advise other people on how to do it.”
Collaboration is key in this environment, Fédrigo adds. No one person can master both the technical skills and biological wonderment. And Wray collaborates with just about everybody who can help him see evolution through gene expression. “He has this capacity of understanding things very fast and capturing what is interesting when he talks to people,” Fédrigo says. “He can grab the big picture very, very fast. It’s kind of impressive.”
“He’s fun to collaborate with,” adds Susan Alberts, the Jack H. Neely Professor of biology. She’s working with Wray on gene expression in a troop of Kenyan baboons she’s been studying for more than thirty years. “He’s easygoing, engaging, responsive. And he has great ideas.”
Some of Wray’s adaptability may flow from his unusual upbringing. The son of American missionaries stationed in the foothills of the Himalayas in India, he grew up immersed in nature. “I watched very little TV as a kid,” he says. “I could either go outside and walk around, or I could read. I did both.” Except for eighth grade in Michigan, he studied at international schools that had been British private schools. At one point, he had fifty-six classmates from twenty-five different countries—an experience that may have given him a good feel for collaboration.
After an undergraduate biology degree at the College of William & Mary, Wray came to Duke for a Ph.D. under sea-urchin expert David McClay, the Arthur S. Pearse Professor of biology, who now occupies the office next to Wray’s in the French science center. His other major influence was Fred Nijhout, who studies developmental biology in butterflies. “I was working on butterflies with Fred and sea urchins with Dave, so I guess that kind of set the pattern.”
The pattern Wray has been pursuing is that Earth’s life is really all the same, despite the differences we think we see between bread mold and a bald eagle. It’s a complex but common language of molecules. “We’re almost thinking of the genome as a kind of grammar,” Wray says. At the molecular level, life doesn’t have the crisp logic of binary code, but it isn’t random, either. It’s something in between, a tangled mess of redundant and cross-wired connections that would put Rube Goldberg to shame. Genes don’t just turn on and off, they operate at different levels, changing expression from minute to minute, from organism to organism, and even from cell to cell within the same organism. It’s chaos within parameters, and that’s what makes it so resilient and relentless, like a weed-filled lot.
“It is very hard to picture,” says Fédrigo. “We’re very binary and linear in our way of thinking, and [biology] is multidimensional. I think our human brain is not ready for that yet.”
And yet, we try. The diagrams biologists are developing to keep track of all of these interactions look for all the world like an integrated circuit or a pipefitting diagram for an oil refinery. The mechanisms of expression are a network of interlocking switches—what a logician or electrical engineer would call gates. “We know some of them wink on and off over evolutionary time, and those are probably modulating relatively small details, whereas others are probably absolutely essential,” Wray says.
The search for meaning among the interactions of a million or more switches in an organism may soon make the Human Genome Project seem trivial. But the biologists of the next generation are going to understand the origin of species in ways Charles Darwin wouldn’t have dreamed of. And perhaps they’ll be able to see human health and behavior in an entirely new way. Greg Wray’s name probably will be sprinkled liberally throughout that literature.
Grant applications and publication lists often portray a researcher’s career like a string of pearls. “Mine’s more like a charm bracelet,” Wray says, hurrying through a basement corridor of the biology building to his next meeting. “The beads don’t match.”
- Bates is director of research communications in Duke’s Office of News and Communications.
November 30, 2011