From the College of Natural Sciences

Biologist Uses Supercomputer to Study Fast Evolving Geraniums

With a grant from the NSF's Plant Genome Research Program, Bob Jansen is applying next-generation DNA sequencing methods to better understand why the geranium has evolved to be so radically different from other plants.

Geraniums are actually natural mutants, evolving many times faster than their plant peers, according to Robert Jansen, professor of integrative biology.

Behold the geranium: mainstay of the home garden. These colorful bundles of blooms are actually quite unique, evolving many times faster than their plant peers, according to Robert Jansen, professor of biology at The University of Texas at Austin.

"The degree of change in this group is off the charts," said Jansen. "It's a situation where you have a natural set of mutants."

Most mutations are caused by coding errors that occur during cell division, when the DNA unravels and is copied. More often than not, however, when an error occurs it is repaired quickly. Heritable mutations are fairly rare, but in the Geraniaceae family, they are common. Why is this the case?

Geraniums (the Geraniaceae family) are unusual for a couple of reasons. First, the chloroplast genome and the genes within it are highly rearranged in comparison to other plants. Second, the rates of change for certain gene sequences, especially some functional groups of genes, are highly elevated in both the chloroplast and mitochondrial genomes. Geraniums are one of only two plant groups known to have such mutable genomes, making this garden ornamental a model species for scientific study.

Recently, with a grant from the National Science Foundation through the Plant Genome Research Program, some of the leading geranium scholars began applying next-generation sequencing methods to better understand why the geranium has evolved to be so radically different from other plants.

"There seems to have been repeated bursts of change," Jansen said. "It may be an on-going process, but it certainly has happened at different times in different lineages within the group, so we're taking a comparative approach."

In the coming months, the scientists will sequence genomes from dozens of geranium species as well as some closely related rosids, whose evolutionary rates are normal. They will compare the genes involved in recombination and DNA repair in geraniums with those in their close relatives to identify key differences that may be causing unchecked mutations.

Intergenome Cooperation

The geranium's rapid evolution leads to another mystery. Like most plants, Geraniaceae have genomes in three separate compartments: nucleus, mitochondrion and chloroplast. The nuclear genome is large and complex, with as many as 30,000 genes and an intricate repeating structure. It does the bulk of the work in the cell. The mitochondrial and chloroplast genomes, on the other hand, are much smaller (on the order of tens of genes), with specialized functions.

It turns out that the genomes do not act independently in plants; they cooperate. In fact, several proteins made by plant cells consist of multiple subunits, each produced by a different genome within the cell. For an organism to survive, the separate genomes need to develop in tandem, or coevolve. If the gene that produces part A of a protein changed so that it could no longer bind to part B, the protein could become non-functional and the organism might die. But the geranium and its unique genetic makeup have thrived for millions of years.

To find out how this coevolution occurs, Jansen teamed with Jeff Palmer at Indiana University and Jeff Mower at the University of Nebraska to sequence all three genomes of dozens of species of geranium. The project is in year two of a five-year study.

The researchers are currently gathering sequence data and assembling and analyzing it with the assistance of the Ranger supercomputer at the Texas Advanced Computing Center (TACC). They are hopeful that the results will help explain how multiple genomes coevolve and why geraniums mutate so quickly.

New Tools Bring New Challenges

The technologies that researchers use to sequence and analyze genetic data are only a few years old and the scale of the information involved is massive. Before Jansen and collaborators could start interpreting the genomic data, they needed to determine the most efficient way to gather it.

"We first went through the literature to see what everybody thought we should do and there was absolutely no consensus," Jansen said. "Many of the aspects of the sequencing and analysis hadn't even been compared."

Basic questions needed to be answered: Which sequencing platform works best for this type of problem? Which algorithm is fastest and most accurate for assembling sequences? And how much information is needed to find significant factors in the evolution of the genome?

A recent analysis by Jansen and his colleagues explored these questions and advanced the researchers' quest for the optimal experimental setup. They found that using the Illumina HiSeq 2000 platform (a next-generation sequencer) in tandem with Trinity (a leading assembly tool) gave the most accurate and efficient results. They also determined that roughly 40% of the sequence data was needed before they reached a plateau of useful information to assemble a complete transcriptome.

"We had no idea how much data we needed and the more data you have to gather the more expensive it is," Jansen said.

They established this percentage by taking increments of a huge amount of data (about 14 billion sequence reads) from 5% up to 100%, assembling each increment, and using a reference genome to see how many more genes they found and how the coverage of each gene improved.
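
The increment-and-compare procedure described above can be illustrated with a small Python sketch. It is not the researchers' pipeline: real reads would be assembled with a tool such as Trinity and compared against a reference genome, whereas here each toy "read" is simply labeled with the gene it came from. The function names, the 95% plateau threshold, and the simulated data are all assumptions made for illustration.

```python
import random

# Data increments from 5% to 100%, mirroring the steps described in the article.
FRACTIONS = [i / 20 for i in range(1, 21)]


def toy_reads(n_genes=200, n_reads=20000, seed=1):
    """Simulate reads as gene labels with skewed abundance (hypothetical data)."""
    rng = random.Random(seed)
    weights = [1 / (rank + 1) for rank in range(n_genes)]
    return rng.choices(range(n_genes), weights=weights, k=n_reads)


def saturation_curve(reads, fractions, seed=0):
    """Shuffle once, then count distinct genes seen in each growing prefix."""
    shuffled = list(reads)
    random.Random(seed).shuffle(shuffled)
    curve = []
    for fraction in fractions:
        k = max(1, round(len(shuffled) * fraction))
        curve.append((fraction, len(set(shuffled[:k]))))
    return curve


def plateau_fraction(curve, threshold=0.95):
    """Smallest fraction whose gene count reaches `threshold` of the full-data count."""
    full_count = curve[-1][1]
    for fraction, count in curve:
        if count >= threshold * full_count:
            return fraction
    return curve[-1][0]


if __name__ == "__main__":
    curve = saturation_curve(toy_reads(), FRACTIONS)
    for fraction, count in curve:
        print(f"{fraction:.0%} of reads: {count} genes detected")
    print("plateau reached at", f"{plateau_fraction(curve):.0%}")
```

Because the increments are nested prefixes of one shuffled list, the gene count can only grow or stay flat as more data is added, which makes the plateau easy to spot. The fraction at which the toy data levels off depends entirely on the simulated abundances and says nothing about the 40% figure reported for the real data.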

Supercomputers like TACC's Ranger speed up sequence analyses by breaking the process down into small chunks and distributing them to thousands of computer processors working together. In the case of Jansen's project, Ranger also acted as a test-bed for method development, allowing the researchers to compare multiple experimental approaches to find the best one.
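
The split, distribute, and combine pattern described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how Ranger or the project's software works: the per-chunk task (counting G and C bases) is a stand-in chosen for simplicity, and the function names are hypothetical. The demo uses a thread pool so it runs anywhere; for genuinely CPU-bound work in Python, `ProcessPoolExecutor` from the same module offers the same `map` interface across multiple cores.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks contiguous, near-equal pieces."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty pieces when there are few items
            chunks.append(items[start:end])
        start = end
    return chunks


def count_gc(reads):
    """Per-chunk work: return (G+C bases, total bases) for a list of reads."""
    gc = sum(read.count("G") + read.count("C") for read in reads)
    total = sum(len(read) for read in reads)
    return gc, total


def gc_fraction(reads, executor, n_chunks=4):
    """Run count_gc on each chunk via the executor, then combine the partial results."""
    partials = list(executor.map(count_gc, split_into_chunks(reads, n_chunks)))
    gc = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    return gc / total if total else 0.0


if __name__ == "__main__":
    example_reads = ["ACGT", "GGCC", "ATAT", "GCGC", "TTAA"]
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(gc_fraction(example_reads, pool, n_chunks=3))
```

The key design point is that each chunk returns a small partial result (two integers) that can be summed afterwards, so the workers never need to communicate with each other while they run.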

"For each species that we're looking at, we get all of these DNA or RNA sequences and we have to assemble these short reads into a complete genome, or into complete transcriptomes. This takes lots of memory and space," Jansen said. "The bottom line in our case—we could not do it without TACC."

Identifying Genetic Differences

Beyond the specific evolutionary history of the geranium, the researchers hope their investigation will uncover basic facts about evolution. They speculate that the elevated rates of change in this group might have something to do with genes involved in DNA repair and recombination.

"Experimental evidence demonstrates that if you mutate the recombination genes, you can generate instability in the genome," Jansen said. "We're hoping to uncover some evidence that this phenomenon is related to those classes of genes."

Understanding how plant genomes evolve, interact with each other, and coordinate functions may seem obscure, but a general model of the division of labor within plant cells and their shared genomic functions could eventually lead to practical applications.

"We use evolution for lots of purposes agriculturally. We select for certain features in crop plants to have bigger ears of corn or bigger tomatoes," Jansen said. "If you don't understand the genes that are involved in that and how they work, it's hit or miss with regard to whether you're doing the right thing."

Written by Aaron Dubrow, Science and Technology Writer
