The Dance Steps of Genomics Part I: Understanding Genomic Prediction
by Jared Decker, Ph.D., University of Missouri Extension
Whenever I hear the words “two-step” my mind goes back to my teenage years when I danced with cute girls at the county fair. But in this article we will be discussing a different type of two-step. There are two common methods, referred to as two-step and one-step (also called single-step), for including genomic (i.e. DNA) information in the genetic evaluations we use to estimate expected progeny differences (EPDs). Two-step genomic predictions use the genomic and pedigree information separately in two different prediction steps. In one-step genomic predictions, genomic and pedigree information are combined and used in a single step.
DNA INFORMATION IN BEEF BREEDING
Information from DNA testing has been used for a long time in beef breeding. For several decades, DNA testing has been used to verify parentage information. This is a yes or no answer of whether the parents reported by the beef breeder are in fact the calf’s parents. If the parents do not match, this DNA data can often be used to identify the true parents.
In 2010, DNA testing began to accomplish much more for beef breeders: it was used to create genomic predictions. Genomic predictions are much like other measures of genetic merit (i.e. EPDs) in that they predict the value of an animal as a parent of the next generation, but they use DNA data to do so. Genomic predictions are combined with traditional EPDs to produce genomic-enhanced EPDs (GE-EPDs). GE-EPDs are more reliable than traditional EPDs, especially for young animals with little data available.
The first approach to using genomic information in the beef industry is called the two-step approach. In this method, thousands of animals with both phenotypes (trait records) and genotypes (DNA data) are used to “train” a genomic prediction. In this training process, the effects of thousands of DNA variants are estimated. The genomic prediction (molecular breeding value) is then the sum of the effects for the DNA variants an animal inherited from its sire and dam. This genomic prediction is then used as a correlated or indicator trait to increase the accuracy of the EPD. An indicator trait is an indirect measure of an economically important trait; some examples include ultrasound data as an indicator of carcass merit or birth weights as an indicator of calving ease. In two-step GE-EPDs, the genomic prediction is an indicator of an animal’s true EPD. In this approach, the pedigree EPD and the genomic prediction are calculated separately and then later combined to produce a GE-EPD.
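As a rough illustration of the "sum of the effects" idea, here is a minimal Python sketch. The marker effects and the calf's genotype are made-up numbers for illustration only, and a real evaluation uses thousands of markers rather than four. Each genotype is coded as the number of copies (0, 1, or 2) of a given DNA variant the animal carries.

```python
def molecular_breeding_value(genotype, marker_effects):
    """Sum the estimated effect of each DNA variant an animal carries.

    genotype: copies (0, 1, or 2) of each variant the animal inherited.
    marker_effects: effect per copy, estimated in the training step.
    """
    if len(genotype) != len(marker_effects):
        raise ValueError("need one effect per genotyped marker")
    return sum(copies * effect for copies, effect in zip(genotype, marker_effects))


# Hypothetical effects (in trait units per copy) from a training population.
marker_effects = [0.8, -0.3, 0.5, 0.1]
# Hypothetical calf: 2 copies of variant 1, 1 copy of variant 2, and so on.
calf_genotype = [2, 1, 0, 1]

mbv = molecular_breeding_value(calf_genotype, marker_effects)
print(f"Molecular breeding value: {mbv:.2f}")
```

The sketch stops at the first step. How the molecular breeding value is then blended with the pedigree EPD is a separate calculation that is not shown here.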
There are a few limitations with two-step GE-EPDs. First of all, thousands of animals with phenotypes and genotypes are needed to train the genomic prediction. Second, there needs to be a second set of animals with phenotypes and genotypes to validate the genomic prediction and estimate how much of the genetic variance it is explaining. Third, an appropriate method to blend the genomic prediction with the pedigree EPD is needed.
The approach used by the dairy industry, and later adopted by several beef breeds, is called single-step. In single-step GE-EPDs, the pedigree and genomic information are combined in a single step to calculate GE-EPDs.
The process of estimating EPDs is often seen as an unknown “black box” by many beef producers. And in many regards, this is completely acceptable. I carry a little black box, a smartphone, around in my pocket constantly. I use it to talk to people to ask and answer questions. I use it to look up information on bulls or calculate breeding dates and due dates. I use it to share or read information on the internet. But I have no idea how a cell phone works. What I do know is that my cell phone works. If you take nothing else away from this article, please know that EPDs do work.
To better understand single-step, let’s peek inside the black box of EPD calculations. Using pedigree information, we estimate the relatedness between all of the animals in our genetic evaluation. Using the degree of relatedness between animals, we can separate the variation due to genetics from the variation due to different environments. This allows us to estimate the genetic merit of an animal. With the genetic merit (EPD) of the animal, we can make more accurate selection decisions based on inherited genetics. There is nothing tricky about estimating EPDs. It is simply using two forms of information, trait records and pedigree information, to separate performance into genetics and environment.
We know that animals get 50 percent of their chromosomes (strings of DNA) from their father, or sire, and 50 percent of their chromosomes from their mother, or dam. So, a calf is related 50 percent to its sire and 50 percent to its dam. If we look at hundreds or thousands of calves, on average they share 25 percent of their DNA with each grandparent. The grandparent passes 50 percent of its DNA to the parent, and the parent passes 50 percent of its DNA to the calf. Fifty percent of 50 percent is 25 percent. So, in pedigree estimates of EPDs, all calves are assigned a relationship of 25 percent with each of their grandparents. But for an individual animal, the actual relationship with a grandparent can be quite different from 25 percent. There is a bell-shaped curve around the average of 25 percent. Some animals share less than 25 percent with a particular grandparent. Some animals share more than 25 percent with a particular grandparent.
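To see where the bell-shaped curve comes from, here is a small Python simulation. It is a deliberately simplified sketch, not a model of real chromosomes: it assumes the DNA a parent passes on is made up of 60 independently inherited segments (a hypothetical number chosen for illustration), each of which traces back to one particular grandparent with a 50 percent chance.

```python
import random
import statistics


def grandparent_share(n_segments, rng):
    """Fraction of a calf's DNA that traces to one particular grandparent."""
    # Each segment the parent passes on came from this grandparent
    # with probability 1/2 (the random shuffle of chromosomes).
    from_grandparent = sum(rng.random() < 0.5 for _ in range(n_segments))
    # The parent supplies only half of the calf's DNA, hence the 0.5.
    return 0.5 * from_grandparent / n_segments


rng = random.Random(2024)  # fixed seed so the run is repeatable
N_SEGMENTS = 60            # hypothetical number of independent segments
shares = [grandparent_share(N_SEGMENTS, rng) for _ in range(5000)]

print(f"average share: {statistics.mean(shares):.3f}")
print(f"smallest share: {min(shares):.3f}")
print(f"largest share: {max(shares):.3f}")
```

The average across the simulated calves lands close to 25 percent, the value a pedigree assumes for every calf, while individual calves fall on either side of it.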
The beauty of single-step genomic prediction is that we use the DNA information to measure the actual amount of relationship between the animals in our evaluation. Because we are no longer using expected averages, but are instead using realized amounts, the GE-EPD becomes more precise than a pedigree EPD. Rather than waiting for progeny records to help us figure out the random shuffle of genes and chromosomes an animal inherited from its sire and dam, we can use DNA information to figure out the random sample of DNA the animal inherited. In single-step GE-EPDs, pedigree and genomic information are combined to estimate the relatedness of the animals in our evaluation.
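For readers who want to see how relatedness can be measured from DNA, the Python sketch below uses one widely used formulation (often attributed to VanRaden). This is an assumption made for illustration: the article does not say which formula any particular breed association uses, and the four animals and six markers are made-up data. The sketch covers only the DNA side of the calculation, not how genomic and pedigree relationships are combined.

```python
def genomic_relationships(genotypes):
    """Relatedness between animals measured from marker genotypes.

    genotypes: one list per animal, holding the copies (0, 1, or 2)
    of a chosen allele at each marker.
    """
    n_animals = len(genotypes)
    n_markers = len(genotypes[0])
    # Allele frequency at each marker, estimated from these animals.
    freqs = [
        sum(animal[m] for animal in genotypes) / (2 * n_animals)
        for m in range(n_markers)
    ]
    # Center each genotype on its expected value, 2 * frequency.
    centered = [
        [animal[m] - 2 * freqs[m] for m in range(n_markers)]
        for animal in genotypes
    ]
    # Scaling puts the values on a scale comparable to pedigree relationships.
    scale = 2 * sum(p * (1 - p) for p in freqs)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j])) / scale
            for j in range(n_animals)
        ]
        for i in range(n_animals)
    ]


# Hypothetical genotypes: 4 animals, 6 markers.
toy_genotypes = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 1, 2],
    [2, 1, 0, 1, 2, 0],
    [1, 2, 1, 0, 1, 1],
]

G = genomic_relationships(toy_genotypes)
for row in G:
    print([round(value, 2) for value in row])
```

In this toy data the first two animals carry nearly identical genotypes, so their relationship comes out higher than the relationship between the first and third animals, which carry opposite genotypes at most markers.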
Single-step also has limitations. First, we have to figure out how to express the relatedness measured by pedigree data and the relatedness measured by DNA data on the same base. Second, original single-step methods weighted each DNA variant equally, but new single-step methods are being developed to give more weight to important DNA markers. But, single-step approaches do not require a training and validation data set and all the available information is used simultaneously.
Genomic predictions are simply more data added to genetic evaluations. Whenever we add more data to a genetic evaluation, the EPD accuracy improves. GE-EPDs provide an easy way to quickly improve the EPD accuracy especially for young animals.
LEARN THE STEPS
In the next article in this series, we will discuss steps breeders can take to use GE-EPDs in their herds. Brangus breeders should be excited to use this technology to select and breed better cattle. Enjoy the dance!