Publication date: 25 May 2022
University: Wageningen University
ISBN: 978-94-6447-102-1

Impact of preselection in genomic evaluations

Summary

The development of genomic evaluation models over the last two decades has resulted in more accurate estimation of breeding values of animals, compared to when only pedigree-based genetic evaluation models were used. In large animal breeding programs, selection of parents of the next generation usually takes place in multiple stages. The initial stages of selecting parents of the next generation are collectively called preselection. Preselection takes place when selection candidates are young, sometimes even before they have records for any breeding goal trait. As the selection candidates grow older, they generally get records for more breeding goal traits, and they are re-evaluated in subsequent evaluations to select the final set of parents of the next generation. The impact of preselection on the accuracy and bias of subsequent pedigree-based evaluations is known, but this is not the case for subsequent genomic evaluations. The role of genotypes from preculled animals, i.e. animals removed from the breeding program at the preselection stage, in the subsequent genomic evaluation of their preselected sibs is also poorly understood. In this thesis, I investigated the impact of preselection on accuracy and bias in the subsequent genomic evaluation of preselected animals, using single-step genomic best linear unbiased prediction (ssGBLUP) as the representative genomic evaluation model.

In Chapter 2, I used simulated datasets to investigate, for different heritabilities, the impact of types and intensities of preselection on accuracy and bias in ssGBLUP evaluation of preselected animals. A trait was simulated with heritabilities of 0.1, 0.3, and 0.5, and the types of preselection implemented were genomic, parental average, and random preselection. I implemented three intensities of preselection, ranging from no preselection to preselecting 5% of male and 12.5% of female selection candidates. Subsequent ssGBLUP evaluation of preselected animals was always performed excluding genotypes of preculled animals. I showed that preselection, regardless of its type, its intensity, and the heritability of the trait, results in accuracy loss in subsequent ssGBLUP evaluation of preselected animals, compared to a scenario without preselection. I explained that the accuracy loss is mainly due to loss of relatives with records, and/or reduction in heritability. I also showed that ssGBLUP estimates genomic estimated breeding values (GEBV) of preselected animals without preselection bias, regardless of the type and intensity of preselection and the heritability of the trait. The results of this chapter also showed that if ssGBLUP is used in subsequent evaluation of genomically preselected animals, realized genetic gain is only slightly lower compared to a scenario without preselection.
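
To give a feel for how strong the most intense preselection scenario above is, the preselected fractions can be converted into standardized selection intensities. The sketch below is not taken from the thesis: it assumes truncation selection on a normally distributed criterion in a large population, and the function name `selection_intensity` is a hypothetical helper introduced for illustration.

```python
from statistics import NormalDist


def selection_intensity(selected_fraction):
    """Standardized selection intensity under truncation selection.

    Assumption (not from the thesis): the selection criterion is normally
    distributed and the population is large, so i = z / p, where p is the
    selected fraction and z is the standard normal density at the
    truncation point.
    """
    if not 0.0 < selected_fraction < 1.0:
        raise ValueError("selected_fraction must be strictly between 0 and 1")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Truncation point that leaves `selected_fraction` in the upper tail.
    threshold = std_normal.inv_cdf(1.0 - selected_fraction)
    return std_normal.pdf(threshold) / selected_fraction


if __name__ == "__main__":
    # Fractions from the most intense scenario described in the summary.
    for label, fraction in (("males", 0.05), ("females", 0.125)):
        print(f"{label}: p = {fraction}, i = {selection_intensity(fraction):.3f}")
```

Under these assumptions, preselecting 5% of the males corresponds to an intensity of roughly 2.06 standard deviations, and preselecting 12.5% of the females to roughly 1.65.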

In Chapter 3, using part of the simulated data used in Chapter 2, I investigated the roles of genotypes and phenotypes from various groups of animals in preventing preselection bias in subsequent ssGBLUP evaluation of preselected animals. In other words, I established the minimum information required in subsequent ssGBLUP evaluation of preselected animals to estimate GEBV of genomically preselected animals without preselection bias. I showed that to prevent preselection bias it is sufficient to supply the model with i) data of the reference population used in the evaluation at preselection stage and ii) genotypes and phenotypes of the preselected animals. I also showed that genotypes of preculled animals are only needed in subsequent ssGBLUP evaluation of their genomically preselected sibs if some of their parents are not genotyped and included in the reference data.

Although in Chapter 2 I showed that ssGBLUP in subsequent evaluation estimates GEBV of preselected animals without preselection bias, there are unpublished reports that bias is observed in subsequent ssGBLUP evaluation of preselected animals in commercial breeding programs. So, in Chapters 4 and 5, I used datasets from a commercial pig breeding program to verify whether what I found using simulated datasets holds in reality as well. In Chapter 4, I investigated the impact of genomic preselection (GPS) on accuracy and bias in subsequent ssGBLUP evaluation of preselected animals, for widely-recorded traits – traits that are routinely measured on the majority of animals in a breeding population. The traits were average daily gain, backfat thickness, and loin depth. I used the full data provided by the commercial pig breeding program as control, and retrospectively implemented additional layers of GPS. I then compared the accuracy and bias of subsequent ssGBLUP evaluation after these additional layers of GPS with the accuracy and bias of ssGBLUP evaluation of the data as I received it from the commercial breeding program. Results for all traits showed only marginal loss in accuracy due to the additional layers of GPS. Bias was largely absent, and when present did not increase with more intense preselection. These results show that even in real animal breeding programs, ssGBLUP in subsequent genetic evaluation estimates GEBV of preselected animals without preselection bias. I suggested that, since the bias that was sometimes observed in subsequent ssGBLUP evaluation did not increase with more intense preselection, it was most likely caused by factors other than preselection.

It is generally known that prediction accuracy is higher, and probability of bias is lower, in genetic evaluation of animals for widely-recorded traits than for scarcely-recorded traits – traits only measured on a small proportion of animals in each generation. To verify whether ssGBLUP in subsequent evaluation is able to estimate GEBV of preselected animals without preselection bias for all categories of traits, in Chapter 5 I repeated what I did in Chapter 4, but now using a scarcely-recorded trait. The scarcely-recorded trait I used was feed intake, and it was also the target trait in this chapter. The widely-recorded traits I used in Chapter 4 were genetically correlated with feed intake, so they could be used as predictors of feed intake. So, in Chapter 5, I performed the subsequent ssGBLUP evaluation of preselected animals using records of the scarcely-recorded target trait, records of widely-recorded predictor traits, or records of both the scarcely-recorded target trait and widely-recorded predictor traits. Just like in Chapter 4, only marginal loss in accuracy due to the additional layers of GPS was observed. Bias was also largely absent, and when present, did not increase with more intense preselection. These results were observed whether records of the scarcely-recorded target trait, of the widely-recorded predictor traits, or of both were used in the subsequent ssGBLUP evaluation. These results show that even for scarcely-recorded traits, ssGBLUP in subsequent genetic evaluation estimates GEBV of preselected animals without preselection bias.

Finally, in Chapter 6, I explained and illustrated the mechanism that enables ssGBLUP and other genomic models in subsequent evaluation of preselected animals to minimize accuracy loss and bias associated with preselection, even when the information used in the subsequent genomic evaluation does not include all the information used at the preselection stage. The mechanism is that ssGBLUP uses genomic information to estimate the on-average positive Mendelian sampling terms of preselected animals. I also made inferences on subjects not directly covered in this thesis. Specifically, I discussed that even for traits with very low heritabilities, such as reproduction traits, ssGBLUP in subsequent evaluation is expected to estimate GEBV of preselected animals without preselection bias. I also discussed why ssGBLUP in subsequent evaluation should be able to estimate GEBV of preselected animals without preselection bias even if the preselection intensity is higher than what is currently implemented in commercial animal breeding programs. Finally, I recommended that commercial animal breeding programs should genotype as many young selection candidates as economically possible, so that preselection can be as accurate as possible. This will in the end ensure that loss of genetic gain as a result of preselection is minimized.
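
The statement that preselected animals have on-average positive Mendelian sampling terms can be illustrated with a small simulation. The sketch below is not the simulation used in the thesis: it assumes unrelated, unselected families, an additive genetic variance of 1 without inbreeding (so the parent average and the Mendelian sampling term each have variance 0.5), and a noisy preselection criterion. The function `simulate_preselection` and all its parameters are hypothetical.

```python
import random
import statistics


def simulate_preselection(n_families=200, family_size=20, keep_fraction=0.1,
                          criterion_noise_sd=0.5, seed=1):
    """Return (mean MS term of all candidates, mean MS term of preselected).

    Assumptions (illustrative only): additive genetic variance of 1, no
    inbreeding, unrelated families, and a preselection criterion equal to
    the true breeding value plus normally distributed noise.
    """
    rng = random.Random(seed)
    half_variance_sd = 0.5 ** 0.5  # SD of parent average and of MS term
    candidates = []
    for _ in range(n_families):
        parent_average = rng.gauss(0.0, half_variance_sd)
        for _ in range(family_size):
            mendelian_sampling = rng.gauss(0.0, half_variance_sd)
            breeding_value = parent_average + mendelian_sampling
            criterion = breeding_value + rng.gauss(0.0, criterion_noise_sd)
            candidates.append((criterion, mendelian_sampling))
    # Truncation preselection: keep the top fraction on the noisy criterion.
    candidates.sort(key=lambda c: c[0], reverse=True)
    n_keep = max(1, int(len(candidates) * keep_fraction))
    preselected = candidates[:n_keep]
    mean_all = statistics.mean(ms for _, ms in candidates)
    mean_preselected = statistics.mean(ms for _, ms in preselected)
    return mean_all, mean_preselected


if __name__ == "__main__":
    mean_all, mean_preselected = simulate_preselection()
    print(f"mean MS term, all candidates: {mean_all:.3f}")
    print(f"mean MS term, preselected:    {mean_preselected:.3f}")
```

Across all candidates the Mendelian sampling terms average close to zero, while among the preselected animals the average is clearly positive. A pedigree-only model cannot see this deviation from parent average once the information used at preselection is missing, whereas a model that uses genotypes can estimate it.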
