Structural variants in the bovine genome
Summary
DNA is the hereditary material and harbours numerous genetic variants inherited from an individual’s ancestors. A small fraction of these variants are spontaneously occurring de novo mutations, which can persist in the population’s pool of variants if they are not lost by drift. Cattle are a livestock species of high economic significance; deciphering how their genome relates to phenotype is therefore crucial. However, investigation of mutational processes and high-impact population variants in cattle has so far been limited to SNPs. As a result, more complex and less tractable variants, such as structural variants (SVs), have not been investigated in depth, even though they affect more bases than small variants. To close this gap, I analysed bovine structural variants using data from genotyping arrays and from multi-generational, deeply sequenced genomes. This thesis presents catalogues of SVs segregating in dairy cattle populations and demonstrates that SVs can have molecular and phenotypic impacts. In particular, I dissected the largest QTL for clinical mastitis and identified a 12-kb multiallelic copy number variant (CNV) as the causative variant. Furthermore, my work showed that the mutational mechanisms of SVs are likely to differ inherently from those of SNPs, highlighting the importance of a comprehensive survey of mutational processes.
In Chapter 2, I built a catalogue of CNVs from high-density SNP-array data generated in two dairy cattle breeds. I showed that CNV discovery results can vary depending on the quality of the reference genome. Exploiting the allele frequencies of the CNVs, I highlighted that some CNVs are likely differentiated between the two breeds and might have undergone recent selection. Furthermore, linkage disequilibrium between SNP-CNV pairs was generally low compared with that between SNP-SNP pairs.
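As an illustration of the SNP-CNV versus SNP-SNP comparison, the sketch below computes a common pairwise LD measure, r², as the squared Pearson correlation between per-animal dosages. This is only a minimal sketch under stated assumptions: the summary does not say which LD statistic or software was used in the thesis, the function name `r_squared` is hypothetical, and the genotype and copy-number values are invented toy data.

```python
import numpy as np


def r_squared(x, y):
    """Squared Pearson correlation between two per-animal dosage vectors.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # A monomorphic marker has zero variance, so r^2 is undefined.
    if x.std() == 0 or y.std() == 0:
        return float("nan")
    r = np.corrcoef(x, y)[0, 1]
    return float(r * r)


# Toy data for 8 animals (invented, not real genotypes).
snp_a = [0, 0, 1, 1, 2, 2, 1, 0]  # SNP dosage: copies of the alternate allele
snp_b = [0, 0, 1, 1, 2, 2, 1, 0]  # identical to snp_a, i.e. perfect LD
cnv = [2, 3, 2, 2, 3, 2, 4, 2]    # integer copy number at a CNV

snp_snp_r2 = r_squared(snp_a, snp_b)
snp_cnv_r2 = r_squared(snp_a, cnv)

print(f"SNP-SNP r^2: {snp_snp_r2:.3f}")
print(f"SNP-CNV r^2: {snp_cnv_r2:.3f}")
```

Treating copy number as a quantitative dosage keeps the same statistic usable for multiallelic CNVs, which is one reason a correlation-based r² is a convenient choice for this kind of comparison.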
Chapter 3 describes an improved SV catalogue built from deeply sequenced bovine genomes. This catalogue contains many small SVs that were not discovered in the previous catalogue (Chapter 2), many of which have sequence-resolved breakpoints. Using a direct genotyping approach, I confirmed that nearly 80% of the SVs were present in an independent cohort of animals. Using sequence-level variants (SNPs and SVs), I showed that most SVs have tagging SNPs; however, the findings differed when using SNP-array data (50K SNPs and directly genotyped CNVs). This discrepancy indicates that the variation arising from CNVs might not be fully captured by SNP-array data alone. Finally, I investigated high-impact SVs and mapped two SV-eQTL that alter gene expression.
In Chapter 4, I dissected a major QTL for clinical mastitis located on chromosome 6. By fine-mapping this region, I discovered that the lead variants lie downstream of the GC gene, within a 12-kb CNV. This CNV encompasses the 3’ alternative exon of the GC gene. By exploiting the pedigree structure in the data set, I delineated the multiallelic nature of the CNV, whose multiplicated allele underwent recent positive selection. Liver eQTL mapping showed that the CNV is a lead variant for GC expression at both the gene and transcript levels.