Browsing by Author "Speidel, Scott, advisor"

Open Access
Developing a strategy for identifying genetically important animals
(Colorado State University. Libraries, 2023) Wilson, Carrie S., author; Speidel, Scott, advisor; Enns, R. Mark, advisor; Lewis, Ronald, committee member; Mason, Esten, committee member
Livestock researchers often need to sample animals within a breed to serve as a representative sample of the breed. Identifying the most relevant animals to include in research for genotyping, building a reference population, or inclusion in a gene bank is a complex issue. A suboptimal sampling strategy can lead to biased results, the need for additional sampling, and added cost. When using public funds (e.g., federal grants or federal appropriations) or member fees (e.g., breed association funds), we have a responsibility to spend these investments efficiently, optimizing which animals are sampled before the research, genotyping, or gene banking begins. The first objective was to develop a sampling strategy to maximize the genetic diversity captured for the sampled animals. Simulated data is ideal for this type of study as there is no limitation to the testing parameters. The primary benefit of simulation with this research was the opportunity to have known genotypes for every animal in the population. Since genotypes will almost never be available for the entire population in the real world, and identifying animals to genotype may in fact be the purpose of the sampling, pedigree-based sampling methods were chosen. Sampling methods tested included optimal contribution selection (OCS) and the genetic conservation index (GCI). The OCS selects parents based on constraining their co-ancestry rather than minimizing inbreeding. GCI seeks to maximize the number of founders in an animal's pedigree. The sampling strategy developed in Objective 1 was used to identify a subset of 100, 50, and 25 animals from each breed, and the genetic diversity captured by each sampling method was assessed using both quantitative and molecular methods. AlphaSimR was used to simulate the population for sampling.
After an initial randomly mating founder population was developed, an additional 15 years of selection for phenotypic weaning weight was simulated and resulted in a fully genotyped population with 13,662 animals per year. The simulation was designed to represent a sheep population. After the sampling strategies were applied to the simulated population, they were next applied to Suffolk sheep and Simmental beef populations for further assessment of their ability to capture genetic diversity. To assess population structure based on molecular data, the Suffolk and Simmental populations were limited to genotyped animals and their ancestors. The simulated population represented a large purebred population (n=204,930) with a moderate number of markers (n=53,901). The Suffolk population represented a small population (n=1,565) with many markers (n=606,006). Lastly, the Simmental population represented a large, admixed population (n=54,790) with a moderate number of markers (n=29,449). For the second objective, the population structure of the full populations, comprised of genotyped animals, was assessed, and compared to the population structure of the animals from each sampling strategy. Each sampling strategy selected 100, 50, and 25 animals. The measure of success of capturing the genetic diversity of the population was a molecular-based measure defined by capturing the available alleles in the population. Other population structure measures included a comparison of a phenotypic trait, breeding values, inbreeding levels, heterozygosity, minor allele frequency (MAF) category classification, runs of homozygosity (ROH), Ne, and model-based population structure to visualize subpopulations. While both sampling strategies were effective at capturing the available alleles in the population, OCS was more successful than GCI when comparing the same sample size. Success of capturing alleles decreased as sample size decreased from 100 to 50 to 25. 
Overall, OCS with a sample of 100 animals (OCS 100) was the most successful at capturing the available alleles in the population, capturing 96.5, 99.3, and 99.9 percent of the alleles for the simulated, Suffolk, and Simmental populations, respectively. For a sampling strategy to be useful, it needs to be effective across a variety of species and breeds with a variety of breed histories and population sizes. The third objective was to compare the three populations evaluated in this research and compare the effectiveness of the sampling strategies across these populations. Population structure was compared for the three populations. Then, the effectiveness of OCS 100 was compared. The three populations differed in population size and the amount of admixture present. The simulated population was characterized by a large number of low frequency alleles (n=5,339) that proved difficult to capture. The Suffolk population was small and consisted of 14 distinct subpopulations. The Simmental population had high levels of heterozygosity and less distinct subpopulation structure. Despite disparate populations, OCS 100 was the most robust across the three populations, consistently capturing the highest percentage of available alleles compared to the other sampling strategies. In summary, OCS 100 was the most effective sampling strategy across three different populations. A low-cost pedigree-based sampling strategy can be used to capture the genetic diversity in a population. Researchers will need to weigh the risk of a greater loss of alleles when selecting a smaller population size. Risk could be further reduced by increasing the selected population size. Knowledge of the prevalence of low frequency alleles in the population and the value of capturing them should be considered.
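The allele-capture measure described in the abstract above can be illustrated with a minimal sketch. It assumes genotypes are coded as 0, 1, or 2 copies of an alternate allele per marker, and that an allele counts as "available" if at least one animal in the full population carries it; the abstract does not state the exact coding or definition, and the function names here are hypothetical.

```python
def alleles_present(genotypes):
    """Return the set of (marker_index, allele) pairs observed in the animals.

    genotypes: list of per-animal lists; each entry is the number of copies
    (0, 1, or 2) of the alternate allele at that marker (assumed coding).
    """
    present = set()
    for animal in genotypes:
        for marker, copies in enumerate(animal):
            if copies < 2:  # at least one reference copy is carried
                present.add((marker, "ref"))
            if copies > 0:  # at least one alternate copy is carried
                present.add((marker, "alt"))
    return present


def percent_alleles_captured(population, sample_ids):
    """Percentage of the population's available alleles found in the sample."""
    available = alleles_present(population)
    captured = alleles_present([population[i] for i in sample_ids])
    return 100.0 * len(captured & available) / len(available)


# Toy population: 4 animals, 3 markers (illustrative values only).
population = [
    [0, 1, 2],
    [0, 0, 2],
    [1, 0, 2],
    [0, 0, 2],
]
low_diversity_sample = percent_alleles_captured(population, [1, 3])
high_diversity_sample = percent_alleles_captured(population, [0, 2])
```

In this toy case the two samples are the same size but capture different shares of the available alleles, which is the kind of comparison the abstract reports between sampling strategies.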
Open Access
Genetic selection for feed intake and efficiency in beef cattle
(Colorado State University. Libraries, 2019) Culbertson, Miranda M., author; Speidel, Scott, advisor; Enns, Mark, advisor; Thomas, Milt, committee member; Engle, Terry, committee member; Frasier, Marshall, committee member
To view the abstract, please see the full text of the document.
Open Access
Genetic selection for resistance to bovine respiratory disease using pooled DNA approaches
(Colorado State University. Libraries, 2025) Boldt, Ryan J., author; Enns, R. Mark, advisor; Speidel, Scott, advisor; Keele, John, committee member; Kuehn, Larry, committee member; McDaneld, Tara, committee member; Holt, Tim, committee member
Bovine Respiratory Disease (BRD) is the costliest disease that affects the beef cattle industry. However, the only methods that are currently available to reduce the incidence of the microbial organisms (viruses and bacteria) that cause BRD are vaccination and antibiotic treatment. Examples using other species and diseases have shown that selection for resistance to disease is an effective method to reduce the economic burden of that disease on the industry. Due to the challenge of collecting phenotypes for a trait like BRD resistance, one of the best methods for selection could be genomic selection. To capture a representative sample of the commercial genetic makeup of the beef industry, samples for the study were collected from commercial harvest facilities. To reduce overall genotyping costs, samples were genotyped using a pooled DNA approach. While pooled DNA has been used previously to identify genomic regions that differentiate based on disease status, this has not been done for animals that showed symptoms of BRD during the post-weaning period. Therefore, the objectives of this research were to 1) examine different analysis techniques for pooled DNA information, and 2) identify across-breed SNP that are significant for identifying animals more likely to develop clinical signs of BRD. To investigate the first objective of the dissertation, two separate analyses were done. The first analysis evaluated the number of SNP used to calculate a genomic relationship matrix. While DNA pooling reduces the cost of genotyping by grouping samples, the cost could potentially be reduced further by using SNP chips with lower density. For the analysis, 106 pools comprised of 96 individuals each were genotyped using a high-density genomic panel that contained 777,962 SNP. To evaluate the use of lower-density SNP chips in pooled DNA analyses, subsets ranging from 500 to 770,000 SNP were sampled randomly, with 50 replications at each level.
For each level and replication, the resulting genomic relationship matrix was compared to the full relationship matrix calculated from 776,749 SNP, after individual SNP were removed for minor allele frequency <0.05. To calculate the equivalence of the matrices, the genomic relationship matrix calculated from the reduced number of SNP was multiplied by the Eigenvalues and Eigenvectors of the genomic relationship matrix formed from all SNP. After this multiplication, the variance of the Eigenvalues of the resulting matrix, standardized by the variance of the Eigenvalues of the full matrix, was calculated. The closer the resulting variance was to 0, the more nearly proportional the two matrices were considered to be. Beyond 2,000 SNP, further reductions in the Eigenvalue variance were smaller in magnitude. These results suggest that a low-density panel may be used for pooled DNA data and for calculating genomic relationship matrices. The second analysis conducted to address the first objective examined alternative analysis techniques for identifying simulated important SNP at varying levels of allelic prevalence and effect size. For the analysis, 100 random SNP across all chromosomes were selected to act as the significant SNP among the approximately 770,000 SNP available on the BovineHD chip. All SNP pooling allele frequencies (PAF) were simulated using a beta distribution. For the 100 significant SNP, the PAF were then modified based on differing levels of prevalence and the effect that the disease-causing SNP would have. Prevalence levels from 0.10 to 0.90 were simulated in increments of 0.10, and SNP effects from 0.01 to 0.50 were simulated in increments of 0.01. For each of the 450 combinations of prevalence and effect, two different models were applied to the same dataset. The first model type was a GWAS analysis that has previously been applied to this data type.
Under this model, each SNP is tested via an F-test. The dependent variable for this analysis was the PAF, and the fixed effect was a binary classification of whether a pool was a case or a control. Additionally, a relationship matrix was calculated to account for any population stratification occurring in the simulated dataset. For each F-test, a p-value was calculated. The second type of analysis conducted was a Random Forest analysis. For the Random Forest, the same number of trees, terminal node size, and number of explanatory variables to try at each node were applied to all combinations. The optimal settings were determined to be 2,000 trees, a terminal node size of 1, and 60,000 explanatory variables tried at each node. For each of the combinations, the results were ranked based on lowest p-value and highest variable importance factor for the GWAS and Random Forest analyses, respectively. From there, the top 100 most significant SNP were compared, and the number of pre-identified significant SNP within the subset was counted. Across all levels of prevalence, each model was able to identify a subset of the most significant SNP. Across all levels of prevalence, the Random Forest model started identifying significant SNP at lower levels of effect of the disease-causing allele. At low (0.10, 0.20, 0.30) and high (0.70, 0.80, 0.90) prevalence levels, the traditional GWAS model was able to identify a higher number of significant SNP at high effect levels. At moderate prevalence levels (0.40, 0.50, 0.60), the Random Forest model correctly identified a larger number of the significant SNP. To address objective two, several analyses were run that estimated SNP effects to identify informative variants for selection against development of BRDC.
For this analysis, samples were collected from three large commercial processing plants in Colorado and Nebraska. DNA samples were collected from ears when the animals were harvested. Samples for the study were collected over a four-year period. For pooling, punches were removed from each ear, and animals were sorted into either a case or control pool. Within each individual pool, 96 animals were represented. For each case, a corresponding control from the same group from the feedlot was also collected. In total, 106 pools were constructed, representing 10,176 animals across all pools with a matching case and control strategy. DNA was extracted using a Qiagen kit, and pools were sent to Neogen (Lincoln, NE) for genotyping on a Bovine SNP chip that contained approximately 770,000 individual SNP. For each SNP and each pool, a PAF was calculated. To account for population stratification in the analysis, a covariance matrix among pools was calculated from the PAF. Mixed model methodology was used to solve for effects in the model. In the first analysis, each individual SNP was examined. For each individual SNP, an F-test was performed to test for significance. Additionally, analyses were performed using SNP groups. SNP groups were formed using 100, 500, and 1,000 SNP regions. For each region, a distance matrix based on the PAF for SNP in the region was calculated. This was then used as a response variable for an ANOVA analysis. Fixed effects were the A matrix to account for population stratification as well as a 2 x 106 matrix to signify whether an animal was in a case or control pool. For all analysis types, no significant SNP were discovered. Additionally, several regions previously reported to be significantly associated with BRDC were examined. To see whether a similar signal was being detected, SNP were ranked from most significant to least significant and compared to previous results.
Among the previously reported results, there were regions on BTA16 (70-71), BTA14 (9-10), and BTA8 (63-64) that were among the top 1% of most significant SNP in the single SNP analyses. However, in the grouped SNP analyses, none of these regions were in the top 1% of significant SNP. Other regions identified in other papers were either not in the top 1% of SNP in any analysis or had p-values that were 0.85 or greater.
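The simulation design in the abstract above (prevalence by effect combinations, followed by counting how many pre-identified SNP appear among the top-ranked SNP) can be sketched as follows. This is an illustration only: the variable and function names are hypothetical, the toy scores are invented, and it does not reproduce the GWAS or Random Forest models themselves.

```python
from itertools import product

# Prevalence 0.10-0.90 in steps of 0.10 and effect 0.01-0.50 in steps of 0.01,
# built from integers to avoid floating-point accumulation error.
prevalences = [i / 10 for i in range(1, 10)]
effects = [j / 100 for j in range(1, 51)]
grid = list(product(prevalences, effects))  # one entry per simulated scenario


def count_recovered(scores, true_snps, top_n=100, higher_is_better=False):
    """Count how many truly simulated SNP fall among the top_n ranked SNP.

    scores: dict mapping SNP name -> p-value (GWAS, lower is better) or
    variable importance (Random Forest, higher is better).
    """
    ranked = sorted(scores, key=scores.get, reverse=higher_is_better)
    return len(set(ranked[:top_n]) & set(true_snps))


# Toy example with invented values; the abstract used the top 100 SNP.
true_snps = {"snp_a", "snp_d"}
p_values = {"snp_a": 0.001, "snp_b": 0.20, "snp_c": 0.03, "snp_d": 0.50}
importance = {"snp_a": 0.9, "snp_b": 0.1, "snp_c": 0.2, "snp_d": 0.8}

gwas_hits = count_recovered(p_values, true_snps, top_n=2)
forest_hits = count_recovered(importance, true_snps, top_n=2, higher_is_better=True)
```

Ranking p-values in ascending order and importance scores in descending order puts both methods on the same "top N" footing, which is how the abstract describes comparing them.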
Open Access
Identifying single nucleotide polymorphisms associated with beef cattle terrain-use in the western United States
(Colorado State University. Libraries, 2019) Pierce, Courtney F., author; Thomas, Milton, advisor; Speidel, Scott, advisor; Coleman, Stephen, committee member; Enns, R. Mark, committee member; Meiman, Paul, committee member
Beef cattle are drawn to areas with gentle terrain, which may result in heavy grazing near riparian zones and minimal grazing on rugged terrain. Traditional management tools to improve grazing distribution can be costly; therefore, genomic selection has been proposed as a means of improving beef cattle grazing patterns. The objective of this thesis was to identify single nucleotide polymorphisms (SNP) associated with beef cattle terrain-use in the western U.S. Variant detection using RNA-sequencing data obtained from Angus cardiovascular tissues and Brangus reproductive tissues revealed 48 potential causative mutations in five genes that were previously associated with terrain-use indices: SDHAF3, RUSC2, SUPT20H, MAML3, and GRM5. In an additional study, Bayesian multiple-regression was performed using BovineHD genotypes and global positioning system (GPS) data collected from 80 beef cows managed in Arizona, Montana, and New Mexico. Results of this analysis suggested that beef cattle terrain-use was polygenic; however, additional observations were needed to validate the quantitative trait loci (QTL) identified. Subsequent genome-wide association studies (GWAS) were performed for six terrain-use traits using BovineSNP50 genotypes and distribution data collected from a multi-breed population of cattle (n = 330) managed in the western U.S. These analyses identified 32 QTL and 29 putative candidate genes with diverse functions related to hypoxia, heat stress, feed efficiency, weight traits, energy metabolism, and lactation. In conclusion, results presented in this thesis suggested that terrain-use is polygenic and may be improved with genetic selection; however, additional studies are needed to further elucidate the genetic mechanisms underlying terrain-use of beef cattle.