Browsing by Author "Cooley, Daniel, committee member"
Now showing 1 - 18 of 18
Item Open Access Alpine wind speed and blowing snow trend identification and analysis (Colorado State University. Libraries, 2012) Fuller, Jamie D., author; Laituri, Melinda, advisor; Cooley, Daniel, committee member; Doesken, Nolan, committee member; Elder, Kevin, committee member
The substantial quantity of climate change related analyses has resulted in increased research efforts concerning temporal wind speed trends. A change in wind speeds over time could have a widespread effect on snow transport and distribution in alpine regions. Since alpine meteorological stations are sparsely distributed, this research explored the North American Regional Reanalysis (NARR) to assess long-term trends of atmospheric conditions affecting snow transport with greater spatial coverage. NARR is a consistent, continuous, and long-term dataset spanning the extent of North America on a 32 km grid. NARR data were compared to two alpine sites (Niwot Ridge, Colorado and Glacier Lakes Ecological Experiments Station, Wyoming) from 1989 to 2009. Multiple analyses were conducted to evaluate dataset agreement and temporal trends of alpine climatic conditions at the annual, seasonal, and daily scales. The correlation of temperature, precipitation, and wind speed between NARR and alpine in situ datasets showed the temperature data to be correlated, but wind and precipitation lacked agreement. NARR wind speeds were systematically lower than observations at both locations, but the frequency of wind events was captured. Thus, to assess blowing snow dynamics more accurately using NARR, additional methods would be needed to relate the lower wind speed values to the extent of blowing snow. Trend analyses of the wind speed datasets at each temporal scale (annual, seasonal, and daily) showed slight trends with minimal significance, and trends were not significantly different between NARR and in situ data; this statistical similarity held even for trends with opposite signs and slopes and was a consequence of the weak trends. Additional blowing snow analyses were conducted using temperature, wind speed, and precipitation to estimate probable blowing snow events. The low agreement between NARR and observational data for the wind speed and precipitation parameters prohibited the use of NARR to assess blowing snow processes and expand spatial and temporal coverage.

Item Open Access Applications of least squares penalized spline density estimator (Colorado State University. Libraries, 2024) Jing, Hanxiao, author; Meyer, Mary, advisor; Cooley, Daniel, committee member; Kokoszka, Piotr, committee member; Berger, Joshua, committee member
The spline-based method stands as one of the most common nonparametric approaches. The work in this dissertation explores three applications of the least squares penalized spline density estimator. Firstly, we present a novel hypothesis test against the unimodality of density functions, based on unimodal and bimodal estimates of the density function using penalized splines. The test statistic is the difference in the least-squares criterion between these fits. The distribution of the test statistic under the null hypothesis is estimated via simulated data sets from the unimodal fit. Large sample theory is derived, and simulation studies are conducted to compare its performance with other common methods across various scenarios, alongside a real-world application involving neurotransmission data from guinea pig brains.
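
The simulated-null logic of this test can be illustrated in a few lines. In the sketch below (an illustration only), the penalized-spline unimodal and bimodal fits are replaced by a single Gaussian and a two-component Gaussian mixture, and a log-likelihood gap stands in for the least-squares criterion:

```python
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

def fit_stat(x):
    """Gain in fit of a bimodal model over a unimodal one (test statistic)."""
    ll_uni = stats.norm(x.mean(), x.std()).logpdf(x).sum()
    gmm = GaussianMixture(n_components=2, n_init=5).fit(x.reshape(-1, 1))
    ll_bi = gmm.score(x.reshape(-1, 1)) * len(x)
    return ll_bi - ll_uni

def unimodality_test(x, n_sim=200, seed=0):
    rng = np.random.default_rng(seed)
    t_obs = fit_stat(x)
    # Null distribution: resimulate data sets from the fitted unimodal model.
    null = np.array([fit_stat(rng.normal(x.mean(), x.std(), x.size))
                     for _ in range(n_sim)])
    return np.mean(null >= t_obs)  # Monte Carlo p-value

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 1, 150), rng.normal(2, 1, 150)])
print(unimodality_test(x))  # small p-value -> evidence against unimodality
```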
Secondly, we tackle the deconvolution density estimation problem, introducing the penalized splines deconvolution estimator. Building upon results for piecewise constant splines, we achieve a cube-root convergence rate for piecewise quadratic splines under uniform errors. Moreover, we derive large sample theories for the penalized spline estimator and the constrained spline estimator. Simulation studies illustrate the competitive performance of our estimators compared to kernel estimators across diverse scenarios. Lastly, drawing inspiration from the preceding applications, we develop a hypothesis test to discern whether the underlying density is unimodal or multimodal, given data with measurement error. Under the assumption of uniform errors, we introduce the test and derive the test statistic. Simulations are conducted to show the performance of the proposed test under different conditions.

Item Open Access Characterizing the influence of anthropogenic emissions and transport variability on sulfate aerosol concentrations at Mauna Loa Observatory (Colorado State University. Libraries, 2013) Potter, Lauren E., author; Kreidenweis, Sonia, advisor; Maloney, Eric, committee member; Farmer, Delphine, committee member; Cooley, Daniel, committee member
Sulfate aerosol in the atmosphere has substantial impacts on human health and environmental quality. Most notably, atmospheric sulfate has the potential to modify the earth's climate system through both direct and indirect radiative forcing mechanisms (Meehl et al., 2007). Emissions of sulfur dioxide, the primary precursor of sulfate aerosol, are now globally dominated by anthropogenic sources as a result of widespread fossil fuel combustion. Economic development in Asian countries since 1990 has contributed considerably to atmospheric sulfur loading, particularly in China, which currently emits approximately one-third of global anthropogenic SO₂ (Klimont et al., 2013). Observational and modeling studies have confirmed that anthropogenic pollutants from Asian sources can be transported long distances, with important implications for future air quality and global climate change. Located in the remote Pacific Ocean (19.54°N, 155.58°W) at an elevation of 3.4 kilometers above sea level, Mauna Loa Observatory (MLO) is an ideal site for ground-based, free-tropospheric observations and is well situated to experience influence from springtime Asian outflow. This study makes use of a 14-year data set of aerosol ionic composition obtained at MLO by the University of Hawaii at Manoa. Daily filter samples of total aerosol concentrations were collected during nighttime downslope (free-tropospheric) transport conditions from 1995 to 2008 and were analyzed for aerosol-phase concentrations of the following species: nitrate (NO₃⁻), sulfate (SO₄²⁻), methanesulfonate (MSA), chloride (Cl⁻), oxalate, sodium (Na⁺), ammonium (NH₄⁺), potassium (K⁺), magnesium (Mg²⁺), and calcium (Ca²⁺). An understanding of the factors controlling seasonal and interannual variations in aerosol speciation and concentrations at this site is complicated by the relatively short lifetimes of aerosols compared with the greenhouse gases that have also been sampled over long time periods at MLO. The aerosol filter data were supplemented with observations of gaseous radon (Rn-222) and carbon monoxide (CO), used as tracers of long-distance continental influence.
Our study applied trajectory analysis and multiple linear regression to interpret the relative roles of aerosol precursor emissions and large-scale transport characteristics in the observed MLO sulfate aerosol variability. We conclude that the sulfate aerosol observed at MLO likely originated from a combination of anthropogenic, volcanic, and biogenic sources that varied seasonally and from year to year. Analysis of chemical continental tracer concentrations and HYSPLIT back trajectories suggests that non-negligible long-distance influence from either the Asian or North American continents can be detected at MLO during all seasons, although large interannual variability was observed. Possible influences of circulation changes in the Pacific Basin related to the El Niño-Southern Oscillation were found to be both species- and season-dependent. We further found an increasing trend in monthly mean sulfate aerosol concentrations at MLO of 4.8% (7.3 ng m⁻³) per year during 1995-2008, significant at the 95% confidence level. Multiple linear regression results suggest that the observed trend in sulfate concentrations at MLO cannot reasonably be explained by variations in meteorology and transport efficiency alone. An increasing sulfate trend of 5.8 ng m⁻³ per year, statistically significant at the 90% confidence level, was found to be associated with the variable representing East Asian SO₂ emissions. The results of this study provide evidence that MLO sulfate aerosol observations during 1995-2008 reflect, in part, recent trends in anthropogenic SO₂ emissions superimposed on the natural meteorological variability affecting transport efficiency.

Item Open Access CSU-MLP GEFS day-1 "first-guess" excessive rainfall forecasts: aggregate evaluation and synoptic regimes of best- and worst-performing forecasts (Colorado State University. Libraries, 2022) Escobedo, Jacob A., author; Schumacher, Russ, advisor; van den Heever, Susan, committee member; Cooley, Daniel, committee member
Forecasting excessive rainfall, particularly flash flood-producing rainfall, is an important problem that remains difficult because of the small spatial scales and varying temporal scales at which such events occur. One important operational product that highlights areas of potential excessive rainfall and flash flooding is the Excessive Rainfall Outlook (ERO) issued by the NOAA Weather Prediction Center (WPC), which provides outlooks at lead times of 1-3 days. To give WPC forecasters an additional tool when forming a given ERO, the Colorado State University Machine Learning Probabilities (CSU-MLP) system, a probabilistic forecast system for excessive rainfall (and other convective hazards), was developed to produce forecasts that can be used as a "first-guess" ERO. CSU-MLP employs a random forest (RF) algorithm trained on NOAA's Second-Generation Global Ensemble Forecast System Reforecast (GEFS/R) and precipitation observations, and it applies the trained model to the operational GEFS to produce real-time forecasts. Initially developed as medium-range guidance (2-3-day lead times), the CSU-MLP has produced day-1 forecasts that were evaluated favorably during the 4-week Flash Flood and Intense Rainfall Experiment (FFaIR) in the summer of 2020. However, the daily skill of CSU-MLP day-1 forecasts has been observed to vary widely from day to day.
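
As an illustration of the general approach (not the CSU-MLP code itself), a random forest can be trained on ensemble-derived predictors to emit event probabilities; the predictors and labels below are synthetic stand-ins for GEFS/R fields and observed exceedances:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
X_train = rng.normal(size=(5000, 8))           # e.g., PWAT, CAPE, QPF, shear, ...
y_train = (X_train[:, :3].sum(axis=1) + rng.normal(size=5000)) > 2.5

rf = RandomForestClassifier(n_estimators=300, min_samples_leaf=20,
                            random_state=0).fit(X_train, y_train)
X_today = rng.normal(size=(100, 8))            # grid points for the day-1 valid period
p_excessive = rf.predict_proba(X_today)[:, 1]  # event probabilities per grid point
print(p_excessive[:5])
```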
This work includes an aggregate evaluation of CSU-MLP day-1 forecasts over a longer period of study (3 March 2019 - 15 October 2020) than that of FFaIR, and an identification of the synoptic regimes in which these forecasts tend to perform at their best and worst. Results show that CSU-MLP day-1 forecasts are reliable, provide adequate discrimination of excessive rainfall events (AuROC = 0.819), and have performance comparable, as evaluated by the Brier skill score (BSS), to that of the ERO (CSU-MLP BSS = 0.081; ERO BSS = 0.085). However, CSU-MLP forecasts have a higher frequency of categorical probabilities (≥ 0.05), which results in larger variations in daily BSS. Synoptic regimes of the best-performing daily forecasts tend to be characterized by moderate to strong large-scale forcing and relatively high low-level and column moisture. These include warm-season regimes with moderate-amplitude upper-level troughs, tropical cyclones, and cut-off lows, and cool-season regimes in which strong forcing is co-located with an abundant moisture source. Forecasts tend to perform worst when there is strong large-scale forcing but low-level and column moisture is relatively low, such as cool-season regimes with large-amplitude troughs and surface cyclones in which atmospheric moisture is neither abundant nor widespread. This work has implications for WPC forecasters as they use the "first-guess" forecasts while developing the ERO for a given day, as well as for future iterations and designs of the CSU-MLP system.

Item Open Access Estimating the likelihood of significant climate change with the NCAR 40-member ensemble (Colorado State University. Libraries, 2014) Foust, William Eliott, author; Thompson, David, advisor; Randall, David, committee member; Barnes, Elizabeth, committee member; Cooley, Daniel, committee member
Increasing greenhouse gas concentrations are changing the radiative forcing on the climate system, and this forcing will be the key driver of climate change over the 21st century. One of the most pressing questions associated with climate change is whether certain aspects of the climate system will change significantly. Climate ensembles are often used to estimate the probability of significant climate change, but they struggle to produce accurate estimates because they can require more realizations than is feasible to produce. Additionally, the ensemble mean indicates how the climate will respond to an external forcing, but because it filters out the variability, it cannot by itself determine whether the response is significant. In this study, the NCAR CCSM 40-member ensemble and a lag-1 autoregressive (AR1) model are used to estimate the likelihood that climate trends will be significant. The AR1 model yields an analytic solution for the distribution of trends that would result if the NCAR model were run an infinite number of times, and this analytic solution is used to assess the significance of future climate trends. The results of this study demonstrate that an AR1 model can aid in making a probabilistic forecast.
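
The spirit of the AR1 benchmark can be shown with a Monte Carlo stand-in for the analytic solution (all parameters below are illustrative assumptions, not CESM values):

```python
import numpy as np

def ar1_trend_dist(phi, sigma, n_years, n_sim=10000, seed=1):
    """Null distribution of least-squares trends in AR(1) noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_years)
    trends = np.empty(n_sim)
    for i in range(n_sim):
        x = np.zeros(n_years)
        eps = rng.normal(0, sigma, n_years)
        for k in range(1, n_years):
            x[k] = phi * x[k - 1] + eps[k]
        trends[i] = np.polyfit(t, x, 1)[0]
    return trends

null = ar1_trend_dist(phi=0.6, sigma=0.5, n_years=50)
obs_trend = 0.03  # e.g., a forced trend from the ensemble mean (units per year)
p = np.mean(np.abs(null) >= abs(obs_trend))  # two-sided Monte Carlo p-value
print(p)
```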
Additionally, the results give insight into the certainty of trends in the surface temperature field, the precipitation field, and the atmospheric circulation; the probability of climate trends being significant; and whether the significance of climate trends depends on internal variability or anthropogenic forcing.

Item Open Access Evaluation of OCO-2 small-scale XCO2 variability using lidar retrievals from the ACT-America flight campaign (Colorado State University. Libraries, 2018) Bell, Emily, author; Kummerow, Christian, advisor; O'Dell, Christopher, advisor; Denning, Scott, committee member; Cooley, Daniel, committee member
With eight 1.25 × 3 kilometer footprints across its swath and nearly 1 million observations of column-mean carbon dioxide concentration (XCO₂) per day, the Orbiting Carbon Observatory (OCO-2) presents exciting possibilities for monitoring the global carbon cycle, including the detection of small-scale column CO₂ variations. While the global OCO-2 dataset has been shown to be quite robust, and case studies have demonstrated successful observation of CO₂ plumes from power plants and cities, the validation of XCO₂ gradients on small spatial scales remains challenging: ground-based measurements, while extremely precise, are sparsely scattered and often geographically stationary. In this work, we investigate the use of an integrated path differential absorption (IPDA) lidar as a source for OCO-2 small-scale validation. As part of NASA's ACT-America project, several campaigns over North America have included a number of direct underflights of OCO-2 tracks with the Multi-Functional Fiber Laser Lidar (MFLL), as well as a set of in situ instruments, to provide a precisely collocated, high-resolution validation dataset. We explore the challenges involved in comparing the MFLL and OCO-2 datasets, from instrument principles to retrieval differences, and develop a method of correcting for some of these differences. After nine underflights, a combination of lidar data and a novel in situ-derived CO₂ "curtain" have helped us identify systematic spurious small-scale features in the OCO-2 dataset due to both surface and cloud effects. We show that although real XCO₂ features on scales of tens of kilometers remain challenging to observe and validate, the lidar and OCO-2 generally exhibit comparable spatial gradients on synoptic scales.

Item Open Access Heavy tail analysis for functional and internet anomaly data (Colorado State University. Libraries, 2021) Kim, Mihyun, author; Kokoszka, Piotr, advisor; Cooley, Daniel, committee member; Meyer, Mary, committee member; Pinaud, Olivier, committee member
This dissertation is concerned with the asymptotic theory of statistical tools used in extreme value analysis of functional data and internet anomaly data. More specifically, we study four problems associated with analyzing the tail behavior of functional principal component scores in functional data and the interarrival times of internet traffic anomalies, which are available only with a round-off error. The first problem we consider is the estimation of the tail index of scores in functional data. We employ the Hill estimator for tail index estimation and derive conditions under which the Hill estimator computed from the sample scores is consistent for the tail index of the unobservable population scores.
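
The Hill estimator used throughout this work has a standard closed form; a textbook implementation is sketched below (applied here to synthetic Pareto data rather than functional scores):

```python
import numpy as np

def hill_estimator(x, k):
    """Tail-index estimate from the k largest order statistics of a positive sample."""
    xs = np.sort(x)[::-1]                  # descending order statistics
    logs = np.log(xs[:k]) - np.log(xs[k])  # log-spacings above the (k+1)-th largest
    return 1.0 / logs.mean()               # estimated tail index

rng = np.random.default_rng(0)
pareto = rng.pareto(2.5, size=5000) + 1.0  # true tail index alpha = 2.5
print(hill_estimator(pareto, k=200))       # roughly 2.5
```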
The second problem studies the dependence between extremal values of functional scores using the extremal dependence measure (EDM). After extending the EDM, defined for positive bivariate observations, to multivariate observations, we study conditions guaranteeing that a suitable estimator of the EDM based on these scores converges to the population EDM and is asymptotically normal. The third and fourth problems investigate the asymptotic and finite-sample behavior of the Hill estimator applied to heavy-tailed data contaminated by errors. For the third, we show that for time series models often used in practice, whose non-contaminated marginal distributions are regularly varying, the Hill estimator is consistent. For the fourth, we formulate conditions on the errors under which the Hill and harmonic moment estimators applied to i.i.d. data continue to be asymptotically normal. The results of large and finite sample investigations are applied to internet anomaly data.

Item Open Access Interactions between the Madden-Julian oscillation and mesoscale to global scale phenomena (Colorado State University. Libraries, 2019) Toms, Benjamin A., author; van den Heever, Susan C., advisor; Barnes, Elizabeth A., committee member; Maloney, Eric D., committee member; Cooley, Daniel, committee member
The Madden-Julian Oscillation (MJO) influences and interacts with atmospheric phenomena across the globe, from the tropics to the poles. In this two-part study, the interactions of the MJO with other phenomena across a broad range of scales are considered, including mesoscale convective structures within the tropics and global teleconnection patterns. While the two studies are distinct in the scales of the interactions they discuss, each highlights an aspect of the importance of interactions between the MJO and variability across a broad range of scales within the climate system. The study of such cross-scale interactions is important for understanding our climate system, as these interactions can transfer energy between phenomena of starkly different spatial and temporal scales. Part one of the study uses a cloud-resolving model, the Regional Atmospheric Modeling System, to consider the relationship between mesoscale convective structures within the Indo-Pacific region and the regional, intraseasonal anomalies associated with the MJO. The simulation captures the entirety of a canonical boreal summertime MJO event, spanning 45 days in July and August of 2016, during which the convective anomaly associated with the MJO propagated over the Maritime Continent. The convective cloud structures, or cells, within the simulation were tracked and logged according to their location relative to the regional convective anomaly of the MJO. Using both spectral analysis and phase compositing, it was found that a progressive relationship exists between the boreal summertime MJO and mesoscale deep convective structures within the Indo-Pacific region, specifically within the convectively enhanced region of the MJO: increased cell longevity in the initial phases of the MJO, followed by increased cell number in the intermediate phases, progressing into increased cell expanse in the terminal phases. This progressive relationship is connected back to the low-frequency atmospheric response to the MJO. It is suggested that the bulk thermodynamic and kinematic anomalies of the MJO are closely related to convective cell expanse and longevity, although the number of convective cells appears to be tied to another source of variability not identified within this study.
These findings emphasize that while the MJO is commonly defined as an intraseasonal-scale convective anomaly, it is also intrinsically tied to the mesoscale variability of the convective systems that constitute its existence. The second part of the study quantifies the prevalence of the MJO within the overall climate system, along with the dependence of its teleconnections on variability in another tropical phenomenon on a larger scale than itself. It is well known that the MJO exhibits pronounced seasonality in its tropical and global signature, and recent research has suggested that its tropical structure also depends on the state of the Quasi-Biennial Oscillation (QBO). We therefore first quantify the relationship between 300-mb geopotential anomalies and the MJO across the globe, then test the dependence of the relationship on both the meteorological season and the QBO phase using a derivative of cross-spectral analysis, the magnitude-squared coherence (Coh²). It is found that the global upper-tropospheric signature of the MJO exhibits pronounced seasonality, but also that the QBO significantly modulates the upper-tropospheric tropical and extratropical anomalies associated with the MJO. Globally, variability in upper-tropospheric geopotential linked to the MJO is maximized during the boreal summertime and wintertime of easterly QBO phases, which is consistent with previous research showing that easterly QBO phases enhance the persistence of tropical convection associated with the MJO. Additional features are identified, such as the global maximum in upper-tropospheric variability associated with the MJO occurring during boreal summertime rather than boreal wintertime. Overall, the MJO explains seven to thirteen percent of intraseasonal atmospheric variability in 300-mb geopotential, depending on season and QBO phase. These results highlight the importance of considering the phase of the QBO in analyses of either global or local impacts of the MJO, along with the importance of cross-scale relationships, such as those between the MJO and QBO, in governing the coupling between the MJO and teleconnections across the globe. This thesis considers the relationship between the MJO and processes that operate on both longer and shorter timescales than itself, including tropical convection and the Quasi-Biennial Oscillation. In doing so, this work highlights the importance of considering relationships between the MJO and atmospheric phenomena on different spatial and temporal scales and with origins distinct from the MJO itself. While theories exist describing the MJO as its own distinct entity, this research corroborates the idea that the MJO is at its core fundamentally linked to the rest of the climate system, both modulating and being modulated by a broad range of atmospheric processes.
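
The magnitude-squared coherence diagnostic used in the second part of that study is a standard cross-spectral quantity; a synthetic illustration follows (the 45-day sinusoid is a toy stand-in for an MJO index, not reanalysis data):

```python
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(3)
n = 4000                                    # roughly 11 years of daily data
mjo = np.sin(2 * np.pi * np.arange(n) / 45) + 0.5 * rng.normal(size=n)
z300 = 0.7 * mjo + rng.normal(size=n)       # geopotential anomaly with an MJO signal
f, coh2 = coherence(mjo, z300, fs=1.0, nperseg=512)
band = (f > 1 / 100) & (f < 1 / 20)         # intraseasonal 20-100 day band
print(coh2[band].mean())                    # fraction of band variance coherent with the index
```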
Item Open Access Methodology in air pollution epidemiology for large-scale exposure prediction and environmental trials with non-compliance (Colorado State University. Libraries, 2023) Ryder, Nathan, author; Keller, Kayleigh, advisor; Wilson, Ander, committee member; Cooley, Daniel, committee member; Neophytou, Andreas, committee member
Exposure to airborne pollutants, both long- and short-term, can lead to harmful respiratory, cardiovascular, and cardiometabolic outcomes. Multiple challenges arise in the study of relationships between ambient air pollution and health outcomes. For example, in large observational cohort studies, individual measurements are not feasible, so researchers use small sets of pollutant concentration measurements to predict subject-level exposures. As a second example, inconsistent compliance of subjects with their assigned treatments can affect results from randomized controlled trials of environmental interventions. In this dissertation, we present methods to address these challenges. We develop a penalized regression model that predicts particulate matter exposures in space and time, with penalties to discourage overfitting and encourage smoothness in time. This model is more accurate than spatial-only and spatiotemporal universal kriging (UK) models when the exposures are missing in a regular (semi-daily) pattern. Our penalized regression model is also faster than both UK models, allowing the use of bootstrap methods to account for measurement-error bias and monitor-site selection in a two-stage health model. We introduce methods to estimate causal effects in a longitudinal setting by latent "at-the-time" principal strata. We implement an array of linear mixed models on data subsets, each with weights derived from principal scores. In addition, we estimate the same stratified causal effects with a Bayesian mixture model. The weighted linear mixed models outperform the Bayesian mixture model and an existing single-measure principal scores method in all simulation scenarios, and they are the only method to produce a significant estimate of a causal effect of treatment assignment by strata when applied to a Honduran cookstove intervention study. Finally, we extend the "at-the-time" longitudinal principal stratification framework to a setting where continuous exposure measurements are the post-treatment variable by which the latent strata are defined. We categorize the continuous exposures into a binary variable in order to use our previous method of weighted linear mixed models. We also extend an existing Bayesian approach to the longitudinal setting, which does not require categorization of the exposures. The previous weighted linear mixed model and single-measure principal scores methods are negatively biased when applied to simulated samples, while the Bayesian approach produces the lowest RMSE and bias near zero. The Bayesian approach, when applied to the same Honduran cookstove intervention study as before, does not find a significant estimate of the causal effect of treatment assignment by strata.

Item Open Access Methods for extremes of functional data (Colorado State University. Libraries, 2018) Xiong, Qian, author; Kokoszka, Piotr S., advisor; Cooley, Daniel, committee member; Pinaud, Olivier, committee member; Wang, Haonan, committee member
Motivated by the problem of extreme behavior of functional data, we develop statistical theory at the nexus of functional data analysis (FDA) and extreme value theory (EVT). A fundamental technique of functional data analysis is to replace infinite-dimensional curves with finite-dimensional representations in terms of functional principal components (FPCs). The coefficients of these projections, called the scores, encode the shapes of the curves. Therefore, the study of the extreme behavior of functional time series can be reduced to the study of the functional principal component scores.
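
In a discretized setting, these scores are simple projections; a toy computation is sketched below (synthetic curves with heavy-tailed amplitudes, echoing the heavy-tail setting studied here):

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 100)
# 200 synthetic curves: two smooth modes with heavy-tailed random amplitudes
curves = (rng.standard_t(df=3, size=(200, 1)) * np.sin(2 * np.pi * t)
          + rng.standard_t(df=3, size=(200, 1)) * np.cos(2 * np.pi * t))
centered = curves - curves.mean(axis=0)
cov = centered.T @ centered / len(curves)   # sample covariance of the curves
eigval, eigvec = np.linalg.eigh(cov)        # eigenvalues in ascending order
fpcs = eigvec[:, ::-1][:, :2]               # leading two empirical FPCs
scores = centered @ fpcs                    # FPC scores, one row per curve
print(scores.shape, eigval[::-1][:2])
```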
We first derive two tests of significance of the slope function using functional principal components and their empirical counterparts (EFPCs). Applied to tropical storm data, these tests show a significant trend in the annual pattern of upper wind speed levels of hurricanes. We then establish sufficient conditions under which the asymptotic extreme behavior of the multivariate estimated scores is the same as that of the population scores. We clarify these issues, including the rate of convergence, for Gaussian functions and for more general functional time series whose projections are in the Gumbel domain of attraction. Finally, we derive the asymptotic distribution of the sample covariance operator and of the sample functional principal components for functions that are regularly varying and whose fourth moment does not exist. The new theory is applied to establish the consistency of the regression operator in a functional linear model with such errors.

Item Open Access Model post-processing for the extremes: improving forecasts of locally extreme rainfall (Colorado State University. Libraries, 2016) Herman, Gregory Reid, author; Schumacher, Russ, advisor; Barnes, Elizabeth, committee member; Cooley, Daniel, committee member
This study investigates the science of forecasting locally extreme precipitation events over the contiguous United States from a fixed-frequency perspective, as opposed to the traditionally applied fixed-quantity forecasting perspective. Frequencies are expressed as return periods, or recurrence intervals; return periods between 1 year and 100 years are analyzed in this study. Many different precipitation accumulation intervals may be considered in this perspective; this research focuses on 6- and 24-hour precipitation accumulations. The research presented herein discusses the beginnings of a comprehensive forecast system to probabilistically predict extreme precipitation events using a vast suite of dynamical numerical weather prediction model guidance. First, a recent climatology of extreme precipitation events is generated using the aforementioned fixed-frequency framework. The climatology generally conforms with previous extreme precipitation climatologies over the US, with predominantly warm-season events east of the continental divide, especially to the north and away from major bodies of water, and primarily cool-season events along the Pacific coast. The performance of several operational and quasi-operational models of varying dynamical cores and resolutions is assessed with respect to extreme precipitation characteristics; different biases are observed in different modeling systems, with one model dramatically overestimating extreme precipitation occurrences across the entire US, while another, coarser model fails to produce the vast majority of the rarest (50-100+ year) events, especially east of the Rockies, where most extreme precipitation events are found to be convective in nature. Some models with a longer record of available data are used to develop model-specific quantitative precipitation climatologies by parametrically fitting right-skewed distributions to model precipitation data, and these fitted climatologies are applied to extreme precipitation forecasting. Lastly, guidance from numerous models is examined and used to generate probabilistic forecasts for locally extreme rainfall events.
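
The fixed-frequency framing rests on return levels. A generic sketch using a GEV fit to synthetic annual maxima follows (the study itself fits right-skewed distributions to model precipitation data, not necessarily this exact form):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
annual_max = stats.genextreme.rvs(c=-0.1, loc=50, scale=15,
                                  size=60, random_state=rng)  # synthetic annual maxima (mm)
c, loc, scale = stats.genextreme.fit(annual_max)
for T in (2, 25, 100):                       # return periods in years
    level = stats.genextreme.ppf(1 - 1 / T, c, loc=loc, scale=scale)
    print(f"{T:>4}-yr return level: {level:.1f} mm")
```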
Numerous methods, from the simple to the complex, are explored for generating forecast probabilities; it is found that more sophisticated methods of generating forecast probabilities from an ensemble of models can significantly improve forecast quality in every metric examined, compared with the most traditional probabilistic forecasting approach. The research concludes with the application of the forecast system to a recent extreme rainfall outbreak that impacted several regions of the United States.

Item Open Access New methods for fixed-margin binary matrix sampling, Fréchet covariance, and MANOVA tests for random objects in multiple metric spaces (Colorado State University. Libraries, 2022) Fout, Alex M., author; Fosdick, Bailey, advisor; Kaplan, Andee, committee member; Cooley, Daniel, committee member; Adams, Henry, committee member
Many approaches to the analysis of network data essentially view the data as Euclidean and apply standard multivariate techniques. In this dissertation, we refrain from this approach, exploring two alternate approaches to the analysis of networks and other structured data. The first approach seeks to determine how unique an observed simple, directed network is by comparing it to like networks that share its degree distribution. Generating networks for comparison requires sampling from the space of all binary matrices with the prescribed row and column margins, since enumeration of all such matrices is often infeasible even for moderately sized networks of 20-50 nodes. We propose two new sampling methods for this problem. First, we extend two Markov chain Monte Carlo methods to sample from the space non-uniformly, allowing flexibility in the case that some networks are more likely than others. We show that non-uniform sampling can impede the MCMC process but remains valid in certain special cases. Critically, we illustrate the differing conclusions that could be drawn from uniform versus non-uniform sampling. Second, we develop a generalized divide-and-conquer approach that recursively divides matrices into smaller subproblems that are much easier to count and sample. Each division step reveals interesting mathematics involving the enumeration of integer partitions and of points in convex lattice polytopes. The second broad approach we explore is the comparison of random objects in metric spaces lacking a coordinate system. Traditional definitions of the mean and variance no longer apply, and standard statistical tests have needed reconceptualization in terms of only the distances in the metric space. We consider the multivariate setting, where random objects exist in multiple metric spaces, which can be thought of as distinct views of the random object. We define the notion of Fréchet covariance to measure dependence between two metric spaces and establish consistency for the sample estimator. We then propose several tests for differences in means and covariance matrices among two or more groups in multiple metric spaces, and we compare their performance in scenarios involving random probability distributions and networks with node covariates.
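
The uniform baseline that such MCMC samplers extend is the classic checkerboard-swap chain, sketched here (the dissertation's contributions are the non-uniform and divide-and-conquer machinery, not this standard move):

```python
import numpy as np

def swap_step(A, rng):
    """One proposal: swap a 2x2 checkerboard submatrix, preserving all margins."""
    r1, r2 = rng.choice(A.shape[0], 2, replace=False)
    c1, c2 = rng.choice(A.shape[1], 2, replace=False)
    sub = A[np.ix_([r1, r2], [c1, c2])]
    # Only a checkerboard pattern ({{1,0},{0,1}} or {{0,1},{1,0}}) can be flipped.
    if sub[0, 0] == sub[1, 1] and sub[0, 1] == sub[1, 0] and sub[0, 0] != sub[0, 1]:
        A[np.ix_([r1, r2], [c1, c2])] = 1 - sub
    return A

rng = np.random.default_rng(0)
A = (rng.random((20, 20)) < 0.3).astype(int)
rows, cols = A.sum(1).copy(), A.sum(0).copy()
for _ in range(10000):
    A = swap_step(A, rng)
assert (A.sum(1) == rows).all() and (A.sum(0) == cols).all()  # margins intact
```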
Item Open Access Quantifying internal climate variability and its changes using large-ensembles of climate change simulations (Colorado State University. Libraries, 2020) Li, Jingyuan, author; Thompson, David W. J., advisor; Barnes, Elizabeth A., committee member; Ravishankara, A. R., committee member; Cooley, Daniel, committee member
Increasing temperatures over the last 50 years have led to a multitude of studies on observed and future impacts on surface climate. However, any change in the mean needs to be placed in the context of its variability to be understood and quantified. This allows us to: 1) understand the relative impact of the mean change on the subsequent environment, and 2) detect and attribute the external change against the underlying "noise" of internal variability. One way to quantify internal variability is through the use of large-ensemble models. Each ensemble member is run with the same model and the same external forcings, but with slight differences in the initial conditions, so differences between ensemble members are due solely to internal variability. This research exploits one such large ensemble of climate change simulations (the CESM-LE) to better understand and evaluate surface temperature variability and its effects under external forcing. One large contribution to monthly and annual surface temperature variability, especially in the extratropics, is the atmospheric circulation. Dynamical adjustment seeks to determine and remove the effects of circulation on temperature variability in order to narrow the range of uncertainty in the temperature response. The first part of this work compares several commonly used dynamical adjustment methods in both a pre-industrial control run and the CESM-LE. Because there are no external forcings in the control run, it provides a quantitative benchmark against which the methods are evaluated. We compare and assess these dynamical adjustment methods on the basis of two attributes: 1) the method should remove a maximum amount of internal variability while 2) preserving the true forced signal. While the control run is excellent for assessing the methods in an "ideal" environment, results from the CESM-LE show biases in the dynamically adjusted trends due to a forced response in the circulation fields themselves. This work provides a template for assessing the various dynamical adjustment methods available to the community. A less studied question is how internal variability itself will respond to climate change. Past studies have found regional changes in surface temperature variance and skewness. This research also investigates the impacts of climate change on the day-to-day persistence of surface temperature. Results from the CESM-LE suggest that external warming generally increases surface temperature persistence, with the largest changes over the Arctic and ocean regions. The results are robust and distinct from internal variability. We suggest that the persistence changes are mostly due to an increase in the optical thickness of the atmosphere from increases in both carbon dioxide and water vapor. This increased optical thickness reduces the thermal damping of surface temperatures, increasing their persistence. Model results from idealized aquaplanet simulations with different radiation schemes support this hypothesis. The results thus reflect a robust thermodynamic and radiative constraint on surface temperature variability.
Item Open Access Spatial probit models for multivariate ordinal data: computational efficiency and parameter identifiability (Colorado State University. Libraries, 2013) Schliep, Erin M., author; Hoeting, Jennifer, advisor; Cooley, Daniel, committee member; Lee, Myung Hee, committee member; Webb, Colleen, committee member
The Colorado Natural Heritage Program (CNHP) at Colorado State University evaluates Colorado's rare and at-risk species and habitats and promotes conservation of biological resources. One of the goals of the program is to determine the condition of wetlands across the state of Colorado. The data collected are measurements, or metrics, representing landscape condition, biotic condition, hydrologic condition, and physiochemical condition in river basins statewide. The metrics differ in variable type, including binary, ordinal, count, and continuous response data. It is common practice to uniformly discretize the metrics into ordinal values and combine them using a weighted average to obtain a univariate measure of wetland condition, with the weights assigned to each metric based on best professional judgment. The motivation of this work was to improve on the user-defined weights by developing a statistical model to estimate the weights from observed data. The challenges of creating a model that fulfills this requirement are many. First, the observed data are multivariate and consist of different variable types, which we wish to preserve. Second, the multivariate response data are not independent across river basins, because wetlands in close proximity are correlated. Third, we want the model to provide a univariate measure of wetland condition that can be compared across the state. Lastly, it is of interest to the ecologists to predict the univariate measure of wetland condition at unobserved locations, requiring covariate information to be incorporated into the model. We propose a multivariate multilevel latent variable model to address these challenges. Latent continuous response variables are used to model the different types of response variables. An additional latent variable, or common factor, is used as a univariate measure of wetland condition. The mean of the common factor contains observable covariate data in order to predict at unobserved locations, and its variance is defined by a spatial covariance function to account for the dependence between wetlands. The majority of the metrics reported by the CNHP are ordinal, so our primary focus is modeling multivariate ordinal response data, with binary data as a special case. Probit linear models and probit linear mixed models are examples of models for ordinal response data. Probit models are attractive in that they can be defined in terms of latent variables. Computational efficiency is a major issue when fitting multivariate latent variable models in a Bayesian framework using Markov chain Monte Carlo (MCMC), and there is also a high computational cost to running MCMC for geostatistical spatial models. Data augmentation and parameter expansion are both modeling techniques that can lead to optimal iterative sampling algorithms for MCMC: data augmentation allows for simpler and more feasible simulation from a posterior distribution, while parameter expansion is a method for accelerating the convergence of iterative sampling algorithms and can enhance data augmentation algorithms. We propose data augmentation and parameter-expanded data augmentation algorithms for fitting MCMC to spatial probit models for binary and ordinal response data.
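
The data augmentation idea is easiest to see in the non-spatial case: an Albert-Chib-style Gibbs sampler for a plain probit model with a flat prior on the coefficients (a simplified sketch, without the spatial or ordinal structure):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, p = 500, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-0.3, 1.2])
y = (X @ beta_true + rng.normal(size=n) > 0).astype(int)

beta = np.zeros(p)
XtX_inv = np.linalg.inv(X.T @ X)
draws = []
for it in range(2000):
    # 1) Draw latent z | beta, y from truncated normals (inverse-CDF method).
    mu = X @ beta
    u = rng.uniform(1e-10, 1.0, size=n)
    lo = np.where(y == 1, stats.norm.cdf(-mu), 0.0)
    hi = np.where(y == 1, 1.0, stats.norm.cdf(-mu))
    z = mu + stats.norm.ppf(lo + u * (hi - lo))
    # 2) Draw beta | z from its Gaussian full conditional (error variance fixed at 1).
    m = XtX_inv @ X.T @ z
    beta = rng.multivariate_normal(m, XtX_inv)
    draws.append(beta)
print(np.mean(draws[500:], axis=0))  # roughly beta_true after burn-in
```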
Parameter identifiability is another challenge when fitting multivariate latent variable models, owing to the multivariate response data, the number of parameters, the unobserved latent variables, and the spatial random effects. We investigate parameter identifiability for the common factor model for multivariate ordinal response data. We extend the common factor model to include covariates and spatial correlation so that we can predict wetland condition at unobserved locations. The partial sill and range parameters of a spatial covariance function are difficult to estimate because they are near-nonidentifiable. We propose a new parameterization for the covariance function of the spatial probit model that leads to better mixing and faster convergence of the MCMC. Whereas our spatial probit model for ordinal response data follows the common factor model approach, there are other forms of the spatial probit model. We give a comprehensive comparison of two types of spatial probit models, which we refer to as the first-stage and second-stage spatial probit models. We discuss the implications of fitting each model and compare them in terms of their impact on parameter estimation and prediction at unobserved locations. We propose a new approximation for predicting ordinal response data that is both accurate and efficient. We apply the multivariate multilevel latent variable model to data collected in the North Platte and Rio Grande River Basins to evaluate wetland condition, obtaining statistically derived weights for each of the response metrics with confidence limits. Lastly, we predict the univariate measure of wetland condition at unobserved locations.

Item Open Access Symmetric functions, shifted tableaux, and a class of distinct Schur Q-functions (Colorado State University. Libraries, 2022) Salois, Kyle, author; Gillespie, Maria, advisor; Cavalieri, Renzo, committee member; Hulpke, Alexander, committee member; Cooley, Daniel, committee member
The Schur Q-functions form a basis of the algebra Ω of symmetric functions generated by the odd-degree power sum basis p_d, and they have ramifications in the projective representations of the symmetric group. So, as with ordinary Schur functions, it is relevant to consider the equality of skew Schur Q-functions Q_{λ/μ}. This was studied in 2008 by Barekat and van Willigenburg in the case when the shifted skew shape λ/μ is a ribbon. Building on this premise, we examine the case of near-ribbon shapes, formed by adding one box to a ribbon skew shape. We particularly consider frayed ribbons, that is, the near-ribbons whose shifted skew shape is not an ordinary skew shape. We conjecture, with evidence, that all Schur Q-functions for frayed ribbon shapes are distinct up to antipodal reflection. We prove this conjecture for several infinite families of frayed ribbons, using a new approach via the "lattice walks" version of the shifted Littlewood-Richardson rule, discovered in 2018 by Gillespie, Levinson, and Purbhoo.

Item Open Access Testing and adjusting for informative sampling in survey data (Colorado State University. Libraries, 2014) Herndon, Wade Wilson, author; Breidt, F. Jay, advisor; Opsomer, Jean, advisor; Cooley, Daniel, committee member; Meyer, Mary, committee member; Doherty, Paul, committee member
Fitting models to survey data can be problematic due to the potentially complex sampling mechanism through which the observed data are selected.
Survey weights have traditionally been used to adjust for unequal inclusion probabilities under the design-based paradigm of inference; however, this limits the ability of analysts to make inference of a more general kind, such as to the characteristics of a superpopulation. The problems induced by the presence of a complex sampling design can generally be gathered under the heading of informative sampling: the sampling is informative when the distribution of the data in the sample differs from the distribution of the data in the population. Two major topics relating to the analysis of survey data with (potentially) informative sampling are addressed: testing for informativeness, and model building in the presence of informative sampling. First addressed is the problem of running formal tests for informative sampling in survey data. The major contribution here is a new test for informative sampling. The test is shown to be widely applicable, straightforward to implement in practice, and useful compared with existing tests. The test is illustrated through a variety of empirical studies, including a censored regression problem, linear regression, logistic regression, and fitting a gamma mixture model. Results from the analogous bootstrap test are also presented; these agree with the analytic versions of the test. Alternative tests for informative sampling do exist; however, the existing methods each have significant drawbacks and limitations that may be resolved in some situations with this new methodology, and overall the literature is quite sparse in this area. In a simulation study, the test is shown to have many desirable properties and maintains high power compared with alternative tests. Also included is a discussion of the limiting distribution of the test statistic under a sequence of local alternative hypotheses, and some extensions that connect this work with previous work in the area. These extensions also help motivate the semiparametric methods considered in the chapter that follows. The next topic explored is semiparametric methods for including design information in a regression model while staying within a model-based inferential framework. The ideas explored here attempt to exploit relationships between design variables (such as the sample inclusion probabilities) and model covariates. In order to account for the complex sampling design and the (potential) bias in estimating model parameters, design variables are included as covariates, considered to be functions of the model covariates, and estimated in a design-based paradigm using nonparametric methods. The nonparametric method explored here is kernel smoothing of degree zero. In principle, other (and more complex) estimators could be used to estimate the functions of the design variables conditional on the model covariates, but the framework presented here provides asymptotic results only for the simpler case of kernel smoothing. The method is illustrated via empirical applications and through a simulation study in which confidence band coverage rates from the semiparametric method are compared with those obtained through ordinary linear regression. The semiparametric estimator soundly outperforms the regression estimator.
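
Degree-zero kernel smoothing is the Nadaraya-Watson estimator. A minimal sketch of the idea, estimating an assumed inclusion-probability curve as a function of a model covariate (synthetic data, illustrative bandwidth):

```python
import numpy as np

def nw_smooth(x_grid, x, y, h=0.3):
    """Gaussian-kernel weighted local mean (degree-zero smoother) at each grid point."""
    w = np.exp(-0.5 * ((x_grid[:, None] - x[None, :]) / h) ** 2)
    return (w * y).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(5)
x = rng.uniform(0, 1, 500)                  # model covariate
pi = 0.05 + 0.4 * x**2                      # assumed inclusion probability, a function of x
pi_noisy = pi + rng.normal(0, 0.02, 500)    # noisy design information
pi_hat = nw_smooth(np.linspace(0, 1, 50), x, pi_noisy)
print(pi_hat[:5])                           # smoothed estimate of pi(x) on the grid
```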
Item Open Access Towards understanding the role of natural variability in climate change (Colorado State University. Libraries, 2017) Li, Jingyuan, author; Thompson, David W. J., advisor; Barnes, Elizabeth A., committee member; Cooley, Daniel, committee member
Natural variability plays a large role in determining surface climate on local and regional scales, and understanding its role is crucial for accurately assessing and attributing climate trends, both past and future. One successful way to examine the role of natural variability in climate change has been through large ensembles of climate models. This thesis uses one such large ensemble (the NCAR CESM-LE) to test various methods used to quantify natural variability in the context of climate change. We first introduce a simple analytic expression for calculating the lead time required for a linear trend to emerge in a Gaussian first-order autoregressive process. The expression is derived from the standard error of the regression and is tested using the CESM-LE. It is shown to provide a robust estimate of the point in time when the forced signal of climate change has emerged from the natural variability of the climate system with a predetermined level of statistical confidence. The expression provides a novel analytic tool for estimating the time of emergence of anthropogenic climate change, and its associated regional climate impacts, from either observed or modeled estimates of natural variability and trends. We next compare and analyze various methods for calculating the effects of internal circulation dynamics on surface temperature. Dynamical adjustment seeks to separate out the dynamical contribution to temperature trends, thereby reducing the amplitude of the natural variability that obscures the signal of anthropogenic forcing. Three specific methods used in the climate literature are examined: principal component regression (PCR), maximum covariance analysis (MCA), and constructed circulation analogs. These methods are assessed with their respective results from the CESM control run and large ensemble.
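
One common approximate form of such an expression solves |β| = z·SE(trend) for the record length, inflating the noise variance for AR(1) autocorrelation; the sketch below follows that spirit and is an approximation, not the thesis's exact expression:

```python
import numpy as np

def time_of_emergence(beta, sigma, phi, z=2.0):
    """Approximate years until trend beta emerges from AR(1) noise (std sigma, lag-1 phi)."""
    # Inflate noise for autocorrelation, then solve |beta| = z * sigma_e * sqrt(12 / n**3).
    sigma_e = sigma * np.sqrt((1 + phi) / (1 - phi))
    return (12.0 * (z * sigma_e / abs(beta)) ** 2) ** (1.0 / 3.0)

print(time_of_emergence(beta=0.03, sigma=0.5, phi=0.6))  # roughly 38 years for these inputs
```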
Item Open Access Variability in observed remote marine aerosol populations and implications for haze and cloud formation (Colorado State University. Libraries, 2020) Atwood, Samuel A., author; Kreidenweis, Sonia M., advisor; van den Heever, Susan C., committee member; Pierce, Jeffrey R., committee member; Cooley, Daniel, committee member
In many oceanic regions of the planet, once-pristine environments are known to have a high degree of sensitivity to changing aerosol populations and perturbations from anthropogenic emissions. However, difficulties in modeling and remote sensing efforts in remote marine regions have led to continued uncertainties in aerosol-cloud-climate interactions. Numerous properties of the aerosol and the environment affect these interactions in complex and often non-linear ways. In this work, I examine the variability in observed remote marine aerosol properties and its implications for classifying aerosol impacts on cloud development and radiative transfer in the atmosphere. Results from several field campaigns that measured aerosol and environmental properties relevant to these processes in marine and coastal regions are first presented. An unsupervised classification methodology was used to identify periods of impact associated with distinct fine-mode aerosol population types and to quantify the observed range of variability associated with these types. A specific focus was placed on differentiating between internal variability in relevant properties within a given population type and external variability between the average values of each population type. The result was a set of aerosol population type models, observed in marine regions, that allowed further investigation of the impact of different sources of variability on subsequent atmospheric processes. Next presented are the results of several observationally driven sensitivity studies using these aerosol models. First, initial cloud properties were investigated using a cloud parcel model driven by the observed aerosol population types, to examine relative sensitivity to updraft velocity, extensive aerosol properties including number concentration, and a range of intensive aerosol properties. It was found that the parameter space across which initial cloud property sensitivity to variability in the observed aerosol dataset was investigated could be simplified by incorporating the relevant intensive aerosol properties into a single population type parameter. Previous work using simpler mono-modal aerosol populations had identified several regimes of sensitivity of initial cloud properties to updraft velocity and total particle number concentration. When driven by the more complex and atmospherically relevant marine population types, additional sensitivity to population type was identified through portions of these two regimes, and a new regime was identified that was more sensitive to population type than to either of the other parameters. A Monte Carlo optical reconstruction model was then used to investigate the sensitivity of atmospheric optical properties to observed variability in aerosol and environmental properties. As expected, aerosol dry mass concentrations were the largest contributors to the overall sensitivity of extensive optical properties. However, for intensive optical properties, the range of expected variability due to internal variability within a given population type was of the same order as the differences expected between population types; specific aerosol population type models may therefore provide little advantage for further constraining expected optical property variability in this dataset. Additionally, the combined impacts of variability in environmental relative humidity (RH) and intensive aerosol properties within a nominally consistent population type could be quantified, with coefficients of variation on the order of 0.3 in this dataset, a value that was relatively constant and independent of total mass concentration, aerosol population type, and RH. Overall, this work produced new representations of fine-mode aerosol types encountered in marine environments that are broadly consistent with those currently applied in remote sensing and climate modeling. However, the models presented here can account explicitly for the effects of ambient relative humidity and thus may be useful for next-generation modeling that includes those effects. Future work focused on similar observationally constrained model development for the marine and littoral coarse mode would be beneficial, as large particles often contribute significant fractions of the optical depth in these regions.
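
As a rough illustration of the unsupervised classification step described above (a toy stand-in; neither the campaign's actual method nor its feature set), k-means on standardized intensive aerosol properties separates synthetic population types:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(11)
# Synthetic "observations": hygroscopicity kappa, mode diameter (nm), single-scattering albedo
X = np.vstack([rng.normal([0.6, 180, 0.99], [0.10, 20, 0.005], (200, 3)),   # clean marine type
               rng.normal([0.2, 120, 0.90], [0.05, 15, 0.020], (200, 3))])  # polluted type
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(X))
print(np.bincount(labels))  # members assigned to each recovered population type
```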