Theses and Dissertations
Permanent URI for this collection: https://hdl.handle.net/10217/100389
Recent Submissions
Item Open Access
Investigating the applications of model reasoning for human-aware AI systems (Colorado State University. Libraries, 2025)
Caglar, Turgay, author; Sreedharan, Sarath, advisor; Blanchard, Nathaniel, committee member; Krishnaswamy, Nikhil, committee member; Cleary, Anne, committee member

This dissertation investigates how intelligent agents can reason over their models to better support, explain, and adapt to human users. Traditional AI planning assumes that the underlying model—describing the agent's actions, goals, and environment—is fixed and complete. However, in real-world deployments, these models often diverge from users' expectations, leading to confusion, mistrust, or failure. To address this, I propose a shift from reasoning within a model to reasoning about the model itself, using a framework called model-space search. Through four interconnected works, I demonstrate how model reasoning enables agents to operate more effectively in human-aware settings. First, I show how agents can proactively support users by detecting likely failure due to model misalignment and suggesting minimal corrections. Second, I extend explanation frameworks to include the intentions of system designers, revealing hidden influences on agent behavior. Third, I introduce Actionable Reconciliation Explanations, which combine model reconciliation and excuse generation to help users both understand and influence agent behavior. Finally, I explore how Large Language Models can enhance model-space search by guiding it toward more plausible and interpretable updates. Together, these contributions establish model reasoning as a foundation for building AI systems that are not only autonomous but also transparent, adaptable, and aligned with the people they serve.

Item Open Access
Balancing speed and precision: a comparative study of ASR systems in multimodal collaborative environments (Colorado State University. Libraries, 2025)
Terpstra, Corbyn, author; Blanchard, Nathaniel, advisor; Ghosh, Sudipto, committee member; Cleary, Anne, committee member

Automatic Speech Recognition (ASR) systems are increasingly critical for analyzing collaborative problem-solving (CPS) tasks, yet their segmentation and transcription accuracy in dynamic, multimodal environments remain underexplored. This study evaluates the performance of OpenAI's Whisper (Large, Medium, Turbo) and Vosk ASR systems in segmenting and transcribing collaborative dialogue, with a focus on implications for CPS annotation workflows. Leveraging a dataset of triads solving a multimodal task—comprising oracle (human-segmented), Google-segmented, and Whisper-segmented audio—we measure transcription accuracy via Word Error Rate (WER) and assess segmentation alignment through start time deviations, segment length ratios, and pause dynamics. Results reveal that while Whisper Turbo achieves the lowest overall WER (52.5%), its semantic segmentation strategy fragments coherent CPS moves, complicating annotation. Conversely, Vosk's pause-based approach under-segments rapid exchanges, obscuring interruptions and cross-talk. The study highlights a fundamental tension: Whisper prioritizes intent preservation at the cost of over-segmentation, while Vosk and Google ASR sacrifice nuance for efficiency. Annotation fidelity is further eroded by ASR-induced errors, including insertions (e.g., hallucinated phrases during silence) and temporal misalignments. These findings underscore the need for hybrid segmentation strategies and adaptive annotation frameworks that explicitly account for ASR limitations. Practical recommendations are proposed, including model-specific post-processing and context-aware annotation tools. By bridging technical evaluation with real-world application, this work advances the design of ASR systems tailored for collaborative environments, ensuring their outputs align with the complexities of human interaction.
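The study's headline metric, Word Error Rate, is worth pinning down. The sketch below (an illustration only, not the study's evaluation code) computes WER with the standard word-level Levenshtein dynamic program; a WER of 52.5%, as reported for Whisper Turbo, means roughly one word-level error for every two reference words.

```python
def wer(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count,
    via the standard Levenshtein dynamic program over words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between first i reference words and first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i          # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j          # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("move the red block left", "move a red block left"))  # 0.2
```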
Item Open Access
Optimizing sparse computations using union of Z-polyhedra (Colorado State University. Libraries, 2025)
Tongli, Santoshkumar, author; Pouchet, Louis-Noël, advisor; Pallickara, Shrideep, committee member; Pasricha, Sudeep, committee member

Sparse matrices play a central role in a wide range of modern computational problems. They are especially common in domains such as scientific simulations, numerical methods, graph analytics, machine learning, and high-performance computing workloads, where data is often structured in a way that leads to a significant number of zero-valued elements. Instead of treating these zeros as meaningful data, sparse matrix techniques aim to exploit this sparsity to reduce both storage and computational cost, thereby improving scalability and efficiency. The Union of Z-Polyhedra (UZP) sparse format models sparse structures as unions of integer polyhedra intersected with affine lattices, capturing both regular and irregular sparsity patterns in a unified form. Building on this abstraction, our work introduces a suite of tuners that apply structural transformations to UZP representations without altering their mathematical semantics. These transformations improve data locality, Single Instruction Multiple Data (SIMD) vectorization, and parallelism, enabling performance tuning without modifying execution logic. Evaluated across 229 matrices from the SuiteSparse collection, the optimized UZP representations achieve highly competitive performance for sparse matrix-vector multiplication (SpMV) computations on multi-core CPUs, outperforming reference approaches such as Intel MKL's sparse implementation or formats dedicated to SIMD vectorization.
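The abstract does not spell out UZP's loop structure, so for orientation, here is the conventional compressed sparse row (CSR) baseline that SpMV formats are measured against; the 3x3 matrix is a made-up example.

```python
import numpy as np

def spmv_csr(values, col_idx, row_ptr, x):
    """Baseline CSR sparse matrix-vector multiply, y = A @ x. Formats like UZP
    replace this irregular gather with loops over polyhedra and lattices that
    are friendlier to SIMD and parallel execution."""
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(row_ptr) - 1):            # one pass per row
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]    # indirect access into x
    return y

# Example matrix [[4, 0, 1], [0, 2, 0], [3, 0, 5]] in CSR form:
values  = np.array([4.0, 1.0, 2.0, 3.0, 5.0])
col_idx = np.array([0, 2, 1, 0, 2])
row_ptr = np.array([0, 2, 3, 5])
print(spmv_csr(values, col_idx, row_ptr, np.array([1.0, 1.0, 1.0])))  # [5. 2. 8.]
```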
Item Open Access
Follow the signal: models of attention, reason, and belief (Colorado State University. Libraries, 2025)
Venkatesha, Videep, author; Blanchard, Nathaniel, advisor; Krishnaswamy, Nikhil, committee member; Sreedharan, Sarath, committee member; Cleary, Anne, committee member

Attention, reasoning, and belief are central to how we perceive, decide, and collaborate. Though inherently abstract, with no direct physical manifestation, these phenomena leave behind observable signals: subtle traces in gaze, language, timing, and interaction. These traces vary across individuals and contexts, yet they offer a window into the underlying cognitive processes. In this thesis, I model the behavioral and linguistic signals that reflect aspects of attentional shifts, expressions of reasoning, and evolving belief states, and investigate how machine learning can be used to detect and interpret them as they arise in everyday settings. First, I focus on moments of inward attention, using eye-tracking during immersive virtual tours to identify gaze patterns that predict when participants feel familiarity, even without conscious recall. I then analyze written descriptions of three distinct internal attentional states: familiarity, unexpected thoughts, and involuntary memories. Next, I frame probing questions (i.e., questions that explicitly elicit justifications or clarifications) and their causal utterances as traces of reasoning as they emerge in group dialogue. Then, in the case of belief, I extract explicitly stated propositions from natural dialogue. These structured propositions reflect participants' evolving belief states during a collaborative task. I design and evaluate multiple extraction pipelines, demonstrating the feasibility of tracking belief expression in real time. Finally, I holistically examine how automated systems with noisy data shape downstream performance on collaborative problem-solving detection—a task that inherently reflects attention, belief, and reasoning. I show that, while performance remains comparable across systems, lower-fidelity inputs reduce interpretive granularity. In combination, these contributions demonstrate how machine learning can detect the emergence of traces of these phenomena, transforming these abstract states into observable patterns.

Item Open Access
The impact of manipulative content on human performance in augmented reality (Colorado State University. Libraries, 2025)
Anspach, Evan D., author; Ray, Indrakshi, advisor; Arefin, Mohammed Sayafet, committee member; Martey, Rosa, committee member

Extended Reality (XR) is the spectrum of spaces and experiences, both virtual and augmented, that includes both Augmented Reality (AR) and Virtual Reality (VR). Of the two categories, Optical See-Through (OST) Augmented Reality is beginning to be used more widely in the public domain. However, addressing manipulative content is necessary for the widespread adoption of OST AR technology. XR devices have had many vulnerabilities identified in previous works that may make them susceptible to the introduction of manipulative content, which an attacker may be able to use for a variety of purposes. For instance, in a cybersecurity context, attackers might try to influence and reduce user performance by changing the quality of AR information, introducing misleading content, irrelevant data, and other adverse factors. This may allow attackers to control user behavior, slow down or stop important tasks performed in XR, or annoy or otherwise adversely affect the mental state of the XR user. This research investigates how helpful, misleading, and irrelevant information in OST AR affects human performance. The study used a memory task and employed a repeated measures design involving 19 participants. The findings revealed that participants needed more time to complete the task when presented with irrelevant information compared to when they had access to useful AR information or when AR content was not presented. In addition, helpful AR information allowed users to complete the task more effectively, with fewer errors than with irrelevant and misleading AR information. The results suggest that AR enhances user memory, enabling users to perform tasks more efficiently. Moreover, when malicious information is introduced, manipulative content can effectively increase the decision-making time of its targets by disrupting memory-based judgments.

Item Embargo
Images in motion?: a first look into video leakage in federated learning (Colorado State University. Libraries, 2025)
Rasul, Md Fazle, author; Ray, Indrakshi, advisor; Jayasumana, Anura P., committee member; Bezawada, Bruhadeshwar, committee member; Simske, Steve, committee member

Federated learning (FL) allows multiple entities to train a shared model collaboratively. Its core, privacy-preserving principle is that participants only exchange model updates, such as gradients, and never their raw, sensitive data. This approach is fundamental for applications in domains where privacy and confidentiality are important. However, the security of this very mechanism is threatened by gradient inversion attacks, which can reverse-engineer private training data directly from the shared gradients, defeating the purpose of FL. While the impact of these attacks is known for image, text, and tabular data, their effect on video data remains an unexamined area of research. This paper presents the first analysis of video data leakage in FL via gradient inversion attacks. We evaluate two common video classification approaches: one employing pre-trained feature extractors and another that processes raw video frames with simple transformations. Our results indicate that the use of feature extractors offers greater resilience against gradient inversion attacks. We also demonstrate that image super-resolution techniques can enhance the frames extracted through gradient inversion attacks, enabling attackers to reconstruct higher-quality videos. Our experiments validate this across scenarios where the attacker has access to zero, one, or more reference frames from the target environment. We find that although feature extractors make attacks more challenging, leakage is still possible if the classifier lacks sufficient complexity. We therefore conclude that video data leakage in FL is a viable threat, and that the conditions under which it occurs warrant further investigation.
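As background on the attack class studied above, the sketch below shows a minimal gradient-inversion loop in the style of deep leakage from gradients: the attacker optimizes a dummy input until its gradients match the ones shared in FL. The linear model, known label, and tiny dimensions are simplifying assumptions; the thesis attacks far richer video models.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(64, 10)
x_true = torch.rand(1, 64)                    # the victim's private input
y_true = torch.tensor([3])
loss = F.cross_entropy(model(x_true), y_true)
true_grads = torch.autograd.grad(loss, tuple(model.parameters()))  # what FL shares

x_guess = torch.rand(1, 64, requires_grad=True)    # attacker's dummy input
opt = torch.optim.Adam([x_guess], lr=0.05)
for _ in range(500):
    opt.zero_grad()
    g = torch.autograd.grad(F.cross_entropy(model(x_guess), y_true),
                            tuple(model.parameters()), create_graph=True)
    grad_diff = sum(((a - b) ** 2).sum() for a, b in zip(g, true_grads))
    grad_diff.backward()                      # gradient *of* the gradient mismatch
    opt.step()

print(F.mse_loss(x_guess, x_true).item())     # small if the reconstruction succeeded
```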
Item Open Access
Spectral partitioning of graphs into compact, connected regions (Colorado State University. Libraries, 2025)
Kampbell, Maxine F., author; Davies, Ewan, advisor; Rajopadhye, Sanjay, committee member; Wilson, James, committee member

Partitioning a graph into regions that are both compact and connected is an important problem with applications in many areas, for example, circuit design, social network analysis, and electoral redistricting. Our work builds on existing ideas of spectral bipartitioning and Markov chain Monte Carlo (MCMC) recombination methods to provide a new method for partitioning graphs: spectral recombination. Previous methods utilized these ideas independently, for example, partitioning a graph directly using its spectrum or using MCMC recombination methods that do not rely on its spectrum. Our work represents a novel approach that combines the two ideas. We provide empirical evidence that spectral recombination methods generate partitions with low cut edge counts, that is, more compact regions with shorter boundaries. Moreover, we demonstrate that our base spectral recombination algorithm can be modified to prioritize different metrics, such as balanced vertex weights among regions. We note that there appears to be a trade-off between achieving low cut edge counts and maintaining approximate weight balance, illuminating an avenue for future research. Our code and data can be found at https://github.com/MaxFlorescence/spectral_redistricting.
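For readers new to the spectral half of spectral recombination, the sketch below bipartitions a graph by the sign of the Fiedler vector, the eigenvector for the Laplacian's second-smallest eigenvalue; the MCMC recombination loop the thesis wraps around steps like this is not attempted here.

```python
import numpy as np
import networkx as nx

def spectral_bipartition(G):
    """Split G into two vertex sets by the sign of the Fiedler vector."""
    nodes = list(G.nodes)
    L = nx.laplacian_matrix(G, nodelist=nodes).toarray().astype(float)
    eigvals, eigvecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                   # vector for the 2nd-smallest eigenvalue
    side_a = {n for n, v in zip(nodes, fiedler) if v >= 0}
    return side_a, set(nodes) - side_a

G = nx.grid_2d_graph(4, 6)                    # 4x6 grid graph
a, b = spectral_bipartition(G)
print(len(a), len(b), nx.cut_size(G, a))      # 12 12 4: balanced halves, short boundary
```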
Item Open Access
Reducing goal state divergence with environment design (Colorado State University. Libraries, 2025)
Sikes, Kelsey, author; Sreedharan, Sarath, advisor; Blanchard, Nathaniel, committee member; Chong, Edwin K.P., committee member

At the core of most successful human-robot collaborations is alignment between a robot's behavior and a human's expectations. Achieving this alignment is often difficult, however, because without careful specification, a robot may misinterpret a human's goals, causing it to perform actions with unexpected, if not dangerous, side effects. To avoid this, I propose a new metric called Goal State Divergence (GSD), which represents the difference between the final goal state achieved by a robot and the one a human user expected. In cases where GSD cannot be directly calculated, I show how it can be approximated using maximal and minimal bounds. I then leverage GSD in my novel human-robot goal alignment design (HRGAD) problem, which identifies a minimal set of environment modifications that can reduce such mismatches. To illustrate the effectiveness of my method for reducing goal state divergence, I then empirically evaluate it on several standard planning benchmarks.
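One concrete (and deliberately simplified) way to picture GSD is as a distance between final states viewed as sets of grounded facts; the thesis's formal definition and its maximal/minimal bounds are richer than this sketch.

```python
def goal_state_divergence(achieved: set, expected: set) -> int:
    """Count the facts on which the achieved and expected final states disagree."""
    return len(achieved.symmetric_difference(expected))

# Hypothetical final states for a household robot:
expected = {"on(mug, shelf)", "door(closed)", "lamp(off)"}
achieved = {"on(mug, shelf)", "door(open)", "lamp(off)"}
print(goal_state_divergence(achieved, expected))  # 2: door(open) vs. door(closed)
```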
Item Open Access
Extremal values of the occupancy fraction for the antiferromagnetic Ising model (Colorado State University. Libraries, 2025)
LeBlanc, Olivia, author; Davies, Ewan, advisor; Rajopadhye, Sanjay, committee member; Prabhu, Vinayak, committee member; Gillespie, Maria, committee member

The Ising model is a mathematical model of magnetism which is frequently studied in statistical physics and computer science. For the antiferromagnetic version of the model, there is known to be a computational threshold in the complexity of sampling from the model at given magnetization on ∆-regular graphs. The value of this threshold can be determined by minimizing the occupancy fraction of the model, but prior to this work an explicit formula was not known. This work solves the minimization problem for the majority of the relevant parameter space in the case ∆ = 3, determining the value of this threshold. Our methods also yield results on the minimization and maximization problems in other areas of the parameter space, painting a more complete picture of the occupancy fraction's behavior in 3-regular graphs.
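For orientation, one standard formulation from the occupancy-fraction literature (an assumption here; the thesis may normalize differently) writes the partition function of the Ising model on a graph G = (V, E) and the occupancy fraction as follows, where m(σ) counts monochromatic edges and 0 < β < 1 in the antiferromagnetic regime:

```latex
Z_G(\beta,\lambda) = \sum_{\sigma : V \to \{\pm 1\}} \lambda^{|\sigma^{-1}(+1)|} \beta^{m(\sigma)},
\qquad
\alpha_G(\beta,\lambda) = \frac{1}{|V|}\,\mathbb{E}\!\left[\,|\sigma^{-1}(+1)|\,\right]
= \frac{\lambda}{|V|}\,\frac{\partial}{\partial\lambda}\log Z_G(\beta,\lambda).
```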
Item Embargo
Scalable predictive modeling for spatiotemporally evolving phenomena (Colorado State University. Libraries, 2025)
Khandelwal, Paahuni, author; Pallickara, Sangmi Lee, advisor; Pallickara, Shrideep, committee member; Ghosh, Sudipto, committee member; Andales, Allan, committee member

Spatiotemporally evolving phenomena occur in epidemiology, atmospheric sciences, agriculture, and traffic management, among others. Models can be used to understand and inform decision-making in these settings. There has been a growth in both mechanistic and physics-informed methods to model phenomena. A challenge in such models is the need for extensive parametrization and calibration, which can be difficult for modeling phenomena at the continental scale. This has occurred alongside the availability of diverse data that can be leveraged by model-fitting algorithms. This dissertation focuses on leveraging deep learning methods to model spatiotemporally evolving phenomena by combining sparse but high-precision in situ measurement data with voluminous, low-precision satellite imagery. The research explores techniques to integrate scientific models and make use of diverse data sources, overcoming their disparities in precision, spatial coverage, and temporal resolution. We also regulate how the networks learn by designing custom multipart loss functions that combine traditional measures of accuracy with physics/domain-informed terms. As data volumes increase, there is a corresponding increase in the resource requirements – GPU, memory, disk, and network I/O – for model training. To address scalability issues, we designed a framework that manages multi-dimensional data volumes, partitions data effectively, curtails modeling costs, and uses transfer learning schemes to improve the efficiency of model training workflows. By incorporating scientific knowledge into the learning process, this research addresses the challenges of limited data availability and the data-intensive nature of deep neural networks. The methods generalize effectively, paving the way for scalable and accurate models in data-scarce domains.
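A custom multipart loss of the kind described above typically adds a domain-informed penalty to a data-fit term. The sketch below is generic: the weight and the nonnegativity constraint are illustrative assumptions standing in for the dissertation's actual physics terms.

```python
import torch

def multipart_loss(pred, target, physics_weight=0.1):
    """Data-fit term plus a domain penalty (here: a nonnegative quantity,
    such as a concentration, should not be predicted below zero)."""
    data_term = torch.mean((pred - target) ** 2)        # standard MSE
    physics_term = torch.mean(torch.relu(-pred) ** 2)   # penalize pred < 0
    return data_term + physics_weight * physics_term

pred = torch.tensor([0.8, -0.2, 1.1], requires_grad=True)
target = torch.tensor([1.0, 0.0, 1.0])
loss = multipart_loss(pred, target)
loss.backward()    # gradients now blend accuracy with the domain constraint
print(loss.item())
```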
Item Embargo
Multi-stream deep learning for isolated sign language recognition in videos (Colorado State University. Libraries, 2025)
Alsharif, Muhammad H., author; Anderson, Charles, advisor; Kirby, Michael, committee member; Blanchard, Nathaniel, committee member; Peterson, Christopher, committee member

Isolated sign language recognition is the task of identifying signs performed in isolation across multiple frames in a video. Advances in this field have significant implications, such as improving visual communication between humans and machines and bridging the communication gap between deaf and hearing individuals. However, practical applications of this domain have been limited by two key challenges: the computational complexity of current models and the limited availability of training data for many vocabularies in sign languages. This dissertation addresses these challenges, driven by improving recognition accuracy and computational efficiency. 3D convolutional models with RGB and optical flow inputs have been widely utilized in state-of-the-art methods for action recognition. Despite their significant computational costs, a systematic evaluation of their contribution to sign recognition has been limited. We first evaluate the effectiveness of 3D convolutional networks, showing that they significantly outperform their 2D counterparts on several sign recognition datasets, even when compared to a deeper 2D architecture. Additionally, this research challenges conventional assumptions about optical flow, demonstrating through ablation studies that its primary value lies in masking irrelevant (static) regions rather than improving the learning of motion patterns for sign recognition. In addition to RGB and optical flow, this work investigates skeleton-based sign language recognition using recurrent, transformer, and spatiotemporal convolutional graph networks. Our experimental results demonstrate the importance of the spatiotemporal sparse graph representation of skeleton data (coordinates of body and hand joints) in improving accuracy and interpretability through edge importance weighting. To address the limited number of training samples for many signs, we propose a coarse-to-fine transfer learning approach to adapt spatiotemporal features learned from large action recognition and Turkish Sign Language datasets to American Sign Language (ASL) datasets. This approach results in significant improvement for multiple modalities and benchmarks. To combine different models in a multi-stream network, we propose several methods for fusing the stream outputs before and after classification. To find the best combination of models using RGB, optical flow, or skeleton as input modalities, we train and evaluate all possible combinations in two- and three-stream networks on three sign recognition datasets. Our findings show that combining RGB and skeleton-based streams provides the most significant gain over the RGB baseline, due to greater diversity in stream predictions. In contrast, combining RGB and optical flow-based streams significantly increases the computational cost, due to optical flow extraction, without improving accuracy over two RGB streams. Our two- and three-stream networks, using only RGB and skeleton data as input modalities, achieve new state-of-the-art accuracy on the two largest ASL video datasets, which include 1000 and 2000 signs. Our approach achieves over 90% top-5 recognition accuracy on all benchmarks while significantly reducing computational costs compared to state-of-the-art methods. These findings facilitate real-time applications on mobile devices aimed at improving convenience in the daily lives of deaf individuals and helping to overcome communication barriers.
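The fusion step is easiest to picture as late fusion of per-stream class scores. The sketch below averages softmax outputs of two hypothetical streams; the dissertation evaluates several fusion points before and after classification, and finds RGB plus skeleton the most effective pairing.

```python
import torch

def late_fusion(rgb_logits, skeleton_logits, weights=(0.5, 0.5)):
    """Weighted average of per-stream class probabilities, then argmax."""
    probs = (weights[0] * torch.softmax(rgb_logits, dim=-1) +
             weights[1] * torch.softmax(skeleton_logits, dim=-1))
    return probs.argmax(dim=-1)

rgb_logits = torch.randn(4, 2000)        # batch of 4, 2000 sign classes (stand-ins)
skeleton_logits = torch.randn(4, 2000)
print(late_fusion(rgb_logits, skeleton_logits))   # fused class predictions
```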
Item Open Access
Optimizations of polyhedral reductions and their use in algorithm-based fault tolerance (Colorado State University. Libraries, 2025)
Narmour, Louis, author; Rajopadhye, Sanjay, advisor; Pouchet, Louis-Noël, committee member; Prabhu, Vinayak, committee member; Pezeshki, Ali, committee member

In this dissertation, we study the optimization of programs containing reductions and motivate a deeper connection between two ostensibly unrelated problems, one involving techniques for algorithmic improvement and another in the domain of Algorithm-Based Fault Tolerance. Reductions combine collections of inputs with an associative and often commutative operator to produce collections of outputs. Such operations are interesting because they often require special handling to obtain good performance. When the same value contributes to multiple outputs, there is an opportunity to reuse partial results, enabling reduction simplification. Prior work showed how to exploit this and obtain a reduction (pun intended) in the program's asymptotic complexity through a program transformation called simplification. We propose extensions to prior work on simplification, provide the first complete push-button implementation of reduction simplification in a compiler, and show how to handle a strictly more general class of programs than previously supported. We evaluate its effectiveness and show that simplification rediscovers several key results in algorithmic improvement across multiple domains, previously only obtained through clever manual human analysis and effort. Additionally, we complement this and study generalized, automated fault tolerance against transient errors, such as those occurring due to cosmic radiation or hardware component aging and degradation, using Algorithm-Based Fault Tolerance (ABFT). ABFT methods typically work by adding some redundant computation in the form of invariant checksums (i.e., reductions), which, by definition, should not change as the program executes. By computing and monitoring checksums, it is possible to detect errors by observing differences in the checksum values. However, this is challenging for two key reasons: (1) it requires careful manual analysis of the input program, and (2) care must be taken to subsequently carry out the checksum computations efficiently enough to be worthwhile. We propose automation techniques for a class of scientific codes called stencil computations and give methods to carry out this analysis at compile time. This is the first work to propose such an analysis in a compiler.
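The checksum idea is easiest to see in the classic ABFT setting of matrix multiplication (the Huang-Abraham scheme) sketched below; the dissertation derives analogous invariants automatically for stencil computations rather than using this exact check.

```python
import numpy as np

def abft_matmul_check(A, B, C, tol=1e-9):
    """Invariant: the column sums of C = A @ B must equal (column sums of A) @ B,
    so a transient fault that corrupts C is detected by comparing the two."""
    return np.allclose(np.sum(A, axis=0) @ B, np.sum(C, axis=0), atol=tol)

rng = np.random.default_rng(0)
A, B = rng.random((8, 8)), rng.random((8, 8))
C = A @ B
print(abft_matmul_check(A, B, C))    # True: checksum invariant holds
C[3, 5] += 1e-3                      # inject a transient error
print(abft_matmul_check(A, B, C))    # False: error detected
```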
Item Open Access
Formal verification of source-to-source transformations for high-level synthesis (Colorado State University. Libraries, 2025)
Tucker, Emily, author; Pouchet, Louis-Noël, advisor; Prabhu, Vinayak, committee member; Ortega, Francisco, committee member; Wilson, James, committee member

Hardware processors are designed using a complex optimization flow, starting from a high-level description of the functionalities to be implemented. This description is then progressively lowered to concrete hardware: Register-Transfer Level (RTL) functional behavior, timing between operations, and eventually actual logic gates are produced. High-level synthesis (HLS) can greatly facilitate the description of complex hardware implementations by raising the level of abstraction up to a classical imperative language such as C/C++, usually augmented with vendor-specific pragmas and APIs. HLS automatically compiles a large class of C/C++ programs to highly optimized RTL. Despite productivity improvements, attaining high performance for the final design remains a challenge, and higher-level tools like source-to-source compilers have been developed to generate programs targeting HLS toolchains. These tools may generate highly complex HLS-ready C/C++ code, reducing the programming effort and enabling critical optimizations. However, whether these HLS-friendly programs are produced by a human or a tool, validating their correctness, or otherwise exposing bugs, remains a fundamental challenge. In this work we target the problem of efficiently checking the semantic equivalence between two programs written in C/C++ as a means of ensuring the correctness of the description provided to the HLS toolchain, by proving an optimized code version fully preserves the semantics of the unoptimized one. We introduce a novel formal verification approach that combines concrete and abstract interpretation with a hybrid symbolic analysis. Notably, our approach is mostly agnostic to how control-flow, data storage, and dataflow are implemented in the two programs. It can prove equivalence under complex bufferization and loop/syntax transformations, for a rich class of programs with statically interpretable control-flow. We present our techniques and their complete end-to-end implementation, demonstrating how our system can verify the correctness of highly complex programs generated by source-to-source compilers for HLS, and detect bugs that may elude co-simulation.

Item Open Access
Time series analysis over sparse, non-stationary datasets with variational mode decomposition and transfer learning (Colorado State University. Libraries, 2025)
Patterson, Katherine, author; Pallickara, Shrideep, advisor; Pallickara, Sangmi, advisor; Andales, Allan, committee member

Data volumes have been growing exponentially across many domains. However, in fields such as ecology and environmental monitoring, data remains sparse, creating unique challenges. One such challenge is detecting extreme events (sudden spikes or anomalies in the data) and understanding their causes based on spatiotemporal patterns. This difficulty is exacerbated by time lags between an observed outlier and its underlying trigger, which complicate causal attribution and forecasting. These challenges have practical implications, particularly for environmental protection and regulatory compliance. This thesis explores time-series analysis over sparse, non-stationary datasets to support outlier detection and forecasting. We mitigate non-stationarity using variational mode decomposition (VMD) to break the signal into multiple seasonal components. To tackle the challenges of long-term seasonality, we leverage information obtained from the frequency domain regarding dominant lagged relationships within these signals. Finally, we leverage transfer learning to warm-start models at spatial extents where the data are sparse. We validate these ideas in the context of nutrient runoff into surface waters, where identifying and explaining anomalies is critical for the protection of ecosystems. Challenges arise due to three main factors: (1) nutrient time series are naturally non-stationary, which complicates the identification of underlying patterns; (2) temporal models often struggle over an entire season's span; and (3) water quality measurements are often sporadic and sparse. Results showed that the historical similarity mapping of these spatiotemporal profiles and their frequency-motivated seasonality characteristics improved prediction performance in each target series. Additionally, the final proposed model captured more series fluctuations than the base models.
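The lagged-relationship idea can be illustrated with plain cross-correlation on synthetic data: a driver series leads a response by seven steps, and the argmax of the correlation recovers that lag. Real nutrient series would first be decomposed (e.g., into VMD modes); everything below is synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
driver = rng.standard_normal(500)                 # e.g., a precipitation-like signal
response = np.roll(driver, 7) + 0.1 * rng.standard_normal(500)  # lags driver by 7

lags = np.arange(-30, 31)
xcorr = [np.corrcoef(np.roll(driver, k), response)[0, 1] for k in lags]
print(lags[int(np.argmax(xcorr))])                # 7: the dominant lag
```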
Item Embargo
Harnessing large language models for permission fidelity analysis from Android application descriptions (Colorado State University. Libraries, 2025)
Tamrakar, Yunik, author; Ray, Indrakshi, advisor; Banerjee, Ritwik, advisor; Ghosh, Sudipto, committee member; Simske, Steve, committee member

Android applications are very popular, and as of mid-2024 there are over 2 million applications in the Google Play Store. With such a large number of applications available for download, the threat of privacy leakage increases considerably, primarily because users have limited ability to judge which app permissions are actually necessary. This makes accurate and consistent checking of the permissions collected by applications necessary to ensure the protection of users' privacy. Studies have indicated that inferring permissions from app descriptions is an effective way to determine whether the collected permissions are necessary. Previous research in the permission inference space has explored techniques such as keyword-based matching, Natural Language Processing methods (including part-of-speech tagging and named entity recognition), as well as deep learning approaches using Recurrent Neural Networks. However, app descriptions are often vague and may omit details to meet sentence length restrictions, resulting in suboptimal performance of these models. This limitation motivated our choice of large language models (LLMs), as their advanced contextual understanding and ability to infer implicit information can directly address the weaknesses observed in previous approaches. In this work, we explore various LLM architectures for the permission inference task and provide a detailed comparison across various models. We evaluate both zero-shot learning and fine-tuning based approaches, demonstrating that fine-tuned models can achieve state-of-the-art performance. Additionally, by employing targeted generative-AI-based training data augmentation techniques, we show that these fine-tuned models can significantly outperform baseline methods. Furthermore, we illustrate the potential of leveraging paraphrasing to boost fine-tuned performance by over 50 percent, all while using only a very small number of annotated samples—a rarity for LLMs.

Item Open Access
Resiliency analysis of mission-critical systems using formal methods (Colorado State University. Libraries, 2025)
Abdelgawad, Mahmoud A., author; Ray, Indrakshi, advisor; Malaiya, Yashwant, committee member; Sreedharan, Sarath, committee member; Daily, Jeremy, committee member

Mission-critical systems, such as navigational spacecraft and drone surveillance systems, play a crucial role in a nation's safety and security. These systems consist of heterogeneous systems that work together to accomplish critical missions. However, they are susceptible to cyberattacks and physical incidents that can have devastating consequences. Thus, missions must be designed so that mission-critical systems can withstand adverse events and continue to operate effectively even when such events occur. In other words, critical mission engineers must specify, analyze, and anticipate potential threats, identify where adverse events may occur, and develop mitigation strategies before deploying a mission-critical system. This work presents an end-to-end methodology for analyzing the resiliency of critical missions. The methodology first specifies a mission in the form of a workflow. The mission workflow is then converted into a formal representation using Colored Petri Nets (CPN). Threat models are also extracted from the mission specification to subject the CPN mission to various attack scenarios. These threat models are represented as CPN attacks. The methodology exploits the state transitions of the CPN mission, combined with the CPN attacks, to analyze the resiliency of the mission. The analysis identifies the states in which the mission succeeds, fails, or remains incomplete. We established a mission for a mission-critical formation consisting of a military vehicle and two route reconnaissance drones that collaborate to monitor a national border and respond promptly to physical threats. The effectiveness of the methodology is demonstrated in identifying vulnerabilities, modeling adversarial conditions, and evaluating mission continuity under disruptions. The result shows how to refine the mission to enhance the resilience of such formations. The findings contribute to the early-stage resilience analysis framework and help address the limitations associated with manual verification of mission-critical systems.

Item Open Access
Enabling programmatic interfaces for explorations over voluminous spatiotemporal data collections (Colorado State University. Libraries, 2025)
Barram, Kassidy M., author; Pallickara, Shrideep, advisor; Pallickara, Sangmi, advisor; Arabi, Mazdak, committee member

This thesis focuses on enabling programmatic interfaces to perform exploratory analyses over voluminous data collections. The data we consider can be encoded in diverse formats and managed using diverse data storage frameworks. Our framework, Scrybe, manages the competing pulls of expressive computations and the need to conserve resource utilization in shared clusters. The framework includes support for differentiated quality of service, allowing preferentially higher resource utilization for certain users. We have validated our methodology with voluminous data collections housed in relational, NoSQL/document, and hybrid storage systems. Our benchmarks demonstrate the effectiveness of our methodology across evaluation metrics such as latencies, throughputs, preservation of resource thresholds, and differentiated services. These quantitative measures of performance are complemented by qualitative metrics that profile user interactions with the framework.

Item Open Access
Preventing malicious modifications to firmware using hardware root of trust (HRoT) (Colorado State University. Libraries, 2025)
Podder, Rakesh, author; Ray, Indrajit, advisor; Sreedharan, Sarath, advisor; Ray, Indrakshi, committee member; Jayasumana, Anura, committee member

As computing devices such as servers, workstations, laptops, and embedded systems are transported from one site to another, they are susceptible to unauthorized firmware modifications. Additionally, traditional over-the-air (OTA) firmware update mechanisms often lack robust security features, exposing devices to threats such as unauthorized updates and malware injection. While the industry has made efforts to secure the boot process using a hardware root of trust (HRoT), post-boot firmware tampering remains a significant risk. In this work, we introduce a comprehensive framework that addresses firmware security across both the transit and remote update phases by leveraging HRoT and cryptographic techniques. To prevent unauthorized firmware modifications during device shipment, we propose the PIT-Cerberus (Protection In Transit) framework, which enhances the HRoT's attestation capabilities to securely lock and unlock BIOS/UEFI. In addition, we introduce the Secure Remote Firmware Update Protocol (S-RFUP) to fortify OTA firmware updates by incorporating industry standards such as the Platform Level Data Model (PLDM) and the Management Component Transport Protocol (MCTP). These standards enable interoperability across diverse platforms while reducing management complexity. The protocol enhances security and operational integrity during updates, ensuring that only authenticated and verified firmware modifications occur. Both frameworks are implemented within a trusted microcontroller as part of Project Cerberus, an open-source security platform for server hardware. We present a security analysis, implementation details, and validation results, demonstrating the effectiveness of our approach in securing firmware both in transit and during remote updates.
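At the heart of update protocols like S-RFUP is a signature check the root of trust performs before accepting an image. The sketch below isolates that step with an assumed Ed25519 key pair; key provisioning, PLDM/MCTP transport, and rollback protection are all elided.

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

vendor_key = Ed25519PrivateKey.generate()        # vendor signs at build time
firmware = b"\x7fELF...firmware image bytes..."  # placeholder image contents
signature = vendor_key.sign(firmware)

rot_pubkey = vendor_key.public_key()             # provisioned into the HRoT

def accept_update(image: bytes, sig: bytes) -> bool:
    """Accept a firmware image only if its signature verifies."""
    try:
        rot_pubkey.verify(sig, image)
        return True
    except InvalidSignature:
        return False

print(accept_update(firmware, signature))            # True: authentic update
print(accept_update(firmware + b"\x00", signature))  # False: tampered image
```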
Item Open Access
Rapid interactive explorations of voluminous spatial temporal datasets (Colorado State University. Libraries, 2025)
Young, Matthew Branley, author; Pallickara, Shrideep, advisor; Pallickara, Sangmi, advisor; Arabi, Mazdak, committee member

Spatial data volumes have grown exponentially alongside the proliferation of sensing equipment and networked observational devices. In this thesis, we describe aQua, a framework for performing visualizations and explorations of spatiotemporally evolving phenomena at scale, and Rubiks, which supports effective summarizations and explorations at scale over arbitrary spatiotemporal scopes: the spatial extents, temporal bounds, or combinations thereof that delimit the data space of interest. We validate these ideas in the context of data from the National Hydrology Database (NHD) and the Environmental Protection Agency (EPA) to support longitudinal analysis (53 years of data) for the vast majority of water bodies in the United States. Our methodology addresses issues relating to preserving interactivity, effective analysis, dynamic query generation, and scaling. We extend the concept of data cubes to encompass high-dimensional spatiotemporal datasets in which there may be significant gaps because measurements (or observations) of diverse variables are not synchronized and may occur at different rates. We consider optimizations and refinements on the server side, on the client side, and in how information is exchanged between client and server. We report both quantitative and qualitative assessments of several aspects of our tool to demonstrate its suitability. Finally, our methodology is broadly applicable to domains where visualization-driven explorations of spatiotemporally evolving phenomena are needed.

Item Embargo
Privacy threats to mobile health apps: an analysis of data collection practices (Colorado State University. Libraries, 2025)
Myers, Charles Ethan, author; Ray, Indrakshi, advisor; Ortega, Francisco, committee member; Ray, Indrajit, committee member; Jayasumana, Anura, committee member

Users often install mobile health applications (mHealth apps) to improve their health and lifestyle. mHealth apps collect sensitive personal health-related information and may share it with various stakeholders. Many of the mHealth apps that consumers use for personal lifestyle benefits are not required to be compliant with any regulation, such as the Health Insurance Portability and Accountability Act (HIPAA) or the General Data Protection Regulation (GDPR). Our investigation reveals that there is a mismatch between what an app description states about privacy, what permissions the app requests from the end user as declared in its manifest file, privacy regulations (GDPR), and what privacy practices are actually enforced by the app. We provide a formal definition of mHealth apps and discuss an automated approach that uses a pre-trained language model to identify and analyze 13,177 mHealth apps from the Google Play Store. We identify the ten most common privacy threats in mHealth apps and map them to GDPR policy violations. Privacy violations pertaining to GDPR include the absence of a consent management system, inconsistent permissions with respect to the app description, and sharing personally identifiable information (PII) without consent. Our analysis reveals that only 4.28% of apps had a consent mechanism, over 88% of app network transmissions shared some form of PII without consent, and 83.7% of apps requested permissions from users without explaining their use cases. Our research has been successful in building automated tools for detecting privacy violations for some, but not all, of the identified threats.
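One building block of such an analysis is extracting the permissions an app declares. The sketch below parses a made-up AndroidManifest.xml fragment; comparing the result against the app's description and observed traffic is where the real analysis happens.

```python
import xml.etree.ElementTree as ET

MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android">
  <uses-permission android:name="android.permission.BODY_SENSORS"/>
  <uses-permission android:name="android.permission.ACCESS_FINE_LOCATION"/>
</manifest>"""

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
root = ET.fromstring(MANIFEST)
declared = [el.attrib[ANDROID_NS + "name"] for el in root.iter("uses-permission")]
print(declared)
# ['android.permission.BODY_SENSORS', 'android.permission.ACCESS_FINE_LOCATION']
```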