Theses and Dissertations

Recent Submissions

Now showing 1 - 20 of 283
  • Item (Open Access)
    Scalable and efficient tools for multi-level tiling
    (Colorado State University. Libraries, 2008) Renganarayana, Lakshminarayanan, author; Rajopadhye, Sanjay, advisor
    In the era of many-core systems, application performance will come from parallelism and data locality. Effective exploitation of both requires explicit (re)structuring of the applications. Multilevel (or hierarchical) tiling is one such structuring technique used in almost all high-performance implementations. Lack of tool support has limited the use of multi-level tiling to program optimization experts. We present solutions to two fundamental problems in multi-level tiling, viz., optimal tile size selection and parameterized tiled loop generation. Our solutions provide scalable and efficient tools for multi-level tiling. Parameterized tiled code refers to tiled loops where the tile sizes are not (fixed) compile-time constants but are left as symbolic parameters. It can enable selection and adaptation of tile sizes across a spectrum of stages, from compilation through run time. We introduce two polyhedral sets, viz., inset and outset, and use them to develop a variety of scalable and efficient multi-level tiled loop generation algorithms. The generation efficiency and code quality are demonstrated on a variety of benchmarks such as stencil computations and matrix subroutines from BLAS. Our technique can generate tiled loop nests with parameterized, fixed, or mixed tile sizes, thereby providing a one-size-fits-all solution ideal for inclusion in production compilers. Optimal tile size selection (TSS) refers to the selection of tile sizes that optimize some cost (e.g., execution time) model. We show that these cost models share a fundamental mathematical property, viz., positivity, that allows us to reduce optimal TSS to convex optimization problems. Almost all TSS models proposed in the literature for parallelism, caches, and registers lend themselves to this reduction. We present the reduction of five different TSS models proposed in the literature by different authors in a variety of tiling contexts. Our convex-optimization-based TSS framework is the first to provide a solution that is both efficient and scalable to multiple levels of tiling.
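    To make the notion of parameterized tile sizes concrete, the following is a minimal Python sketch (not the dissertation's generated polyhedral code): the tile sizes Ti and Tj are ordinary run-time arguments rather than compile-time constants, so one loop nest serves any tile size, and the min() clipping at tile boundaries plays, very loosely, the role of the inset/outset-style sets. The kernel and names are illustrative assumptions.

```python
import numpy as np

def tiled_matvec_accumulate(A, x, y, Ti=64, Tj=64):
    """Compute y += A @ x over Ti x Tj tiles of A; Ti and Tj are run-time parameters."""
    n, m = A.shape
    for ii in range(0, n, Ti):              # tile origins along rows
        for jj in range(0, m, Tj):          # tile origins along columns
            i_end = min(ii + Ti, n)         # clip partial tiles at the border
            j_end = min(jj + Tj, m)
            for i in range(ii, i_end):      # point loops inside one tile
                for j in range(jj, j_end):
                    y[i] += A[i, j] * x[j]
    return y

A = np.random.rand(200, 300)
x = np.random.rand(300)
y = tiled_matvec_accumulate(A, x, np.zeros(200), Ti=32, Tj=128)
assert np.allclose(y, A @ x)                # same result for any tile sizes
```

    Choosing good values for Ti and Tj is exactly the tile size selection problem that the abstract reduces to convex optimization.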
  • Item (Open Access)
    Improving software maintainability through aspectualization
    (Colorado State University. Libraries, 2009) Mortensen, Michael, author; Ghosh, Sudipto, advisor; Bieman, James M., advisor
    The primary claimed benefits of aspect-oriented programming (AOP) are that it improves the understandability and maintainability of software applications by modularizing cross-cutting concerns. Before there is widespread adoption of AOP, developers need further evidence of the actual benefits as well as costs. Applying AOP techniques to refactor legacy applications is one way to evaluate costs and benefits. Aspect-based refactoring, called aspectualization, involves moving program code that implements cross-cutting concerns into aspects. Such refactoring can potentially improve the maintainability of legacy systems. Long compilation and weave times, and the lack of an appropriate testing methodology, are two challenges to the aspectualization of large legacy systems. We propose an iterative test-driven approach for creating and introducing aspects. The approach uses mock systems that enable aspect developers to quickly experiment with different pointcuts and advice, and reduce the compile and weave times. The approach also uses weave analysis, regression testing, and code coverage analysis to test the aspects. We developed several tools for unit and integration testing. We demonstrate the test-driven approach in the context of large industrial C++ systems, and we provide guidelines for mock system creation. This research examines the effects on maintainability of replacing cross-cutting concerns with aspects in three industrial applications. We study several revisions of each application, identifying cross-cutting concerns in the initial revision, and also cross-cutting concerns that are added in later revisions. Aspectualization improved maintainability by reducing code size and improving both change locality and concern diffusion. Costs include the effort required for application refactoring and aspect creation, as well as a small decrease in performance.
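    The thesis works with aspects woven into large C++ systems; purely as a language-neutral analogy (not the authors' tooling), the Python sketch below factors a cross-cutting logging concern out of business logic into a single decorator, which plays the role of advice applied at a chosen join point. The function names are hypothetical.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)

def log_calls(func):
    """Advice-like wrapper: log entry and exit of the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logging.info("entering %s", func.__name__)
        result = func(*args, **kwargs)
        logging.info("leaving %s", func.__name__)
        return result
    return wrapper

@log_calls                                   # the "weave" point: attach the concern here
def transfer_funds(src, dst, amount):        # hypothetical business operation
    return {"from": src, "to": dst, "amount": amount}

transfer_funds("acct-1", "acct-2", 100.0)
```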
  • Item (Open Access)
    Exploring the bias of direct search and evolutionary optimization
    (Colorado State University. Libraries, 2008) Lunacek, Monte, author; Whitley, Darrell, advisor
    There are many applications in science that yield the following optimization problem: given an objective function, which set of input decision variables produces the largest or smallest result? Optimization algorithms attempt to answer this question by searching for competitive solutions within an application's domain. But every search algorithm has some particular bias. Our results show that search algorithms are more effective when they cope with the features that make a particular application difficult. Evolutionary algorithms are stochastic population-based search methods that are often designed to perform well on problems containing many local optima. Although this is a critical feature, the number of local optima in the search space is not necessarily indicative of problem difficulty. The objective of this dissertation is to investigate how two relatively unexplored problem features, ridges and global structure, impact the performance of evolutionary parameter optimization. We show that problems containing these features can cause evolutionary algorithms to fail in unexpected ways. For example, the condition number of a problem is one way to quantify a ridge feature. When a simple unimodal surface has a high condition number, we show that the resulting narrow ridge can make many evolutionary algorithms extremely inefficient. Some even fail. Similarly, funnels are one way of categorizing a problem's global structure. A single-funnel problem is one where the local optima are clustered together such that there exists a global trend toward the best solution. This trend is less predictable on problems that contain multiple funnels. We describe a metric that distinguishes problems based on this characteristic. Then we show that the global structure of the problem can render successful global search strategies ineffective on relatively simple multi-modal surfaces. Our proposed strategy that performs well on problems with multiple funnels is counter-intuitive. These issues impact two real-world applications: an atmospheric science inversion model and a configurational chemistry problem. We find that exploiting ridges and global structure results in more effective solutions on these difficult real-world problems. This adds integrity to our perspective on how problem features interact with search algorithms, and more clearly exposes the bias of direct search and evolutionary algorithms.
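    As a small, hedged illustration of the ridge feature described above (the function and the chosen conditioning are standard benchmark-style assumptions, not the dissertation's test problems), the quadratic below is unimodal yet has a Hessian whose condition number can be made arbitrarily large, producing the narrow valley that defeats many isotropic or coordinate-wise search steps.

```python
import numpy as np

def ill_conditioned_ellipsoid(x, condition_number=1e6):
    """f(x) = sum_i a_i * x_i^2 with coefficients a_i spread from 1 up to condition_number."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    weights = condition_number ** (np.arange(n) / (n - 1))   # 1 ... kappa
    return float(np.sum(weights * x ** 2))

x = np.array([1.0, 1.0, 1.0, 1.0])
print(ill_conditioned_ellipsoid(x))            # dominated by the steep axes
print(ill_conditioned_ellipsoid(0.0 * x))      # global optimum at the origin, f = 0
```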
  • Item (Open Access)
    Stability analysis of recurrent neural networks with applications
    (Colorado State University. Libraries, 2008) Knight, James N., author; Anderson, Charles W., advisor
    Recurrent neural networks are an important tool in the analysis of data with temporal structure. The ability of recurrent networks to model temporal data and act as dynamic mappings makes them ideal for application to complex control problems. Because such networks are dynamic, however, application in control systems, where stability and safety are important, requires certain guarantees about the behavior of the network and its interaction with the controlled system. Both the performance of the system and its stability must be assured. Since the dynamics of controlled systems are never perfectly known, robust control requires that uncertainty in the knowledge of systems be explicitly addressed. Robust control synthesis approaches produce controllers that are stable in the presence of uncertainty. To guarantee robust stability, these controllers must often sacrifice performance on the actual physical system. The addition of adaptive recurrent neural network components to the controller can alleviate, to some extent, the loss of performance associated with robust design by allowing adaptation to observed system dynamics. The assurance of stability of the adaptive neural control system is prerequisite to the application of such techniques. Work in [49, 2] points toward the use of modern stability analysis and robust control techniques in combination with reinforcement learning algorithms to provide adaptive neural controllers with the necessary guarantees of performance and stability. The algorithms developed in these works have a high computational burden due to the cost of the online stability analysis. Conservatism in the stability analysis of the adaptive neural components has a direct impact on the cost of the proposed system. This is due to an increase in the number of stability analysis computations that must be made. The work in [79, 82] provides more efficient tools for the analysis of time-varying recurrent neural network stability than those applied in [49, 2]. Recent results in the analysis of systems with repeated nonlinearities [19, 52, 17] can reduce the conservatism of the analysis developed in [79] and give an overall improvement in the performance of the online stability analysis. In this document, steps toward making the application of robust adaptive neural controllers practical are described. The analysis of recurrent neural network stability in [79] is not exact, and reductions in the conservatism and computational cost of the analysis are presented. An algorithm is developed allowing the application of the stability analysis results to online adaptive control systems. The algorithm modifies the recurrent neural network updates with a bias away from the boundary between provably stable parameter settings and possibly unstable settings. This bias is derived from the results of the stability analysis, and its method of computation is applicable to a broad class of adaptive control systems not restricted to recurrent neural networks. The use of this bias term reduces the number of expensive stability analysis computations that must be made and thus reduces the computational complexity of the stable adaptive system. An application of the proposed algorithm to an uncertain, nonlinear control system is provided and points toward future work on this problem that could further the practical application of robust adaptive neural control.
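    The dissertation's analysis rests on robust-control machinery for time-varying networks; purely as a simpler, hedged illustration of what a "provably stable parameter setting" can mean, the sketch below checks a classical sufficient condition: for the recurrence x_{t+1} = tanh(W x_t + b), tanh is 1-Lipschitz, so a spectral norm of W below 1 makes the update map a contraction with a unique, globally attracting fixed point.

```python
import numpy as np

def is_contraction(W, margin=1e-9):
    """Sufficient condition for stability: largest singular value of W below 1."""
    return bool(np.linalg.norm(W, 2) < 1.0 - margin)

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
W *= 0.8 / np.linalg.norm(W, 2)          # rescale so the spectral norm is 0.8
b = rng.standard_normal(8)
print("provably stable:", is_contraction(W))

x = np.zeros(8)
for _ in range(200):                      # iterate x_{t+1} = tanh(W x_t + b)
    x = np.tanh(W @ x + b)
print("settled state (first 3 coords):", np.round(x[:3], 4))
```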
  • Item (Open Access)
    Decay and grime buildup in evolving object oriented design patterns
    (Colorado State University. Libraries, 2009) Izurieta, Clemente, author; Bieman, James M., advisor
    Software designs decay as systems, uses, and operational environments evolve. As software ages, the original realizations of design patterns may remain in place, while participants in design pattern realizations accumulate grime: non-pattern-related code. This research examines the extent to which software designs actually decay, rot, and accumulate grime by studying the aging of design patterns in successful object oriented systems. By focusing on design patterns we can identify code constructs that conflict with well-formed pattern structures. Design pattern rot is the deterioration of the structural integrity of a design pattern realization. Grime buildup in design patterns is a form of decay that does not break the structural integrity of a pattern but can reduce system testability and adaptability. Grime is measured using various types of indices developed and adapted for this research. Grime indices track the internal structural changes in a design pattern realization and the code that surrounds the realization. In general we find that the original pattern functionality remains, and pattern decay is primarily due to grime and not rot. We characterize the nature of grime buildup in design patterns, provide quantifiable evidence of such grime buildup, and find that grime can be classified at organizational, modular, and class levels. Organizational level grime refers to namespace and physical file constitution and structure. Metrics at this level help us understand if rot and grime buildup play a role in fomenting disorganization of design patterns. Measures of modular level grime can help us to understand how the coupling of classes belonging to a design pattern develops. As dependencies between design pattern components increase without regard for pattern intent, the modularity of a pattern deteriorates. Class level grime is focused on understanding how classes that participate in design patterns are modified as systems evolve. For each level we use different measurements and surrogate indicators to help analyze the consequences that grime buildup has on testability and adaptability of design patterns. Test cases put in place during the design phase and initial implementation of a project can become ineffective as the system matures. The evolution of a design due to added functionality or defect fixing increases the coupling and dependencies between classes that must be tested. We show that as systems age, the growth of grime and the appearance of anti-patterns (a form of decay) increase testing requirements. Additionally, evidence suggests that, as pattern realizations evolve, the levels of efferent and afferent coupling of the classifiers that participate in patterns increase. Increases in coupling measurements suggest dependencies to and from other software artifacts, thus reducing the adaptability and comprehensibility of the pattern. In general we find that grime buildup is most serious at the modular level. We find little evidence of class and organizational grime. Furthermore, we find that modular grime appears to have a higher impact on testability than on adaptability of design patterns. Identifying grime helps developers direct refactoring efforts early in the evolution of software, thus keeping costs in check by minimizing the effects of software aging. Long term goals of this research are to curtail the effects of decay by providing the understanding and means necessary to diminish grime buildup.
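    As a small, hedged illustration of the modular-level coupling measures mentioned above (the classes and dependency edges are invented for the example, and the dissertation's grime indices are richer), efferent coupling counts a class's outgoing dependencies and afferent coupling counts its incoming ones; a dependency added without regard for pattern intent shows up as extra coupling on a pattern participant.

```python
from collections import defaultdict

depends_on = {                      # hypothetical pattern realization
    "Client":          {"AbstractFactory"},
    "AbstractFactory": set(),
    "ConcreteFactory": {"AbstractFactory", "ProductA", "ProductB"},
    "ProductA":        set(),
    "ProductB":        {"Logger"},  # grime: a dependency unrelated to the pattern
    "Logger":          set(),
}

def coupling(graph):
    ce = {cls: len(deps) for cls, deps in graph.items()}   # efferent (outgoing)
    ca = defaultdict(int)
    for cls, deps in graph.items():
        for dep in deps:
            ca[dep] += 1                                    # afferent (incoming)
    return ce, dict(ca)

ce, ca = coupling(depends_on)
print("efferent:", ce)
print("afferent:", ca)
```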
  • Item (Open Access)
    A systematic approach to testing UML designs
    (Colorado State University. Libraries, 2007) Dinh-Trong, Trung T., author; France, Robert B., advisor; Ghosh, Sudipto, advisor
    In Model Driven Engineering (MDE) approaches, developers create and refine design models from which substantial portions of implementations are generated. During refinement, undetected faults in an abstract model can propagate into the refined models, and eventually into code. Hence, finding and removing faults in design models is essential for MDE approaches to succeed. This dissertation describes a testing approach to finding faults in design models created using the Unified Modeling Language (UML). Executable forms of UML design models are exercised using generated test inputs that provide coverage with respect to UML-based coverage criteria. The UML designs that are tested consist of class diagrams, sequence diagrams, and activity diagrams. The contributions of the dissertation include (1) a test input generation technique, (2) an approach to execute design models describing sequential behavior with test inputs in order to detect faults, and (3) a set of pilot studies that are carried out to explore the fault detection capability of our testing approach. The test input generation technique involves analyzing design models under test to produce test inputs that satisfy UML sequence diagram coverage criteria. We defined a directed graph structure, named Variable Assignment Graph (VAG), to generate test inputs. The VAG combines information from class and sequence diagrams. Paths are selected from the VAG and constraints are identified to traverse the paths. The constraints are then solved with a constraint solver. The model execution technique involves transforming each design under test into an executable form, which is exercised with the generated inputs. Failures are reported if the observed behavior differs from the expected behavior. We proposed an action language, named Java-like Action Language (JAL), that supports the UML action semantics. We developed a prototype tool, named UMLAnT, that performs test execution and animation of design models. We performed pilot studies to evaluate the fault detection effectiveness of our approach. Mutation faults and commonly occurring faults in UML models created by students in our software engineering courses were seeded in three design models. Ninety percent of the seeded faults were detected using our approach.
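    As an illustrative sketch only (the graph below is a made-up stand-in, not an actual VAG, which combines class and sequence diagram information), the path-selection step described above can be pictured as enumerating simple entry-to-exit paths in a directed graph; each selected path would then contribute constraints handed to a constraint solver to derive test inputs.

```python
def enumerate_paths(graph, node, target, path=None):
    """Enumerate simple (cycle-free) paths from node to target in a directed graph."""
    path = (path or []) + [node]
    if node == target:
        return [path]
    paths = []
    for nxt in graph.get(node, []):
        if nxt not in path:                     # keep paths simple
            paths.extend(enumerate_paths(graph, nxt, target, path))
    return paths

vag = {                                         # hypothetical stand-in graph
    "start": ["checkBalance"],
    "checkBalance": ["approve", "reject"],
    "approve": ["end"],
    "reject": ["end"],
}
for p in enumerate_paths(vag, "start", "end"):
    print(" -> ".join(p))
```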
  • Item (Open Access)
    A vector model of trust to reason about trustworthiness of entities for developing secure systems
    (Colorado State University. Libraries, 2008) Chakraborty, Sudip, author; Ray, Indrajit, advisor; Ray, Indrakshi, advisor
    Security services rely to a great extent on some notion of trust. In all security mechanisms there is an implicit notion of trustworthiness of the involved entities. Security technologies like cryptographic algorithms, digital signatures, and access control mechanisms provide confidentiality, integrity, authentication, and authorization, thereby allowing some level of 'trust' in other entities. However, these techniques provide only a restrictive (binary) notion of trust and do not suffice to express the more general concept of 'trustworthiness'. For example, a digitally signed certificate does not tell whether there is any collusion between the issuer and the bearer. In fact, without a proper model and mechanism to evaluate and manage trust, it is hard to enforce trust-based security decisions. Therefore, there is a need for a more generic model of trust. However, even today, there is no accepted formalism for specifying and reasoning with trust. Secure systems are built under the premise that concepts like "trustworthiness" or "trusted" are well understood, without agreement on what "trust" means, what constitutes trust, how to measure it, how to compare or compose two trust values, and how a computed trust value can help in making a security decision.
  • Item (Open Access)
    Assessing vulnerabilities in software systems: a quantitative approach
    (Colorado State University. Libraries, 2007) Alhazmi, Omar, author; Malaiya, Yashwant K., advisor; Ray, Indrajit, advisor
    Security and reliability are two of the most important attributes of complex software systems. It is now common to use quantitative methods for evaluating and managing reliability. Software assurance requires a similar quantitative assessment of software security; however, only limited work has been done on the quantitative aspects of security. The analogy with software reliability can help in developing similar measures for software security. However, there are significant differences that need to be identified and appropriately acknowledged. This work examines the feasibility of quantitatively characterizing major attributes of security using its analogy with reliability. In particular, we investigate whether it is possible to predict the number of vulnerabilities that can potentially be identified in a current or future release of a software system using analytical modeling techniques.
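    To give one concrete, hedged picture of such analytical modeling (the S-shaped logistic form below is a generic curve often used for cumulative vulnerability discovery, and the data are synthetic, not the dissertation's calibrated models or datasets), a discovery model can be fit to observed cumulative vulnerability counts and extrapolated to a future period.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, A, B, C):
    """Cumulative vulnerabilities over time: saturates at B as discovery slows down."""
    return B / (1.0 + C * np.exp(-A * t))

months = np.arange(0, 36)
observed = logistic(months, 0.25, 120.0, 30.0) \
    + np.random.default_rng(1).normal(0, 2, 36)          # synthetic monthly counts

params, _ = curve_fit(logistic, months, observed, p0=(0.1, 100.0, 10.0), maxfev=10000)
A, B, C = params
print(f"estimated saturation level: {B:.1f} vulnerabilities")
print(f"predicted cumulative count at month 48: {logistic(48, A, B, C):.1f}")
```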
  • Item (Unknown)
    Towards interactive analytics over voluminous spatiotemporal data using a distributed, in-memory framework
    (Colorado State University. Libraries, 2023) Mitra, Saptashwa, author; Pallickara, Sangmi Lee, advisor; Pallickara, Shrideep, committee member; Ortega, Francisco, committee member; Li, Kaigang, committee member
    The proliferation of heterogeneous data sources, driven by advancements in sensor networks, simulations, and observational devices, has reached unprecedented levels. This surge in data generation and the demand for proper storage has been met with extensive research and development in distributed storage systems, facilitating the scalable housing of these voluminous datasets while enabling analytical processes. Nonetheless, the extraction of meaningful insights from these datasets, especially in the context of low-latency/interactive analytics, poses a formidable challenge. This arises from the persistent gap between the processing capacity of distributed systems and their ever-expanding storage capabilities. Moreover, the interactive querying of these datasets is hindered by disk I/O, redundant network communications, recurrent hotspots, and transient surges of user interest over limited geospatial regions, particularly in systems that concurrently serve multiple users. In environments where interactive querying is paramount, such as visualization systems, addressing these challenges becomes imperative. This dissertation delves into the intricacies of enabling interactive analytics over large-scale spatiotemporal datasets. My research efforts are centered around the conceptualization and implementation of a scalable storage, indexing, and caching framework tailored specifically for spatiotemporal data access. The research aims to create frameworks that facilitate fast query analytics over diverse data types, including point, vector, and raster datasets. The frameworks implemented are characterized by their lightweight nature, their residence primarily in memory, and their capacity to support model-driven extraction of insights from raw data or dynamic reconstruction of compressed/partial in-memory data fragments with an acceptable level of accuracy. This approach effectively helps reduce the memory footprint of cached data objects and also mitigates the need for frequent client-server communications. Furthermore, we investigate the potential of leveraging various transfer learning techniques to improve the turn-around times of our memory-resident deep learning models, given the voluminous nature of our datasets, while maintaining good overall accuracy over their entire spatiotemporal domain. Additionally, our research explores the extraction of insights from high-dimensional datasets, such as satellite imagery, within this framework. The dissertation is also accompanied by empirical evaluations of our frameworks, as well as future directions and anticipated contributions in the domain of interactive analytics over large-scale spatiotemporal datasets, acknowledging the evolving landscape of data analytics where analytics frameworks increasingly rely on compute-intensive machine learning models.
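    A minimal, hedged sketch of one ingredient described above, an in-memory cache keyed by coarse spatiotemporal cells with least-recently-used eviction so that transient hotspots stay resident; the cell sizing, key layout, and fetch callback are illustrative assumptions, not the dissertation's framework.

```python
from collections import OrderedDict

class SpatiotemporalLRUCache:
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self._store = OrderedDict()

    @staticmethod
    def cell_key(dataset, lat, lon, timestamp, cell_deg=0.25, time_bin_s=3600):
        """Coarse spatiotemporal cell: dataset, lat/lon grid cell, and time bin."""
        return (dataset, round(lat / cell_deg), round(lon / cell_deg),
                int(timestamp) // time_bin_s)

    def get_or_fetch(self, key, fetch):
        if key in self._store:
            self._store.move_to_end(key)         # mark as recently used
            return self._store[key]
        value = fetch()                           # e.g., read or reconstruct a fragment
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)       # evict least recently used
        return value

cache = SpatiotemporalLRUCache(capacity=2)
key = SpatiotemporalLRUCache.cell_key("ndvi", 40.57, -105.08, 1_700_000_000)
print(cache.get_or_fetch(key, lambda: "fragment-bytes"))
```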
  • Item (Open Access)
    Subnetwork ensembles
    (Colorado State University. Libraries, 2023) Whitaker, Timothy J., author; Whitley, Darrell, advisor; Anderson, Charles, committee member; Krishnaswamy, Nikhil, committee member; Kirby, Michael, committee member
    Neural network ensembles have been effectively used to improve generalization by combining the predictions of multiple independently trained models. However, the growing scale and complexity of deep neural networks have led to these methods becoming prohibitively expensive and time-consuming to implement. Low-cost ensemble methods have become increasingly important as they can alleviate the need to train multiple models from scratch while retaining the generalization benefits that traditional ensemble learning methods afford. This dissertation introduces and formalizes a low-cost framework for constructing Subnetwork Ensembles, where a collection of child networks is formed by sampling, perturbing, and optimizing subnetworks from a trained parent model. We explore several distinct methodologies for generating child networks and we evaluate their efficacy through a variety of ablation studies and established benchmarks. Our findings reveal that this approach can greatly improve training efficiency, parametric utilization, and generalization performance while minimizing computational cost. Subnetwork Ensembles offer a compelling framework for exploring how we can build better systems by leveraging the unrealized potential of deep neural networks.
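    A hedged sketch of the sampling step described above (the tiny parent network, mask rate, and data are illustrative, and the actual framework also perturbs and further optimizes the children): child networks are obtained by applying random binary masks to a trained parent's weights, and their predictions are averaged.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 16)), np.zeros(16)      # stand-in "trained" parent params
W2, b2 = rng.standard_normal((16, 3)), np.zeros(3)

def forward(x, w1, b1_, w2, b2_):
    h = np.maximum(x @ w1 + b1_, 0.0)                     # ReLU hidden layer
    logits = h @ w2 + b2_
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)               # softmax probabilities

def sample_child(keep_prob=0.5):
    m1 = rng.random(W1.shape) < keep_prob                 # random subnetwork mask
    m2 = rng.random(W2.shape) < keep_prob
    return (W1 * m1, b1, W2 * m2, b2)

x = rng.standard_normal((5, 4))
children = [sample_child() for _ in range(8)]
ensemble_probs = np.mean([forward(x, *c) for c in children], axis=0)
print(ensemble_probs.round(3))
```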
  • Item (Embargo)
    Automated extraction of access control policy from natural language documents
    (Colorado State University. Libraries, 2023) Alqurashi, Saja, author; Ray, Indrakshi, advisor; Ray, Indrajit, committee member; Malaiya, Yashwant, committee member; Simske, Steve, committee member
    Data security and privacy are fundamental requirements in information systems. The first step to providing data security and privacy for organizations is defining access control policies (ACPs). Security requirements are often expressed in natural languages, and ACPs are embedded in the security requirements. However, ACPs in natural language are unstructured and ambiguous, so manually extracting ACPs from security requirements and translating them into enforceable policies is tedious, complex, expensive, labor-intensive, and error-prone. Thus, an automated ACP specification process is crucial. In this thesis, we consider the Next Generation Access Control (NGAC) model as our reference formal access control model to study the automation process. This thesis addresses the research question: How do we automatically translate access control policies (ACPs) from natural language expression to the NGAC formal specification? Answering this research question entails building an automated extraction framework. The proposed framework aims to translate natural language ACPs into NGAC specifications automatically. The primary contributions of this research are developing models to automatically construct ACPs in the NGAC specification from natural language, and generating a realistic synthetic dataset of access control policy sentences to evaluate the proposed framework. Our experimental results are promising: we achieved, on average, an F1-score of 93% when identifying ACP sentences, an F1-score of 96% when extracting NGAC relations between attributes, and F1-scores of 96% when extracting user attributes and 89% when extracting object attributes from natural language access control policies.
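    As a toy, hedged sketch of the first stage above, identifying which sentences are ACP sentences (the example sentences, labels, and the TF-IDF plus logistic regression classifier are illustrative stand-ins, not the thesis's models or its synthetic NGAC dataset):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

sentences = [
    "Doctors can read patient medical records.",
    "Nurses may update vital signs for assigned patients.",
    "The hospital was founded in 1972.",
    "Only administrators are allowed to delete audit logs.",
    "The cafeteria closes at 8 pm.",
    "Billing staff cannot access clinical notes.",
]
is_acp = [1, 1, 0, 1, 0, 1]                 # 1 = sentence states an access control policy

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(sentences, is_acp)
print(clf.predict(["Lab technicians may view test results.",
                   "Parking is available on the north side."]))
```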
  • Item (Open Access)
    Machine learning-based phishing detection using URL features: a comprehensive review
    (Colorado State University. Libraries, 2023) Asif, Asif Uz Zaman, author; Ray, Indrakshi, advisor; Shirazi, Hossein, advisor; Ray, Indrajit, committee member; Wang, Haonan, committee member
    In a social engineering attack known as phishing, a perpetrator sends a false message to a victim while posing as a trusted representative in an effort to collect private information such as login passwords and financial information for personal gain. To carry out a phishing attack, counterfeit websites, emails, and messages are used to trick the victim. Machine learning appears to be a promising technique for phishing detection. Typically, website content and Uniform Resource Locator (URL) based features are used. However, gathering website content features requires visiting malicious sites, and preparing the data is labor-intensive. Towards this end, researchers are investigating whether URL-only information can be used for phishing detection. This approach is lightweight and can be deployed at the client's end; it does not require data collection from malicious sites and can identify zero-day attacks. We conduct a systematic literature review on URL-based phishing detection. We selected papers that appeared in top cybersecurity conferences and journals and that were either recent (2018 onward) or highly cited (50+ citations in Google Scholar). This survey provides researchers and practitioners with information on the current state of research on URL-based website phishing attack detection methodologies. The results of this study show that the lack of a centralized dataset is in fact beneficial, because it prevents attackers from learning the features that classifiers employ; however, it makes the work more time-consuming for researchers. Furthermore, both machine learning and deep learning algorithms can be utilized, since both achieve very good classification accuracy; in this work, we found Random Forest and Long Short-Term Memory to be good choices. Using task-specific lexical characteristics, rather than concentrating on the number of features, is essential, because feature selection impacts how accurately algorithms detect phishing URLs.
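    A hedged sketch of the URL-only approach the review covers, lexical features extracted from the URL string feeding a Random Forest (the particular features, toy URLs, and labels below are illustrative, not a benchmark dataset from the surveyed papers):

```python
import re
from urllib.parse import urlparse

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def lexical_features(url):
    parsed = urlparse(url)
    host = parsed.netloc
    return [
        len(url),                                   # overall URL length
        url.count("."),                             # number of dots
        url.count("-"),                             # hyphens, common in phishing hosts
        url.count("@"),                             # '@' can hide the real host
        int(bool(re.match(r"^\d{1,3}(\.\d{1,3}){3}$", host))),  # raw IP as host
        int(parsed.scheme == "https"),
        len(parsed.path),
    ]

urls = ["https://www.example.com/login",
        "http://192.168.10.5/secure-update/account",
        "https://paypa1-verify-account.example-login.info/confirm",
        "https://news.example.org/articles/2023"]
labels = [0, 1, 1, 0]                               # 1 = phishing (toy labels)

X = np.array([lexical_features(u) for u in urls])
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print(model.predict([lexical_features("http://10.0.0.7/acct-verify/login")]))
```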
  • Item (Open Access)
    Neuralator 5000: exploring and enhancing the BOLD5000 fMRI dataset to improve the robustness of artificial neural networks
    (Colorado State University. Libraries, 2023) Pickard, William Augustus, author; Blanchard, Nathaniel, advisor; Anderson, Chuck, committee member; Thomas, Michael, committee member
    Artificial neural networks (ANNs) originally drew their inspiration from biological constructs. Despite the rapid development of ANNs and their seeming divergence from their biological roots, research using representational similarity analysis (RSA) shows a connection between the internal representations of artificial and biological neural networks. To further investigate this connection, human subject functional magnetic resonance imaging (fMRI) studies using stimuli drawn from common ANN training datasets are being compiled. One such dataset is the BOLD5000, which is composed of fMRI data from four subjects who were presented with stimuli selected from the ImageNet, Common Objects in Context (COCO), and Scene UNderstanding (SUN) datasets. An important area where this data can be fruitful is in improving ANN model robustness. This work seeks to enhance the BOLD5000 dataset and make it more accessible for future ANN research by re-segmenting the data from the second release of the BOLD5000 into new ROIs using the vcAtlas and visfAtlas visual cortex atlases, generating representational dissimilarity matrices (RDMs) for all ROIs, and providing a new, biologically inspired set of supercategory labels specific to the ImageNet dataset. To demonstrate the utility of these new BOLD5000 derivatives, I compare human fMRI data to RDMs derived from the activations of four prominent vision ANNs: AlexNet, ResNet-50, MobileNetV2, and EfficientNet B0. The results of this analysis show that the older, less advanced AlexNet has higher neuro-similarity than the much more recent and technically better-performing models. These results are further confirmed through the use of Fiedler vector analysis on the RDMs, which shows a reduction in the separability of the internal representations of the biologically inspired supercategories.
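    A hedged sketch of the RDM construction and comparison that underlies this kind of RSA (the random patterns below are stand-ins for real voxel responses and network activations, and correlation distance with a Spearman comparison is one common choice, not necessarily the exact variant used here):

```python
import numpy as np
from scipy.stats import spearmanr

def rdm(patterns):
    """patterns: (n_stimuli, n_features) -> (n_stimuli, n_stimuli) dissimilarities."""
    return 1.0 - np.corrcoef(patterns)

def compare_rdms(rdm_a, rdm_b):
    iu = np.triu_indices_from(rdm_a, k=1)     # off-diagonal upper triangle only
    rho, _ = spearmanr(rdm_a[iu], rdm_b[iu])
    return rho

rng = np.random.default_rng(0)
fmri_patterns = rng.standard_normal((20, 500))                      # 20 stimuli x 500 voxels
ann_activations = fmri_patterns @ rng.standard_normal((500, 128))   # loosely related layer
print(round(compare_rdms(rdm(fmri_patterns), rdm(ann_activations)), 3))
```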
  • Item (Open Access)
    Pandemic perceptions: analyzing sentiment in COVID-19 tweets
    (Colorado State University. Libraries, 2023) Bashir, Shadaab Kawnain, author; Ray, Indrakshi, advisor; Shirazi, Hossein, advisor; Wang, Haonan, committee member
    Social media, particularly Twitter, became the center of public discourse during the COVID-19 global crisis, shaping narratives and perceptions. Recognizing the critical need for a detailed examination of this digital interaction, our research dives into the mechanics of pandemic-related Twitter conversations. This study seeks to understand the many dynamics and effects at work in disseminating COVID-19 information by analyzing and comparing the response patterns displayed by tweets from influential individuals and organizational accounts. To meet the research goals, we gathered a large dataset of COVID-19-related tweets during the pandemic, which was then meticulously manually annotated. In this work, task-specific transformer and LLM models are used to provide tools for sentiment analysis of pandemic-related tweets. By leveraging RoBERTa[Twitter], a domain-specific model fine-tuned on social media data, this research improved performance on the critical task of sentiment analysis. Our investigation demonstrates that individuals express subjective feelings more frequently than organizations, while organizations disseminate more pandemic-related content overall.
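    A hedged sketch of applying a Twitter-domain RoBERTa sentiment model through the Hugging Face pipeline API; the checkpoint name below is an assumed public model, not the authors' fine-tuned one, and real use would run over the annotated COVID-19 tweet dataset described above.

```python
from transformers import pipeline

sentiment = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",  # assumed public checkpoint
)

tweets = [
    "Finally got my vaccine appointment, feeling relieved!",
    "Case counts are rising again in our county.",
]
for tweet, result in zip(tweets, sentiment(tweets)):
    print(result["label"], round(result["score"], 3), "|", tweet)
```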
  • Item (Open Access)
    ViennaRNA - optimizing a real-world RNA folding program
    (Colorado State University. Libraries, 2023) Save, Vidit V., author; Rajopadhye, Sanjay, advisor; Pallickara, Shrideep, committee member; Montgomery, Taiowa, committee member
    RNA folding is the dynamic process of intra-molecular interactions that makes a linear RNA molecule acquire a secondary structure. Predicting the acquired secondary structure is critical for gene regulation, disease characterization, and improving drug design. ViennaRNA is a highly utilized tool in the synthetic biology community to predict RNA secondary structures. This package is constantly updated to add new features and uses techniques like vectorization to boost its single-core performance. However, reviewing the package revealed that applying known HPC optimizations to the code base could significantly improve the current performance. Optimizing a program with over 10k lines of code creates several software engineering challenges. Hence, toy kernels that mimic the code's behavior were initially used to explore possible optimizations. These kernels helped save compilation time and boil down the optimization process for the multi-branch loop prediction, a part of RNAfold, to five simple steps. On applying the optimizations described in this thesis, a 2X speedup can be observed for the entire program, with a 4.2X speedup for the optimized part of the code. Analysis with Intel's Roofline toolkit shows that applying these optimizations helped achieve cache utilization close to the theoretical L1 bandwidth of the machine. As a part of this thesis, incremental patches were created to integrate optimizations without disrupting the code base while ensuring the program's correctness.
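    As a hedged, toy stand-in for the kind of dynamic-programming kernel RNAfold evaluates (ViennaRNA minimizes free energy under a full thermodynamic model; the classic Nussinov recurrence below merely maximizes base pairs), this sketch shows the O(n^3) loop structure, including the bifurcation loop, that such HPC optimizations target:

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Toy secondary-structure DP: maximum number of nested base pairs in seq."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):             # increasing subsequence length
        for i in range(0, n - span):
            j = i + span
            best = dp[i + 1][j]                       # case: i left unpaired
            if (seq[i], seq[j]) in pairs:
                best = max(best, dp[i + 1][j - 1] + 1)    # case: i pairs with j
            for k in range(i + 1, j):                 # bifurcation (the costly inner loop)
                best = max(best, dp[i][k] + dp[k + 1][j])
            dp[i][j] = best
    return dp[0][n - 1]

print(nussinov_max_pairs("GGGAAAUCC"))                # small example sequence
```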
  • Item (Open Access)
    CPS security testbed: requirement analysis, prototype design and protection framework
    (Colorado State University. Libraries, 2023) Talukder, Md Rakibul Hasan, author; Ray, Indrajit, advisor; Malaiya, Yashwant, committee member; Vijayasarathy, Leo, committee member
    Testbeds are a practical way to perform security exercises on cyber-physical systems (CPS) to understand vulnerabilities and the progression/impact of cyber-attacks. However, it is challenging to replicate a large CPS, such as a nuclear power plant or an electrical power grid, within the confines of a laboratory that would allow security experiments to be carried out. Thus, software-based simulations are becoming increasingly popular, as opposed to hardware-in-the-loop simulations, for CPS that form critical infrastructure. Unfortunately, a software-based CPS testbed oriented towards security-centric experiments requires a careful re-examination of requirements and an architectural design different from a CPS testbed for non-security related experiments. On a security-focused testbed there is a need to run real attack scripts for red-teaming/blue-teaming exercises, which are, in the strictest sense of the term, malicious in nature. Thus, there is a need to protect the testbed itself from these attack experiments, which have the potential to go awry. The overall effect of an exploit on the whole system, or vulnerabilities at communication channels, needs to be particularly explored while building a simulator for a security-centric CPS. In addition, when multiple experiments are conducted on the same testbed, there is a need to maintain isolation among these experiments so that no experiment can accidentally or maliciously compromise others and affect the fidelity of those results. Specific security experiment-related supports are essential when designing such a testbed, but integrating a software-based simulator within the testbed to provide the necessary experiment support is challenging. In this thesis, we make three contributions. First, we present the design of an ideal testbed based on a set of requirements and supports that we have identified, focusing specifically on security experiments as the primary use case. Next, following this requirements analysis, we integrate a software-based simulator (Generic Pressurized Water Reactor) into a testbed design by modifying the implementation architecture to allow the execution of attack experiments on different networking architectures and protocols. Finally, we describe a novel security architecture and framework to ensure the protection of security-related experiments on a CPS testbed.
  • Item (Open Access)
    Application of the neural data transformer to non-autonomous dynamical systems
    (Colorado State University. Libraries, 2023) Mifsud, Domenick M., author; Ortega, Francisco R., advisor; Anderson, Charles, advisor; Thomas, Michael, committee member; Barreto, Armando, committee member
    The Neural Data Transformer (NDT) is a novel non-recurrent neural network designed to model neural population activity, offering faster inference times and the potential to advance real-time applications in neuroscience. In this study, we expand the applicability of the NDT to non-autonomous dynamical systems by investigating its performance on modeling data from the Chaotic Recurrent Neural Network (RNN) with delta pulse inputs. Through adjustments to the NDT architecture, we demonstrate its capability to accurately capture non-autonomous neural population dynamics, making it suitable for a broader range of Brain-Computer Interface (BCI) control applications. Additionally, we introduce a modification to the model that enables the extraction of interpretable inferred inputs, further enhancing the utility of the NDT as a powerful and versatile tool for real-time BCI applications.
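    A hedged sketch of the kind of non-autonomous synthetic system described above, a randomly connected tanh rate network driven into the chaotic regime and perturbed by externally timed delta-pulse inputs (network size, gain, pulse schedule, and integration step are illustrative choices, not the study's exact benchmark):

```python
import numpy as np

rng = np.random.default_rng(0)
n, g, dt, steps = 100, 1.5, 0.05, 2000
W = g * rng.standard_normal((n, n)) / np.sqrt(n)     # gain g > 1: chaotic regime
pulse_times = {400, 900, 1400}                        # delta-pulse input schedule
pulse_vector = 5.0 * rng.standard_normal(n)

x = 0.1 * rng.standard_normal(n)
trajectory = np.empty((steps, n))
for t in range(steps):
    u = pulse_vector if t in pulse_times else 0.0     # non-autonomous input term
    x = x + dt * (-x + np.tanh(W @ x) + u)            # leaky firing-rate dynamics
    trajectory[t] = x

print("state norm before/after a pulse:",
      round(float(np.linalg.norm(trajectory[399])), 2),
      round(float(np.linalg.norm(trajectory[401])), 2))
```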
  • Item (Open Access)
    Intentional microgesture recognition for extended human-computer interaction
    (Colorado State University. Libraries, 2023) Kandoi, Chirag, author; Blanchard, Nathaniel, advisor; Krishnaswamy, Nikhil, advisor; Soto, Hortensia, committee member
    As extended reality becomes more ubiquitous, people will more frequently interact with computer systems using gestures instead of peripheral devices. However, previous works have shown that using traditional gestures (pointing, swiping, etc.) in mid-air causes fatigue, rendering them largely unsuitable for long-term use. Some of the same researchers have promoted "microgestures" (smaller gestures requiring less gross motion) as a solution, but to date there is no dataset of intentional microgestures available to train computer vision algorithms for use in downstream interactions with computer systems such as agents deployed on XR headsets. As a step toward addressing this challenge, I present a novel video dataset of microgestures, classification results from a variety of ML models showcasing the feasibility (and difficulty) of detecting these fine-grained movements, and a discussion of the challenges in developing robust recognition of microgestures for human-computer interaction.
  • Item (Embargo)
    Collaborating with artists to design additional multimodal and unimodal interaction techniques for three-dimensional drawing in virtual reality
    (Colorado State University. Libraries, 2023) Sullivan, Brian T., author; Ortega, Francisco, advisor; Ghosh, Sudipto, committee member; Tornatzky, Cyane, committee member; Barrera Machuca, Mayra, committee member; Batmaz, Anil Ufuk, committee member
    Although drawing is an old and common mode of human creativity and expression, virtual reality (VR) has presented an opportunity for a novel form of drawing. Instead of representing three-dimensional objects with marks on a two-dimensional surface, VR permits people to create three-dimensional (3D) drawings in midair. It remains unknown, however, what would constitute an optimal interface for 3D drawing in VR. This thesis helps to answer this question by describing a co-design study conducted with artists to identify desired multimodal and unimodal interaction techniques to incorporate into user interfaces for 3D VR drawing. Numerous modalities and interaction techniques were proposed in this study, which can inform future research into interaction techniques for this developing medium.
  • Item (Embargo)
    Machine learning and deep learning applications in neuroimaging for brain age prediction
    (Colorado State University. Libraries, 2023) Vafaei, Fereydoon, author; Anderson, Charles, advisor; Kirby, Michael, committee member; Blanchard, Nathaniel, committee member; Burzynska, Agnieszka, committee member
    Machine Learning (ML) and Deep Learning (DL) are now considered state-of-the-art assistive AI technologies that help neuroscientists, neurologists, and medical professionals with early diagnosis of neurodegenerative diseases and cognitive decline as a consequence of unhealthy brain aging. Brain Age Prediction (BAP) is the process of estimating a person's biological age using Neuroimaging data, and the difference between the predicted age and the subject's chronological age, known as Delta, is regarded as a biomarker for healthy versus unhealthy brain aging. Accurate and efficient BAP is an important research topic, and hence ML/DL methods have been developed for this task. There are different modalities of Neuroimaging, such as Magnetic Resonance Imaging (MRI), that have been used for BAP in the past. Diffusion Tensor Imaging (DTI) is an advanced quantitative Neuroimaging technology that gives insight into the microstructure of White Matter tracts that connect different parts of the brain, enabling it to function properly. DTI data is high-dimensional, and age-related microstructural changes in White Matter include non-linear patterns. In this study, we perform a series of analytical experiments using ML and DL methods to investigate the applicability of DTI data for BAP. We also investigate which Diffusivity Parameters, the DTI metrics that reflect the direction and magnitude of diffusion of water molecules in the brain, are relevant for BAP as a Supervised Learning task. Moreover, we propose, implement, and analyze a novel methodology that can detect age-related anomalies (high Deltas) and can overcome some of the major and fundamental limitations of the current supervised approach for BAP, such as "Chronological Age Label Inconsistency". Our proposed methodology, which combines Unsupervised Anomaly Detection (UAD) and supervised BAP, focuses on addressing a fundamental challenge in BAP: how to interpret a model's error. Should a researcher interpret a model's error as an indication of unhealthy brain aging, or as poor model performance that should be eliminated? We argue that the underlying cause of this problem is the inconsistency of chronological age labels as the ground truth of the Supervised Learning task, which is the common basis of training ML/DL models. Our Unsupervised Learning methods and findings open a new possibility to detect irregularities and abnormalities in the aging brain using DTI scans, independent of inconsistent chronological age labels. The results of our proposed methodology show that combining label-independent UAD and supervised BAP provides a more reliable and methodical way for error analysis than the current supervised BAP approach when it is used in isolation. We also provide visualizations and explanations of how our ML/DL methods make their decisions for BAP. Explainability and generalization of our ML/DL models are two important aspects of our study.
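    A hedged sketch of the supervised BAP step and the Delta biomarker defined above (the features are synthetic stand-ins for tract-wise DTI metrics, and the regressor is a generic choice, not the study's models): a regressor maps white-matter features to chronological age, and Delta is the predicted age minus the chronological age.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_subjects, n_features = 400, 30                      # e.g., tract-wise diffusivity metrics
age = rng.uniform(20, 80, n_subjects)
features = np.outer(age, rng.uniform(-0.02, 0.02, n_features))
features += rng.normal(0, 0.2, (n_subjects, n_features))   # noisy age-related signal

X_train, X_test, y_train, y_test = train_test_split(features, age, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

delta = model.predict(X_test) - y_test                # brain-age Delta per subject
print("mean absolute error (years):", round(float(np.mean(np.abs(delta))), 2))
print("largest positive Delta (years):", round(float(delta.max()), 2))
```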