Browsing by Author "Beveridge, Ross, committee member"
Now showing 1 - 20 of 20
Item Open Access: A streamlined bridge inspection framework utilizing unmanned aerial vehicles (UAVs) (Colorado State University. Libraries, 2019)
Perry, Brandon J., author; Guo, Yanlin, advisor; Atadero, Rebecca, committee member; van de Lindt, John W., committee member; Beveridge, Ross, committee member

The lack of quantitative measures and location information for instances of damage results in human-based bridge inspections that are variable and subjective in nature. Because bridge owners and managers must make major maintenance/repair decisions with inadequate funding and resources, it is appealing to develop a transparent bridge inspection and evaluation system that integrates field inspection and documentation of damage with quantitative measures and geo-referenced locations in a holistic process. A new, streamlined bridge inspection framework based on unmanned aerial vehicles (UAVs) is proposed to improve the efficiency, cost-effectiveness, and objectivity of these inspections while enhancing the safety of inspectors. Since current bridge inspection practices use a component-based structural rating system, the new UAV-based bridge inspection system should also follow a component-wise damage evaluation system to enable the seamless adoption of this new technology into practice. To provide bridge managers/owners with streamlined decision-making support, this new system uniquely integrates UAV-based field inspection, automated damage/defect identification, and the establishment of an element-wise As-Built Building Information Model (AB-BIM) for damage documentation in a holistic manner. In this framework, a UAV platform carrying visual sensors first collects data for identifying defects (e.g., cracks, spalling, and scaling of concrete). Next, an automated damage detection algorithm is developed to quickly extract quantitative damage information (i.e., type, size, amount, and location) from the data. Using UAV-enabled photogrammetry and unsupervised machine learning techniques, the system can automatically segment the bridge elements (e.g., beams, girders, deck) from a 3D point cloud with minimal user input. Finally, the damage information is mapped to the corresponding structural components of the bridge and readily visualized in the AB-BIM. The documented element-wise damage information with quantitative measures, in conjunction with the 3D visualization function of the proposed system, can provide bridge managers with a transparent condition evaluation and one-stop decision-making support that can greatly ease the planning of repair/maintenance. The feasibility of the approach is demonstrated in a case study of a Colorado bridge.

Item Open Access: APE-V: athlete performance evaluation using video (Colorado State University. Libraries, 2021)
Roygaga, Chaitanya, author; Blanchard, Nathaniel, advisor; Beveridge, Ross, committee member; Reiser, Raoul, committee member

Athletes typically undergo regular evaluations by trainers and coaches to assess performance and injury risk. One of the most popular movements to examine is the vertical jump, a sport-independent means of assessing both lower-extremity risk and power. Specifically, maximal-effort countermovement and drop jumps performed on bilateral force plates provide a wealth of metrics; however, detailed evaluation of this movement requires specialized equipment (force plates) and trained experts to interpret the results, limiting its use. Computer vision techniques applied to videos of such movements are a less expensive alternative for extracting these metrics. Blanchard et al. collected a dataset of 89 athletes performing these movements and showcased how OpenPose could be applied to the data. However, athlete error calls 46.2% of the movements into question; in these cases, an expert assessor would have the athlete redo the movement to eliminate the error. Here, I augmented Blanchard et al. with expert labels of error and established benchmark performance on automatic error identification. In total, 14 different types of errors were identified by trained annotators. My benchmark models identified errors with an F1 score of 0.710 and a Kappa of 0.457 (Kappa measures accuracy over chance).
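For readers unfamiliar with the statistic, Kappa here is Cohen's kappa, which discounts the agreement expected by chance. A minimal sketch of the computation (illustrative only, with hypothetical labels; this is not code from the thesis):

    # Cohen's kappa: accuracy over chance, kappa = (p_o - p_e) / (1 - p_e).
    from collections import Counter

    def cohens_kappa(y_true, y_pred):
        n = len(y_true)
        p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n  # observed agreement
        true_counts, pred_counts = Counter(y_true), Counter(y_pred)
        # agreement expected by chance, from the label marginals:
        p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
        return (p_o - p_e) / (1 - p_e)

    # Hypothetical per-movement error labels for four jump recordings:
    print(cohens_kappa(["ok", "error", "ok", "error"],
                       ["ok", "error", "error", "error"]))  # -> 0.5

A kappa of 0.457 therefore indicates moderate agreement beyond what guessing from the label frequencies alone would produce.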
Item Open Access: Applications of topological data analysis to natural language processing and computer vision (Colorado State University. Libraries, 2022)
Garcia, Jason S., author; Krishnaswamy, Nikhil, advisor; Adams, Henry, committee member; Beveridge, Ross, committee member

Topological Data Analysis (TDA) uses ideas from topology to study the "shape" of data. It provides a set of tools to extract features, such as holes, voids, and connected components, from complex high-dimensional data. This thesis presents an introductory exposition of the mathematics underlying the two main tools of TDA: Persistent Homology and the MAPPER algorithm. Persistent Homology detects topological features that persist over a range of resolutions, capturing both local and global geometric information. The MAPPER algorithm is a visualization tool that provides a type of dimensionality reduction that preserves topological properties of the data by projecting the data onto lower-dimensional simplicial complexes. Furthermore, this thesis explores recent applications of these tools to natural language processing and computer vision. These applications fall into two main approaches. In the first, TDA is used to extract features from data that are then used as input for a variety of machine learning tasks, like image classification or visualizing the semantic structure of text documents. The second approach applies the tools of TDA to the machine learning algorithms themselves, for example, using MAPPER to study how structure emerges in the weights of a trained neural network. Finally, the results of several experiments are presented. These include using Persistent Homology for image classification and using MAPPER to visualize the global structure of these datasets. Most notably, the MAPPER algorithm is used to visualize vector representations of contextualized word embeddings as they move through the encoding layers of the BERT-base transformer model.
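To make the Persistent Homology tool concrete, here is a minimal sketch using the open-source ripser package (an assumed tool choice; the thesis does not prescribe it here). A noisy circle has one loop, which shows up as a single long-lived feature in the H1 persistence diagram:

    # Persistent homology of a noisy circle; the circle's loop survives as a
    # long-lived H1 (1-dimensional hole) feature. Illustrative sketch only.
    import numpy as np
    from ripser import ripser

    theta = np.random.uniform(0, 2 * np.pi, 100)
    points = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * np.random.randn(100, 2)

    diagrams = ripser(points, maxdim=1)["dgms"]  # [H0 diagram, H1 diagram]
    h1 = diagrams[1]                             # rows are (birth, death) pairs
    print("most persistent loop:", h1[np.argmax(h1[:, 1] - h1[:, 0])])

Features whose death minus birth is large persist across many resolutions and are treated as signal; short-lived features are usually noise.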
Item Open Access: Assessing usability of full-body immersion in an interactive virtual reality environment (Colorado State University. Libraries, 2020)
Raikwar, Aditya R., author; Ortega, Francisco R., advisor; Beveridge, Ross, committee member; Stephens, Jaclyn, committee member; Smith, Charles, committee member

Improving immersion and playability has a direct impact on the effectiveness of certain virtual reality applications. This project looks at how to develop an immersive soccer application intended to measure skills, particularly for assessment and health promotion, and shows the requirements for creating a top-down immersive experience with commodity devices. The system simulates a soccer training environment in which users evade opponents, pass to teammates, and score goals, with the objective of measuring the difficulty of single, double, and triple tasks. Performance is expected to decline as the number of concurrent tasks increases. This hypothesis is highly relevant because such a system could serve as a return-to-play assessment tool for people with concussions (with a physician's approval) or promote exercise among non-athletes. This thesis provides all the steps necessary to explain the high-level details of highly immersive applications while laying out a path for future human-subject experiments.

Item Open Access: Automated deep learning architecture design using differentiable architecture search (DARTS) (Colorado State University. Libraries, 2019)
Sharma, Kartikay, author; Anderson, Chuck, advisor; Beveridge, Ross, committee member; Kirby, Michael, committee member

Creating neural networks by hand is a slow, trial-and-error process. Designing new architectures similar to GoogLeNet or FractalNets, which use repeated tree-based structures, is highly likely to be inefficient and sub-optimal because of the large number of possibilities for composing such structures. Recently, neural architecture search algorithms have automated the process of architecture design and have often attained state-of-the-art performance on the CIFAR-10, ImageNet, and Penn Treebank datasets. Even though the search time has been reduced from tens of thousands of GPU hours to tens of GPU hours, most search algorithms rely on additional controllers and hypernetworks to generate architecture encodings or predict weights for sampled architectures. These controllers and hypernetworks may themselves require restructuring when deployed on a new task or a new dataset, and since that is done by hand, the problem of architecture search is not really solved. Differentiable Architecture Search (DARTS) avoids this problem by using gradient descent methods. In this work, the DARTS algorithm is studied under various conditions and search hyperparameters. DARTS is applied to CIFAR-10 to check the reproducibility of the original results. It is also tested in a new setting, the CheXpert dataset, to discover new architectures, and the results are compared to a baseline DenseNet121 model. The architectures found by DARTS achieve better performance on the validation set than the baseline model.
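The key idea that makes DARTS differentiable is a continuous relaxation: each edge of the network computes a softmax-weighted mixture of candidate operations, so architecture weights can be trained by gradient descent alongside network weights. A minimal PyTorch sketch of that mixed operation (the candidate operations here are simplified stand-ins, not the thesis's search space):

    # DARTS-style mixed operation: a softmax over architecture parameters
    # (alpha) blends candidate ops, making the architecture choice differentiable.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MixedOp(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.ops = nn.ModuleList([
                nn.Conv2d(channels, channels, 3, padding=1),  # 3x3 conv
                nn.Conv2d(channels, channels, 5, padding=2),  # 5x5 conv
                nn.MaxPool2d(3, stride=1, padding=1),         # pooling
                nn.Identity(),                                # skip connection
            ])
            self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

        def forward(self, x):
            weights = F.softmax(self.alpha, dim=0)
            return sum(w * op(x) for w, op in zip(weights, self.ops))

    out = MixedOp(16)(torch.randn(2, 16, 8, 8))

After the search converges, the operation with the largest alpha on each edge is kept and the rest are pruned, yielding a discrete architecture.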
Item Open Access: Automatic prediction of interest point stability (Colorado State University. Libraries, 2009)
Comer, Thomson H., author; Draper, Bruce A. (Bruce Austin), 1962-, advisor; Monnier, Patrick, committee member; Beveridge, Ross, committee member

Many computer vision applications depend on interest point detectors as a primary means of dimensionality reduction. While many experiments have measured the repeatability of selective attention algorithms [MTS+05, BL02, CJ02, MP07, SMBI98], we are not aware of any method for predicting the repeatability of an individual interest point at runtime. In this work, we attempt to predict the individual repeatability of a set of 10^6 interest points produced by Lowe's SIFT algorithm [Low03], Mikolajczyk's Harris-Affine [Mik02], and Mikolajczyk and Schmid's Hessian-Affine [MS04]. These algorithms were chosen because of their performance and popularity. Seventeen relevant attributes are recorded at each interest point, including the eigenvalues of the second moment matrix, the Hessian matrix, and the Laplacian-of-Gaussian score. A generalized linear model (GLM) is used to predict the repeatability of interest points from their attributes. The relationship between interest point attributes and repeatability proves to be weak; however, the repeatability of an individual interest point can, to some extent, be predicted from its attributes. A 4% improvement in mean interest point repeatability is achieved through two related methods: adding five new thresholding decisions, and selecting the N best interest points as predicted by a GLM over the logarithms of all 17 attributes. A similar GLM with a smaller set of author-selected attributes has comparable performance. This research finds that improving interest point repeatability remains a hard problem, with an improvement of more than 4% unlikely using current methods for interest point detection. The lack of clear relationships between interest point attributes and repeatability indicates a hole in selective attention research that may be attributable to scale-space implementation.
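A rough sketch of the prediction-and-selection step described above, with logistic regression standing in for the GLM and a hypothetical attribute matrix (illustrative only, not the thesis's code or data):

    # Fit a GLM on log-attributes and keep the N interest points predicted
    # most likely to repeat. The attribute matrix and labels are hypothetical.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    attributes = rng.lognormal(size=(1000, 17))  # stand-in for the 17 recorded attributes
    repeated = rng.integers(0, 2, size=1000)     # stand-in repeatability labels

    glm = LogisticRegression(max_iter=1000).fit(np.log(attributes), repeated)
    scores = glm.predict_proba(np.log(attributes))[:, 1]
    N = 200
    best = np.argsort(scores)[-N:]               # indices of the N best interest points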
Item Open Access: Classification using out of sample testing of neural networks and Siamese-like neural network for handwritten characters (Colorado State University. Libraries, 2020)
Yeluri, Sri Sagar Abhishek, author; Anderson, Charles W., advisor; Beveridge, Ross, committee member; Hess, Ann, committee member

In a world where machine learning algorithms for image processing are being developed at a rapid pace, a developer needs good insight into all the algorithms in order to choose one for their application. When an algorithm is published, its developers compare it with already available, well-performing algorithms and claim that it outperforms all or most of the others in terms of accuracy. However, adaptability is a very important aspect of machine learning that is usually not mentioned in such papers. Adaptability is the ability of a machine learning algorithm to work reliably in the real world, despite changes in environmental factors relative to the environment in which the training data were recorded. A machine learning algorithm that gives good results only on its own dataset has no practical application; in real life, the usefulness of an algorithm increases with its adaptability. A few other aspects that matter in choosing the right algorithm for an application are consistency, time and resource utilization, and the availability of human intervention. A person choosing among a list of algorithms for an application will be able to make a wise decision if given this additional information, as each application varies from the others and needs a different set of algorithm characteristics to be well received. We implement and compare three machine learning algorithms used in image processing on two different datasets and compare the results. We observe that certain algorithms, even though better than others in terms of accuracy on paper, fall behind when tested on real-world datasets. We put forward a few suggestions that, if followed, will simplify the selection of an algorithm for a specific purpose.

Item Open Access: Consistent hidden Markov models (Colorado State University. Libraries, 2014)
Narayana Rao Gari, Pradyumna Kumar, author; Draper, Bruce A., advisor; Beveridge, Ross, committee member; Peterson, Chris, committee member

Activity recognition in computer vision involves recognizing the appearance of an object of interest along with its action and its relation to the scene or other important objects. Many methods exist that give this information about an object; however, these methods are noisy and independent of one another, so the mutual information between the labels is lost. For example, an object might be predicted to be a tree while its action is predicted to be walk, but trees can't walk. However, the compositional structure of events is reflected by the compositional structure of natural language: the object of interest is the predicate, usually a noun; the action is the verb; and its relation to the scene may be a preposition or adverb. The lost mutual information that says trees can't walk is present in natural language. The contribution of this thesis is a method of visual information fusion that exploits the mutual information in natural language databases. Although hidden Markov models (HMMs) are the traditional way to smooth a noisy stream of data by integrating information across time, they can't account for the lost mutual information. This thesis proposes an extension to HMMs, the Consistent HMM, that reintegrates the lost mutual information by exploiting knowledge from language databases. The Consistent HMM performs better than other state-of-the-art HMMs on synthetic data generated to simulate real-world behavior. Integrating the knowledge from language databases during both training and run-time gives the best performance; considered individually, integrating it at run-time yields a larger gain than integrating it during training.
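For context, the baseline that the Consistent HMM extends smooths a noisy label stream by decoding the most likely hidden-state sequence, typically with the Viterbi algorithm. A minimal sketch of standard Viterbi decoding (small hypothetical matrices; the consistency extension itself is not shown):

    # Viterbi decoding for a plain HMM: find the most likely hidden state path
    # given start, transition, and emission probabilities. Illustrative only.
    import numpy as np

    def viterbi(obs, start, trans, emit):
        logp = np.log(start) + np.log(emit[:, obs[0]])
        back = []
        for o in obs[1:]:
            step = logp[:, None] + np.log(trans) + np.log(emit[:, o])[None, :]
            back.append(step.argmax(axis=0))  # best predecessor for each state
            logp = step.max(axis=0)
        path = [int(logp.argmax())]
        for b in reversed(back):
            path.append(int(b[path[-1]]))
        return path[::-1]

    # Two hidden states, three observation symbols (hypothetical numbers):
    print(viterbi([0, 2, 1],
                  np.array([0.6, 0.4]),
                  np.array([[0.7, 0.3], [0.4, 0.6]]),
                  np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])))

The Consistent HMM augments this temporal smoothing with cross-label constraints, for example penalizing object-action pairs that language data says never co-occur.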
Item Open Access: Deep learning for bioinformatics sequences: RNA basecalling and protein interactions (Colorado State University. Libraries, 2024)
Neumann, Don, author; Ben-Hur, Asa, advisor; Beveridge, Ross, committee member; Blanchard, Nathaniel, committee member; Reddy, Anireddy, committee member

In the interdisciplinary field of bioinformatics, sequence data for biological problems comes in many different forms, ranging from proteins, to RNA, to the ionic current of a strand of nucleotides read by an Oxford Nanopore Technologies sequencing device. This data can be used to elucidate the fundamentals of biological processes on many levels, which can help humanity with everything from drug design to curing disease. All of our research focuses on biological problems encoded as sequences. The main focus of our research involves Oxford Nanopore Technologies sequencing devices, which are capable of directly sequencing long-read RNA strands as-is. We first concentrate on improving basecalling accuracy for RNA, and have published a paper with a novel architecture achieving state-of-the-art performance. The basecalling architecture uses convolutional blocks, each with progressively larger kernel sizes, which improves accuracy on the noisy signal data. We then describe ongoing research into the detection of post-transcriptional RNA modifications in nanopore sequencing data. Building on our basecalling research, we are able to discern modifications with read-level resolution. Our work will facilitate research into the detection of N6-methyladenosine (m6A) while also furthering progress in the detection of other post-transcriptional modifications. Finally, we recount our recently accepted paper on protein-protein and host-pathogen interaction prediction. We performed experiments demonstrating the faulty experimental designs for interaction prediction that have plagued the field, giving the false impression that the problem has been solved. We then provide reasoning and recommendations for future work.
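A minimal sketch of the kind of convolutional trunk described above: 1-D convolutional blocks over the raw current signal, with kernel sizes that grow block by block to widen temporal context. Channel counts and kernel sizes here are illustrative assumptions, not the published architecture:

    # Stacked Conv1d blocks with progressively larger kernels for a 1-D
    # nanopore current signal. Sizes are illustrative, not the thesis's.
    import torch
    import torch.nn as nn

    def block(c_in, c_out, k):
        return nn.Sequential(
            nn.Conv1d(c_in, c_out, kernel_size=k, padding=k // 2),
            nn.BatchNorm1d(c_out),
            nn.SiLU(),
        )

    trunk = nn.Sequential(
        block(1, 64, 5),      # small receptive field for local detail
        block(64, 128, 9),
        block(128, 256, 19),
        block(256, 256, 39),  # larger kernels average over the noisy signal
    )

    features = trunk(torch.randn(1, 1, 4000))  # (batch, channels, samples)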
Item Open Access: EgoRoom: egocentric 3D pose estimation through multi-coordinates heatmaps (Colorado State University. Libraries, 2022)
Jung, Changsoo, author; Blanchard, Nathaniel, advisor; Beveridge, Ross, committee member; Clegg, Benjamin, committee member

Recent head-mounted virtual reality (VR) devices include fisheye lenses oriented toward users' bodies, which enable full-body pose estimation from video. However, traditional joint detection methods fail in this setting because fisheye lenses make joint depth information ambiguous, causing body parts to be self-occluded by the distorted torso. To resolve these problems, we propose a novel architecture, EgoRoom, that uses three different types of 3D heatmaps to predict body joints even when they are self-occluded. Our approach consists of three main modules. The first module transmutes the fisheye image into feature embeddings via an attention mechanism. The second module utilizes three decoder branches to convert those features into a 3D coordinate system, with the branches corresponding to the xy, yz, and xz planes. Finally, the third module combines the three decoder heatmaps into the predicted 3D pose. Our method achieves state-of-the-art results on the xR-EgoPose dataset.
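As a rough illustration of how three planar heatmaps can determine a 3D joint, the sketch below takes a planar argmax per joint and averages the two estimates available for each axis. This fusion rule and all tensor shapes are assumptions for illustration; EgoRoom's actual third module is learned, not a fixed argmax:

    # Fuse xy, yz, and xz heatmaps into 3D joints; each axis is observed in
    # two planes, so the two readings are averaged. Hypothetical sketch only.
    import torch

    def argmax_2d(hm):  # hm: (joints, H, W) -> (joints, [row, col])
        j, h, w = hm.shape
        flat = hm.view(j, -1).argmax(dim=1)
        return torch.stack((flat // w, flat % w), dim=1).float()

    def fuse_planes(hm_xy, hm_yz, hm_xz):  # rows/cols: xy=(y,x), yz=(z,y), xz=(z,x)
        xy, yz, xz = argmax_2d(hm_xy), argmax_2d(hm_yz), argmax_2d(hm_xz)
        x = (xy[:, 1] + xz[:, 1]) / 2
        y = (xy[:, 0] + yz[:, 1]) / 2
        z = (yz[:, 0] + xz[:, 0]) / 2
        return torch.stack((x, y, z), dim=1)  # (joints, 3)

    pose = fuse_planes(*(torch.rand(15, 64, 64) for _ in range(3)))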
Item Open Access: Evaluating cluster quality for visual data (Colorado State University. Libraries, 2013)
Wigness, Maggie, author; Draper, Bruce, advisor; Beveridge, Ross, committee member; Howe, Adele, committee member; Peterson, Chris, committee member

Digital video cameras have made it easy to collect large amounts of unlabeled data that can be used to learn to recognize objects and actions. Collecting ground-truth labels for this data, however, is a much more time-consuming task that requires human intervention. One approach to training on this data, while keeping the human workload to a minimum, is to cluster the unlabeled samples, evaluate the quality of the clusters, and then ask a human annotator to label only the clusters believed to be dominated by a single object/action class. This thesis addresses the task of evaluating the quality of unlabeled image clusters. We compare four cluster quality measures (and a baseline method) using real-world and synthetic datasets. Three of these measures can be found in the existing data mining literature: the Dunn Index, the Davies-Bouldin Index, and Silhouette Width. As the fourth, we introduce a novel cluster quality measure, Proximity Forest Connectivity (PFC), derived from recent advances in approximate nearest-neighbor algorithms in the computer vision literature. Experiments on real-world data show that no cluster quality measure performs "best" on all datasets; however, our novel PFC measure is always competitive and achieves more top performances than any of the other measures. Results from synthetic data experiments show that while the data mining measures are susceptible to the over-clustering typically required of visual data, PFC is much more robust. Further synthetic data experiments modeling features of visual data show that Davies-Bouldin is most robust to large amounts of class-specific noise. However, Davies-Bouldin, Silhouette, and PFC all perform well in the presence of small amounts of class-specific noise, whereas Dunn struggles to perform better than random.
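For reference, two of the three data-mining baselines above are available off the shelf; a short sketch on hypothetical features (PFC is the thesis's own contribution and has no library implementation):

    # Baseline cluster-quality measures on stand-in feature vectors.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import davies_bouldin_score, silhouette_score

    X = np.random.default_rng(0).normal(size=(300, 8))  # hypothetical image features
    labels = KMeans(n_clusters=10, n_init=10).fit_predict(X)

    print("Silhouette Width:", silhouette_score(X, labels))          # higher is better
    print("Davies-Bouldin Index:", davies_bouldin_score(X, labels))  # lower is better
    # The Dunn Index (no sklearn implementation) is the minimum inter-cluster
    # distance divided by the maximum intra-cluster diameter; higher is better.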
Item Open Access: Hierarchical cluster guided labeling: efficient label collection for visual classification (Colorado State University. Libraries, 2015)
Wigness, Maggie, author; Draper, Bruce, advisor; Beveridge, Ross, committee member; Howe, Adele, committee member; Peterson, Chris, committee member

Visual classification is a core component in many visually intelligent systems. For example, recognition of objects and terrains provides perception during path planning and navigation tasks performed by autonomous agents. Supervised visual classifiers are typically trained with large sets of images to yield high classification performance. Although the collection of raw training data is easy, the human effort required to assign labels to this data is time-consuming. This is particularly problematic in real-world applications with limited labeling time and resources. Techniques have emerged that are designed to help alleviate the labeling workload, but they suffer from several shortcomings. First, they do not generalize well to domains with limited a priori knowledge. Second, efficiency is achieved at the cost of collecting significant label noise, which inhibits classifier learning or requires additional effort to remove. Finally, they introduce high latency between labeling queries, restricting real-world feasibility. This thesis addresses these shortcomings with unsupervised learning that exploits the hierarchical nature of feature patterns and semantic labels in visual data. Our hierarchical cluster guided labeling (HCGL) framework introduces a novel evaluation of hierarchical groupings to identify the most interesting changes in feature patterns. These changes help localize group selection in the hierarchy to discover and label a spectrum of the visual semantics found in the data. We show that employing majority group-based labeling after selection allows HCGL to balance efficiency and label accuracy, yielding higher-performing classifiers than other techniques with respect to labeling effort. Finally, we demonstrate the real-world feasibility of our labeling framework by quickly training high-performing visual classifiers that aid in successful mobile robot path planning and navigation.

Item Open Access: Image feature associations via local semantic structure (Colorado State University. Libraries, 2010)
Parrish, Nicholas James, author; Draper, Bruce A., advisor; Beveridge, Ross, committee member; Troup, Lucy, committee member

Research in the field of object recognition suffers from two distinct weaknesses that limit its effectiveness in natural environments. The first is that this research tends to rely on labeled training images, or other forms of supervision, to learn object models and recognize those models in novel images, thus preventing the learning of objects that are not labeled by humans. The second is that such systems tend to assume that the goal is to recognize a single, dominant foreground object. This research implements a different method of object recognition that learns, without supervision, which object(s) are in natural scenes. This approach uses the semantic co-occurrence information of local image features to form object models from groups of image features, which shall be called percepts. These percepts are then used to recognize objects in novel images. It is shown that this approach is capable of learning object categories without supervision and of recognizing them in complex, multi-object scenes. It is also shown that this approach outperforms a nearest-neighbor scene recognition approach.

Item Open Access: Machine learning for computer aided programming: from stochastic program repair to verifiable program equivalence (Colorado State University. Libraries, 2022)
Kommrusch, Steve, author; Pouchet, Louis-Noël, advisor; Anderson, Charles, advisor; Beveridge, Ross, committee member; Azimi-Sadjadi, Mahmood, committee member

Computer programming has benefited from a virtuous cycle of innovation, as improvements in computer hardware and software make higher levels of program abstraction and complexity possible. Recent advances in the field of machine learning, including neural network models for translating and answering questions about human language, can also be applied to computer programming itself. This thesis aims to make progress on the problem of using machine learning to improve the quality and robustness of computer programs, contributing new techniques for representing programming problems, applying neural network models to code, and training procedures that create systems useful for computer aided programming. We first present background and preliminary studies of machine learning concepts. We then present a system that directly produces source code for automatic program repair, advancing the state of the art by using a learned copy mechanism during generation. We extend a similar system to tune its learning for security vulnerability repair. We then develop a system for program equivalence that generates deterministically checkable output for equivalent programs. For this work, we detail our contribution to the popular OpenNMT-py GitHub project, which is used broadly for neural machine translation. Finally, we show how the deterministically checkable output can provide self-supervised sample selection, which improves the performance and generalizability of the system. We develop breadth metrics to demonstrate that the range of problems addressed is representative of the problem space, while demonstrating that our deep neural networks generate proposed solutions that can be verified in linear time. Ultimately, our work provides promising results in multiple areas of computer aided programming that allow human developers to produce quality software more effectively.
Item Open Access: Motion segmentation for feature association (Colorado State University. Libraries, 2010)
Pace, Weston Clement, author; Draper, Bruce, advisor; Beveridge, Ross, committee member; Hayne, Stephen, committee member

In a feature-based system, physical objects are represented as spatial groups of features. Systems that hope to operate on objects must make associations between features that belong to the same physical object. This paper segments interest points in individual frames of an image sequence using motion models based on image transformations. Experiments evaluate the associations made by these segments against ground-truth data. We give an improved version of the existing algorithm, which can lead to easier threshold selection in some systems, although the ideal threshold is shown to depend on the goal of the segmentation. Lastly, we show that the underlying motion of the object is not the only factor determining the performance of the segmentation.

Item Open Access: One-shot learning with pretrained convolutional neural network (Colorado State University. Libraries, 2019)
Yu, Zhixian, author; Draper, Bruce, advisor; Beveridge, Ross, committee member; Peterson, Chris, committee member

Recent progress in convolutional neural networks and deep learning has revolutionized the image classification field, and computers can now classify images with very high accuracy. However, unlike the human vision system, which efficiently recognizes a new object after seeing a similar one, recognizing new classes of images requires a time- and resource-consuming process of retraining a neural network due to several restrictions. Since a pretrained neural network has seen a large amount of training data, it may generalize to recognize new classes effectively and efficiently, given that it can extract patterns from training images. This inspires research in one-shot learning: learning to classify a novel class from a single training image of that class. One-shot learning can help expand the use of a trained convolutional neural network without costly model retraining. Beyond its practical application, it is also important to understand how a convolutional neural network supports one-shot learning; more specifically, how is the feature space structured to support it? This can potentially help us better understand the mechanisms of convolutional neural networks. This thesis proposes an approximate nearest-neighbor-based method for one-shot learning. The method uses the features produced by a pretrained convolutional neural network and builds a proximity forest to classify new classes. The algorithm is tested on two datasets of different scales and achieves reasonably high classification accuracy on both. Furthermore, this thesis probes the feature space to explain the success of the proposed method, using a novel tool, generalized curvature analysis. The results show that the feature space curves around samples of both known classes and unknown in-domain classes, but not around transition samples between classes or out-of-domain samples. In addition, the low curvature around out-of-domain samples is correlated with the inability of a pretrained convolutional neural network to classify out-of-domain classes, indicating that a pretrained model cannot generate useful feature representations for out-of-domain samples. In summary, this thesis proposes a new method for one-shot learning and provides insight into the feature space of convolutional neural networks.
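A minimal sketch of the general recipe (pretrained CNN features plus a nearest-neighbor lookup over one exemplar per novel class). A brute-force cosine-similarity search stands in for the thesis's proximity forest, and the ResNet-18 backbone is an assumed choice:

    # One-shot classification: embed images with a pretrained CNN, then match
    # each query to the nearest single-exemplar class. Illustrative sketch.
    import torch
    import torchvision.models as models

    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()  # keep the 512-d penultimate features
    backbone.eval()

    @torch.no_grad()
    def embed(images):  # images: (n, 3, 224, 224), normalized
        return torch.nn.functional.normalize(backbone(images), dim=1)

    support = embed(torch.randn(5, 3, 224, 224))  # 1 exemplar for each of 5 novel classes
    queries = embed(torch.randn(8, 3, 224, 224))
    predicted = (queries @ support.T).argmax(dim=1)  # nearest class by cosine similarity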
Item Open Access: Socialeyes: developing a useful interface for the visually disabled (Colorado State University. Libraries, 2011)
Nagoshi, Erin, author; Draper, Bruce, advisor; Beveridge, Ross, committee member; Troup, Lucy, committee member

While many tools exist to help the visually disabled navigate, very few are designed for social situations. Recent advancements in the field of facial recognition offer the opportunity to change that. This thesis begins a study of the human-computer interaction challenges of developing usable interfaces for visual social aids.

Item Open Access: The impact of referent display on interaction proposals during multimodal elicitation studies (Colorado State University. Libraries, 2021)
Williams, Adam S., author; Ortega, Francisco R., advisor; Beveridge, Ross, committee member; Sharp, Julia, committee member

Elicitation studies have become a popular method of participatory design. While traditionally used for finding unimodal gesture-based inputs, elicitation has increasingly been used to derive multimodal interaction techniques. This is concerning, as no work has examined how well elicitation methods transfer from unimodal gesture use to multimodal combinations of inputs. This work details a comparison between two elicitation studies that were similar in design apart from the way participants were prompted for interaction proposals. Referents (i.e., commands to be executed) were shown as either text or animations. Interaction proposals were elicited for speech, gesture, and gesture+speech input modalities. Based on the comparison of these studies and other existing elicitation studies, the concern that referent displays prime users' proposed interaction techniques is brought to light. The results of the two studies did not reproduce each other. Gesture proposals were the least impacted, with high similarity in the overall proposal space. Speech proposals were biased toward imitating the displayed text an average of 69.36% of the time. The time between gesture and speech initiation in multimodal use was 166.51% longer when prompted with text. The second contribution of this work is a consensus set of mid-air gesture inputs for generic object manipulations in augmented reality environments. This consensus set was derived from the elicitation study that used text-based referent displays, which were found to bias participant gesture production less than the animated referents.

Item Open Access: The mixing genetic algorithm for traveling salesman problem (Colorado State University. Libraries, 2022)
Varadarajan, Swetha, author; Whitley, Darrell, advisor; Böhm, Wim, committee member; Beveridge, Ross, committee member; Chong, Edwin, committee member; Pouchet, Louis-Noël, committee member

The Traveling Salesman Problem (TSP) is one of the most intensively studied NP-hard problems. Existing TSP solvers are well suited to multi-core CPU-based architectures. With the decline of Moore's law, there is an increasing need to port these codes to massively parallel architectures such as the GPU. This thesis focuses on genetic algorithm (GA) based TSP solvers. The major drawbacks in porting the state-of-the-art GA-based TSP solver (the Edge Assembly Crossover, EAX) are that (a) the memory per crossover operation is large, which limits the scalability of the solver, and (b) the communication per crossover operation is random and not favorable for SIMD machines. We designed a new solver, the Mixing Genetic Algorithm (MGA), using the Generalized Partition Crossover (GPX) operator to overcome these limitations. GPX consumes 4x less memory and does not access memory during the crossover operation. The MGA is used in three different modes: (1) as a single solver, MGA converges quickly on problems smaller than 2,000 cities; (2) as a hybrid solver, together with EAX, it speeds up convergence on problems with up to 85,900 cities; (3) in an ensemble setting, together with EAX and an iterated local search (the Lin-Kernighan-Helsgaun (LKH) heuristic), it increases the success rate on some of the hard TSP instances. The MGA is parallelized on shared memory (using OpenMP), distributed memory (using MPI), and the GPU (using CUDA). A combination of OpenMP and MPI parallelization is examined on problems ranging from 5,000 to 85,900 cities, where we show near-linear speedup (proportional to the number of parallel units). Preliminary results on GPU parallelization of the partition phase of the GPX crossover operator show a 48x to 625x speedup over the naive sequential implementation. This is the first step toward fine-grain parallelization of GA operators for the TSP. The results are tested on problems ranging from 10,000 to 2M cities.

Item Open Access: Understanding user interactions in stereoscopic head-mounted displays (Colorado State University. Libraries, 2022)
Williams, Adam S., author; Ortega, Francisco R., advisor; Beveridge, Ross, committee member; Gersch, Joe, committee member; Sharp, Julia, committee member

Interacting in stereoscopic head-mounted displays can be difficult. There are not yet clear standards for how interactions in these environments should be performed. In virtual reality, there are a number of well-designed interaction techniques; however, augmented reality interaction techniques still need improvement before they can be used easily. This dissertation covers work done towards understanding how users navigate and interact with virtual environments that are displayed in stereoscopic head-mounted displays. With this understanding, existing techniques from virtual reality devices can be transferred to augmented reality where appropriate, and where that is not the case, new interaction techniques can be developed. This work begins by observing how participants interact with virtual content using gesture alone, speech alone, and the combination of gesture+speech during a basic object manipulation task in augmented reality. Later, a complex 3-dimensional data-exploration environment is developed and refined. That environment can be used in both augmented reality (AR) and virtual reality (VR), either asynchronously or simultaneously. The process of iteratively designing that system, and the design choices made during its implementation, are provided for future researchers working on complex systems. This dissertation concludes with a comparison of user interactions and navigation in that complex environment when using either an augmented or virtual reality display. That comparison contributes new knowledge on how people perform object manipulations across the two devices. When viewing 3D visualizations, users need to feel able to navigate the environment. Without careful attention to proper interaction technique design, people may struggle to use the developed system. These struggles may range from a system that is uncomfortable and unfit for long-term use to one in which new users cannot interact at all. Getting the interactions right for AR and VR environments is a step towards facilitating their widespread acceptance. This dissertation provides the groundwork needed to start designing interaction techniques around how people utilize their personal space, virtual space, body, tools, and feedback systems.