
Towards interactive analytics over voluminous spatiotemporal data using a distributed, in-memory framework

dc.contributor.author: Mitra, Saptashwa, author
dc.contributor.author: Pallickara, Sangmi Lee, advisor
dc.contributor.author: Pallickara, Shrideep, committee member
dc.contributor.author: Ortega, Francisco, committee member
dc.contributor.author: Li, Kaigang, committee member
dc.date.accessioned: 2024-01-15T11:00:29Z
dc.date.available: 2024-01-15T11:00:29Z
dc.date.issued: 2023
dc.description.abstract: The proliferation of heterogeneous data sources, driven by advancements in sensor networks, simulations, and observational devices, has reached unprecedented levels. This surge in data generation and the demand for proper storage has been met with extensive research and development in distributed storage systems, facilitating the scalable housing of these voluminous datasets while enabling analytical processes. Nonetheless, the extraction of meaningful insights from these datasets, especially in the context of low-latency/interactive analytics, poses a formidable challenge. This arises from the persistent gap between the processing capacity of distributed systems and their ever-expanding storage capabilities. Moreover, the interactive querying of these datasets is hindered by disk I/O, redundant network communications, recurrent hotspots, and transient surges of user interest over limited geospatial regions, particularly in systems that concurrently serve multiple users. In environments where interactive querying is paramount, such as visualization systems, addressing these challenges becomes imperative. This dissertation delves into the intricacies of enabling interactive analytics over large-scale spatiotemporal datasets. My research efforts are centered around the conceptualization and implementation of a scalable storage, indexing, and caching framework tailored specifically for spatiotemporal data access. The research aims to create frameworks that facilitate fast query analytics over diverse data types, including point, vector, and raster datasets. The implemented frameworks are lightweight, reside primarily in memory, and support model-driven extraction of insights from raw data or dynamic reconstruction of compressed/partial in-memory data fragments with an acceptable level of accuracy. This approach helps reduce the memory footprint of cached data objects and also mitigates the need for frequent client-server communications. Furthermore, we investigate the potential of leveraging various transfer learning techniques to improve the turn-around times of our memory-resident deep learning models, given the voluminous nature of our datasets, while maintaining good overall accuracy over the entire spatiotemporal domain. Additionally, our research explores the extraction of insights from high-dimensional datasets, such as satellite imagery, within this framework. The dissertation also presents empirical evaluations of our frameworks and outlines future directions and anticipated contributions in the domain of interactive analytics over large-scale spatiotemporal datasets, acknowledging the evolving landscape of data analytics, where analytics frameworks increasingly rely on compute-intensive machine learning models.
dc.format.medium: born digital
dc.format.medium: doctoral dissertations
dc.identifier: Mitra_colostate_0053A_18041.pdf
dc.identifier.uri: https://hdl.handle.net/10217/237507
dc.language: English
dc.language.iso: eng
dc.publisher: Colorado State University. Libraries
dc.relation.ispartof: 2020-
dc.rights: Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright.
dc.subject: distributed caching
dc.subject: science-guided machine learning
dc.subject: data cubes
dc.subject: visual analytics
dc.subject: in-memory storage
dc.title: Towards interactive analytics over voluminous spatiotemporal data using a distributed, in-memory framework
dc.type: Text
dcterms.rights.dpla: This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Colorado State University
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy (Ph.D.)

Files

Original bundle
Name: Mitra_colostate_0053A_18041.pdf
Size: 7.25 MB
Format: Adobe Portable Document Format