
Optimizing machine learning models for autonomous vehicles

Abstract

Object detectors (ODs) are a cornerstone of modern computer vision and are increasingly essential in a wide array of consumer applications. Their utility spans enhancing surveillance and security systems, enabling mobile text recognition for digital document accessibility, and facilitating disease diagnosis through advanced imaging techniques such as MRI and CT scans. Among these domains, one of the most critical applications is autonomous driving. Autonomous vehicles (AVs) rely extensively on their ability to perceive and interpret their surroundings, a capability fundamental to safe and reliable driving. Perception systems in these vehicles use state-of-the-art object detection algorithms, both 2D and 3D, to identify and localize objects in the vehicle's vicinity. 2D ODs detect and localize objects in images or video frames, reporting them as bounding boxes on a two-dimensional plane. They are less complex and less computationally demanding than 3D detectors and are commonly used in applications such as image recognition, face detection, and pedestrian detection in surveillance systems. YOLO, SSD, and Faster R-CNN are widely used examples of 2D ODs. Conversely, 3D ODs incorporate depth information to detect and localize objects in three-dimensional space, using data from 3D sensors such as LiDAR, stereo cameras, or depth cameras. These detectors are essential for applications that require a precise understanding of the environment, such as autonomous driving, robotics, and augmented reality. Popular models include PointNet, VoxelNet, and Frustum PointNet. The data provided by these ODs, especially when 2D and 3D capabilities are combined, informs crucial driving decisions and enables the vehicle to navigate complex environments safely and efficiently.
However, these advanced ODs come with high memory and computational overheads, which pose a significant challenge for deployment. To address this challenge, ongoing research and development efforts are dedicated to optimizing these models. The primary goal is to reduce their memory footprint and computational requirements while maintaining or even improving their performance. This allows these algorithms to be deployed efficiently on the resource-constrained embedded platforms often used in AVs without compromising their effectiveness. Such advancements are pivotal in maintaining the efficiency and reliability of AVs. This thesis introduces two novel OD optimization algorithms that reduce model footprint and computation cost while decreasing inference time. The first contribution, R-TOSS, is a novel semi-structured pruning framework for 2D ODs. R-TOSS outperforms various state-of-the-art model optimization techniques while also improving performance on resource-constrained embedded platforms. To accelerate 3D ODs, we propose UPAQ, which combines pruning and quantization to improve model accuracy and reduce model footprint. We also show that UPAQ outperforms other state-of-the-art models.
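
For readers unfamiliar with the two compression techniques named above, the following is a minimal, generic sketch of magnitude pruning followed by symmetric 8-bit quantization on a flat list of weights. It is not the R-TOSS or UPAQ algorithm, whose details are not given in this abstract; the function names, the example weights, and the 50% sparsity level are illustrative assumptions only.

```python
# Generic illustration of pruning + quantization; NOT the R-TOSS or UPAQ method.
# All names and values here are hypothetical and chosen for illustration.

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by ascending absolute value; the first n_prune are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric uniform quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale maps the largest magnitude onto 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [v * scale for v in q]


weights = [0.02, -1.27, 0.40, -0.05, 0.90, 0.01]
pruned = magnitude_prune(weights, sparsity=0.5)  # half the weights become zero
q, scale = quantize_int8(pruned)                 # integer codes plus one float scale
restored = dequantize(q, scale)                  # approximation used at inference
```

In this sketch, pruning reduces the number of nonzero weights and quantization reduces the bits stored per weight; each restored weight differs from its pruned value by at most half a quantization step.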

Subject

machine learning
object detection
quantization
model compression
autonomous vehicles
pruning techniques