Harnessing large language models for permission fidelity analysis from android application descriptions

dc.contributor.author: Tamrakar, Yunik, author
dc.contributor.author: Ray, Indrakshi, advisor
dc.contributor.author: Banerjee, Ritwik, advisor
dc.contributor.author: Ghosh, Sudipto, committee member
dc.contributor.author: Simske, Steve, committee member
dc.date.accessioned: 2025-06-02T15:20:02Z
dc.date.available: 2026-05-28
dc.date.issued: 2025
dc.description.abstract: Android applications are extremely popular; as of mid-2024, the Google Play Store hosts over 2 million applications. With such a large number of applications available for download, the threat of privacy leakage increases considerably, primarily because users have limited ability to judge which app permissions are actually necessary. Accurate and consistent checking of the permissions an application requests is therefore essential to protect user privacy. Studies have indicated that inferring permissions from app descriptions is an effective way to determine whether the requested permissions are necessary. Previous research on permission inference has explored techniques such as keyword-based matching, Natural Language Processing methods (including part-of-speech tagging and named entity recognition), and deep-learning approaches based on Recurrent Neural Networks. However, app descriptions are often vague and may omit details to meet length restrictions, which degrades the performance of these models. This limitation motivated our choice of large language models (LLMs), whose advanced contextual understanding and ability to infer implicit information directly address the weaknesses observed in previous approaches. In this work, we explore various LLM architectures for the permission inference task and provide a detailed comparison across models. We evaluate both zero-shot and fine-tuning based approaches, demonstrating that fine-tuned models can achieve state-of-the-art performance. Additionally, by employing targeted generative-AI-based training data augmentation techniques, we show that these fine-tuned models can significantly outperform baseline methods. Furthermore, we illustrate the potential of leveraging paraphrasing to boost fine-tuned performance by over 50 percent, all while using only a very small number of annotated samples, a rarity for LLMs.
dc.format.medium: born digital
dc.format.medium: masters theses
dc.identifier: Tamrakar_colostate_0053N_18881.pdf
dc.identifier.uri: https://hdl.handle.net/10217/240955
dc.language: English
dc.language.iso: eng
dc.publisher: Colorado State University. Libraries
dc.relation.ispartof: 2020-
dc.rights: Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright.
dc.rights.access: Embargo expires: 05/28/2026.
dc.subject: android permissions
dc.subject: LLM
dc.subject: privacy
dc.subject: compliance
dc.subject: android applications
dc.subject: NLP
dc.title: Harnessing large language models for permission fidelity analysis from android application descriptions
dc.type: Text
dcterms.embargo.expires: 2026-05-28
dcterms.embargo.terms: 2026-05-28
dcterms.rights.dpla: This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Colorado State University
thesis.degree.level: Masters
thesis.degree.name: Master of Science (M.S.)

Files

Original bundle

Name: Tamrakar_colostate_0053N_18881.pdf
Size: 4.83 MB
Format: Adobe Portable Document Format