A Reinforcement Learning Agent for UAV Control: Mathematical Foundations, Implementation, and Human-vs-AI Benchmarking

Neelesh Mungoli



Neelesh Mungoli "A Reinforcement Learning Agent for UAV Control: Mathematical Foundations, Implementation, and Human-vs-AI Benchmarking" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-9 | Issue-2, April 2025, pp.242-254, URL: https://www.ijtsrd.com/papers/ijtsrd76308.pdf

In this work, we propose a novel deep reinforcement learning (DRL) agent architecture for fully autonomous UAV control that combines real-time sensor fusion with multi-objective reward shaping to achieve robust flight under varied environmental conditions. We begin by formulating the system’s decision-making process as a partially observable Markov decision process (POMDP), in which the UAV’s state space comprises high-dimensional sensor inputs, including LIDAR point clouds, inertial measurement unit (IMU) data, and geospatial telemetry, while the agent’s action space consists of continuous motor velocity commands. Our learning algorithm employs a hierarchical policy gradient method with parallelizable sub-policies dedicated to obstacle avoidance, trajectory planning, and energy conservation. Each sub-policy is trained using a variant of proximal policy optimization (PPO) that is adapted to dynamic flight constraints through Lagrangian relaxation and enforced via real-time on-policy updates.
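The abstract does not spell out the constrained objective, so the following is only a minimal sketch of how a PPO clipped surrogate is commonly combined with a Lagrangian penalty; it should not be read as the paper's implementation. The function names, the single scalar cost signal, the cost limit, and the learning rate are all assumptions introduced for illustration.

```python
from typing import Sequence


def clipped_surrogate(ratios: Sequence[float],
                      advantages: Sequence[float],
                      clip_eps: float = 0.2) -> float:
    """Mean PPO clipped surrogate: E[min(r * A, clip(r, 1 - eps, 1 + eps) * A)]."""
    total = 0.0
    for r, a in zip(ratios, advantages):
        r_clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, r))
        total += min(r * a, r_clipped * a)
    return total / len(ratios)


def lagrangian_objective(ratios: Sequence[float],
                         reward_adv: Sequence[float],
                         cost_adv: Sequence[float],
                         lam: float,
                         clip_eps: float = 0.2) -> float:
    """Objective to maximize: clipped reward surrogate minus lam times a cost surrogate.

    cost_adv is a hypothetical advantage estimate for a flight-constraint cost
    (for example, proximity to obstacles); the paper's actual costs are not given here.
    """
    reward_term = clipped_surrogate(ratios, reward_adv, clip_eps)
    cost_term = sum(r * c for r, c in zip(ratios, cost_adv)) / len(ratios)
    return reward_term - lam * cost_term


def update_multiplier(lam: float,
                      mean_cost: float,
                      cost_limit: float,
                      lr: float = 0.05) -> float:
    """Projected dual ascent: raise lam when the constraint is violated, keep lam >= 0."""
    return max(0.0, lam + lr * (mean_cost - cost_limit))
```

In this sketch the policy parameters would be updated to increase `lagrangian_objective`, while `update_multiplier` is called once per batch so that the penalty weight grows only while the measured cost exceeds its limit.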

Paper ID: IJTSRD76308
Volume-9 | Issue-2, April 2025
Pages: 242-254
IJTSRD | www.ijtsrd.com | E-ISSN 2456-6470
Copyright © 2019 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)
