Objective: The primary objective of this study is to develop a QoS-aware task scheduling algorithm for LoRaWAN IoT applications using a Reinforcement Learning (RL) approach.
Introduction: LoRaWAN is a widely adopted Low Power Wide Area Network (LPWAN) protocol designed for Internet of Things (IoT) applications due to its long-range communication and low power consumption. However, ensuring QoS in LoRaWAN networks remains challenging due to limited bandwidth, high device density, and dynamic traffic patterns. Existing scheduling algorithms often fail to balance competing QoS requirements effectively. Reinforcement Learning (RL) offers a promising solution by enabling intelligent decision-making through interaction with the network environment.
Case representation: The proposed model employs a Deep Q-Network (DQN) to optimize task scheduling in LoRaWAN networks. The RL agent interacts with a simulated LoRaWAN environment built using NS-3, where it learns to make scheduling decisions based on real-time network states. Key parameters, such as delay, PDR, PER, and throughput, are used as inputs to the reward function to guide the learning process. Performance is evaluated against existing models such as RT-LoRa and LoRa+ under varying node densities and traffic scenarios.
Result: The simulation results demonstrate that the proposed RL-based task scheduling algorithm outperforms existing models across multiple Quality of Service (QoS) metrics. It achieves the lowest delay at approximately 40 ms, significantly outperforming RT-LoRa, which has a delay of around 120 ms, and LoRa+, which experiences a delay of about 80 ms. In terms of Packet Delivery Ratio (PDR), the model maintains a competitive value of approximately 85%, comparable to LoRa+ at 87%. Additionally, it records the lowest Packet Error Rate (PER) at around 5%, outperforming RT-LoRa and LoRa+, which exhibit PER values of approximately 15% and 10%, respectively. Furthermore, the model achieves the highest throughput of approximately 250 kbps, surpassing RT-LoRa at 150 kbps and LoRa+ at 200 kbps, demonstrating its superior performance in optimizing network efficiency.
Discussion: The proposed model demonstrates significant strengths in reducing delay and PER while maximizing throughput, making it suitable for time-sensitive IoT applications. However, its marginal improvement in PDR compared to existing models highlights an area for further optimization. Additionally, energy efficiency was not explicitly addressed in this study, which is critical for LPWAN applications like LoRaWAN. These limitations suggest potential directions for future research.
Conclusion: This research successfully develops a QoS-aware task scheduling algorithm using reinforcement learning for LoRaWAN IoT applications. By dynamically adapting to network conditions, the proposed model achieves superior performance across multiple QoS metrics compared to state-of-the-art algorithms. Future work will focus on incorporating energy efficiency into the model and extending its applicability to multi-gateway scenarios.
AI: Artificial Intelligence; CF: Carrier Frequency; DER: Data Extraction Rate; DNN: Deep Neural Network; DQN: Deep Q-Network; DRL: Deep Reinforcement Learning; DSR: Design Science Research; ILP: Integer Linear Programming; IoT: Internet of Things; ISM: Industrial, Scientific and Medical; LoRa: Long Range (a physical layer technology); LoRaSim: LoRa Network Simulator; LoRaWAN: Long Range Wide Area Network; LPWAN: Low Power Wide Area Network; MAC: Medium Access Control; MCUs: Microcontrollers; MILP: Mixed Integer Linear Programming; NS-3: Network Simulator-3; PDR: Packet Delivery Ratio; PER: Packet Error Rate; PST: Priority Scheduling Technique; QoS: Quality of Service; ReLU: Rectified Linear Unit; RL: Reinforcement Learning; RT-LoRa: Real-Time LoRa; SF: Spreading Factor; SINR: Signal-to-Interference-plus-Noise Ratio; TCP/IP: Transmission Control Protocol/Internet Protocol.
The Internet of Things (IoT) encompasses a vast network of interconnected devices that communicate and exchange data over the Internet, impacting various sectors such as smart cities, healthcare, agriculture, and industry. The rapid expansion of IoT applications has created a pressing need for efficient resource allocation and task scheduling mechanisms to optimize resource utilization while meeting Quality of Service (QoS) requirements [1].
LoRaWAN (Long Range Wide Area Network) is highlighted as a significant enabler for IoT, designed to provide long-range communication with low power consumption. This wireless communication protocol is particularly optimized for IoT devices, allowing them to transmit small amounts of data over considerable distances. LoRaWAN's capabilities make it suitable for applications requiring remote monitoring and data acquisition, thus facilitating the expansion of IoT solutions [1,2]. For example, as an LPWAN technology, LoRaWAN can connect battery-powered devices over very long distances while consuming minimal power, making it an affordable option [3].
LoRaWAN operates in the unlicensed ISM bands, which vary according to region [4]. It employs chirp spread spectrum modulation techniques to attain long-distance communication with low power [5]. One of the main advantages of LoRaWAN is its remarkable coverage: it can transmit data over several kilometers in open settings such as rural areas or large industrial facilities without the need for cellular towers or other infrastructure. Consequently, LoRaWAN is well suited for applications that need a wide coverage area, such as smart agriculture, asset tracking, environmental monitoring, and smart city deployments [1]. Hence, LoRaWAN has become an attractive technology for IoT applications due to its unique combination of long-range capability, low power consumption, and cost-effective deployment [6]. LoRaWAN relies on four key components [7].
Figure 1 depicts the overall architecture of a LoRaWAN network, highlighting its key components and their interactions. The architecture consists of end devices (sensors), gateways, a network server, and application servers, illustrating how data flows from the end devices to the application layer. End devices communicate wirelessly with gateways using LoRa technology; the gateways forward the data to the network server, where it is processed and routed to the appropriate application server for further analysis or action. This showcases the hierarchical structure and functionality of the LoRaWAN ecosystem.
In LoRaWAN IoT applications, maintaining Quality of Service (QoS) is crucial due to challenges like limited resources, channel congestion, and varying QoS requirements, which can lead to issues such as high latency and packet loss. Reinforcement Learning (RL) is identified as the most suitable machine learning approach for dynamic task scheduling in LoRaWAN networks, as it can adapt to changing conditions and optimize multiple QoS metrics simultaneously. By leveraging RL, nodes can self-optimize scheduling performance, enhancing reliability and efficiency in diverse applications such as smart agriculture, industrial IoT, and smart city management [8,9,6].
The study proposes the use of Reinforcement learning (RL) techniques to develop a scheduling algorithm that can adapt to dynamic network conditions, optimize energy consumption, and enhance overall system performance. By leveraging RL, the proposed solution aims to improve latency, reliability, and efficiency in LoRaWAN networks, ultimately contributing to the sustainability and scalability of IoT deployments [9,10].
Reinforcement Learning (RL) is uniquely suited for dynamic task scheduling in LoRaWAN due to its ability to learn optimal policies through interaction with the environment without requiring labeled data. Unlike supervised learning, which relies on historical datasets and struggles with real-time adaptability, RL agents make decisions based on feedback (rewards) from the current network state, allowing them to respond to changes in node density, channel interference, and traffic loads dynamically. Moreover, traditional optimization or classification-based ML approaches often lack the capacity to handle sequential decision-making over time, which is essential in scheduling tasks where present actions impact future network states. RL's strength lies in continuously adapting its policy to maximize long-term performance across multiple QoS metrics, making it a natural fit for the resource-constrained, time-sensitive, and stochastic nature of LoRaWAN-based IoT systems.
Unlike prior scheduling approaches that rely on static heuristics, clustering, or centralized optimization models, the novelty of the proposed method lies in its integration of a Deep Q-Network (DQN) within a dynamic and adaptive task scheduling framework for LoRaWAN. Specifically, the model introduces a context-aware reinforcement learning agent that continuously observes real-time network states, such as channel conditions, node congestion, task urgency, and signal quality, and learns an optimal scheduling policy through interaction with the environment. This real-time adaptability enables the system to prioritize tasks and allocate channels or gateways effectively, even under fluctuating traffic loads and node densities. Furthermore, the design of a multi-metric reward function that simultaneously optimizes delay, PDR, PER, and throughput distinguishes this work from most existing solutions that optimize only one or two QoS metrics. To the best of our knowledge, this is among the first studies to deploy DQN-based scheduling in a simulated LoRaWAN environment using NS-3, validated across multiple network configurations and performance benchmarks.
The remainder of this paper is organized as follows: Section 2 presents a comprehensive review of the existing literature on LoRaWAN technology, scheduling algorithms, and reinforcement learning in IoT; it provides the theoretical foundation for the research and identifies the gaps that this study aims to address. Section 3 covers the details of the research methodology employed in this study, including the research design, development tools, algorithm design, implementation, complexity analysis, and reward function design. Section 4 presents the analysis of the simulation results; it describes the simulation setup and scenarios and discusses the performance of the proposed algorithm in terms of key QoS metrics. Lastly, Section 5 concludes the paper.
LPWANs like LoRaWAN have revolutionized the IoT by enabling long-range communication with battery-powered devices. However, many applications within the IoT domain demand reliable and expedited data delivery, posing challenges for LoRaWAN due to inherent limitations in range, latency, and energy constraints [11]. This review explores existing research related to task scheduling in LoRaWAN.
The paper [12] proposes a dynamic transmission Priority Scheduling Technique (PST) based on an unsupervised learning clustering algorithm for dense LoRaWAN networks. The LoRa gateway classifies nodes into different priority clusters, and the dynamic PST allows the gateway to configure transmission intervals based on cluster priorities. This approach aims to improve transmission delay and decrease energy consumption. Simulation results suggest that the proposed work outperforms conventional LoRaWAN and recent clustering and scheduling schemes, making it potentially well-suited for dense LoRaWAN deployments.
In [13], a Real-Time LoRa (RT-LoRa) communication protocol for industrial Internet of Things applications is introduced. RT-LoRa handles real-time flows through a dedicated medium access strategy. The network is built from static and mobile nodes; the QoS level is regarded as the same for every static node, while flows generated by mobile nodes are categorized into three classes (normal, dependable, and most reliable). The technique distributes SF and CF based on the QoS level, and the mobile and static nodes are connected to the gateway in a star topology. The important points raised in this paper are as follows: the overall procedure is described for a single-gateway network using single-hop communication, which results in a significant transmission delay of up to 28 seconds for the majority of dependable flows, even within 180 meters. The study does not address the need for greater coverage and reduced delay for real-time industrial data. QoS provisioning is also limited because a QoS level is assigned only to mobile nodes, while all static node flows receive the same priority level. Finally, all nodes require considerable energy to communicate with the central gateway, and nodes farther from the gateway consume even more.
The paper [14] proposes a method to optimize the performance of LoRaWAN networks by dynamically assigning values for the Spreading Factor (SF) and Carrier Frequency (CF) radio parameters. This assignment is formulated as a Mixed Integer Linear Programming problem to maximize network metrics such as the Data Extraction Rate (DER) and minimize packet collisions. An approximation algorithm is also developed to solve the problem more efficiently at scale. The results show improved performance for metrics like DER and, on average, 6-13% fewer packet collisions compared to baseline policies. The performance of the proposed optimization algorithms is evaluated through simulation using the LoRaSim simulator. However, the optimization considers only the SF and CF parameters of the LoRa radio configuration; considering additional parameters could yield even better performance. The simulations also assume a single-gateway setup. In summary, the key limitations are the limited set of configuration parameters, static network assumptions, and evaluation based on few metrics.
In [15], the authors explore the viability of real-time communication within LoRaWAN-based IoT systems. Leveraging an Integer Linear Programming (ILP) model, they assess the feasibility of real-time communication during the network design stage. This model not only determines feasibility but also optimizes the number and placement of gateways necessary to achieve real-time requirements. The paper further validates the model's performance through various scenarios, offering valuable insights into LoRaWAN's scalability and real-time support limitations. However, it is important to note that the model primarily focuses on static network design at deployment. This may not fully capture the dynamic nature of real-world networks, where factors like interference, congestion, and gateway availability can significantly affect real-time QoS performance.
In [16], the authors present a low-overhead synchronization and scheduling concept implemented on top of LoRaWAN Class A. They design and deploy an end-to-end architecture on STM32L0 Microcontrollers (MCUs), where a central entity provides synchronization metrics and allocates transmission slots. By measuring clock drift in devices, the system defines slot lengths within the network. This approach achieves 10-millisecond accuracy and demonstrates significant improvements in packet delivery ratios compared to Aloha-based setups, especially under high network loads. Notably, the paper addresses the gap in the literature regarding experimental approaches to LoRaWAN scheduling and demonstrates the feasibility of the proposed concept. However, the paper does not delve into the energy consumption impact of the implemented scheduling algorithms.
Despite various efforts to improve task scheduling in LoRaWAN, existing methods exhibit notable limitations. Many approaches, such as clustering-based scheduling and MILP-based resource optimization, operate under static assumptions or rely on centralized architectures that limit scalability and adaptability in dynamic environments. Others fail to prioritize tasks effectively or consider a limited subset of QoS parameters. Additionally, most traditional algorithms lack the ability to learn and adapt in real time to fluctuating network conditions, leading to suboptimal performance under high traffic or dense node scenarios. The proposed RL-based scheduling algorithm addresses these gaps by employing a Deep Q-Network (DQN) that continuously learns optimal scheduling policies through interaction with the network environment. Unlike static or rule-based strategies, the RL agent dynamically adapts to changes in node density, traffic load, and channel conditions, optimizing multiple QoS metrics—including delay, Packet Delivery Ratio (PDR), Packet Error Rate (PER), and throughput—simultaneously. This adaptive capability enables the model to outperform existing approaches in both efficiency and scalability, making it more suitable for real-time, large-scale IoT applications (Table 1).
| Table 1: Summary of related papers. | |||
| References | Key Features | Result | Critique |
| [12] | Improves efficiency of LoRaWAN network | Reduces packet collision rate, enhances transmission delay, and improves energy consumption in dense LoRaWAN networks. | It does not address task prioritization; moreover, applying reinforcement learning could further improve the efficiency of dynamic transmission priority scheduling, leading to better decision-making, task scheduling, and energy utilization in LoRaWAN IoT applications. |
| [13] | Modifies the LoRaWAN MAC protocol to reduce rejected packet rate and packet error rate. | Improves Quality of Service in terms of rejected packet rate and packet error rate. | It follows the same SF and CF in case of high RSSI of received data. This leads to higher energy consumption and data loss. |
| [14] | Designed to support real-time flows for industrial IoT applications. Centralized approach where a central node manages medium access according to a predefined order. | Enables bounded latency for real-time flows. More suitable for industrial IoT applications with predictable traffic patterns. | The overall procedure is presented for a single-gateway network using single-hop communication, which incurs a large transmission delay of up to 28 s for the most reliable flows, even within 180 meters. The QoS level is assigned to mobile nodes only, and all static node flows receive the same priority level, which restricts QoS provisioning. |
| [15] | Optimizes radio resource allocation using Mixed Integer Linear Programming (MILP) to improve data extraction rate, reduce packet collision rate, and minimize energy consumption. | Achieves significant improvement in data extraction rate and reduction in collisions compared to traditional allocation policies. | Relies on a centralized approach for radio resource management, which might not be scalable for very large networks. Might incur higher computational complexity compared to simpler scheduling algorithms. |
The research methodology focuses on designing and implementing a Reinforcement Learning (RL)-based scheduling algorithm for reliable data delivery in LoRaWAN networks. It adopts a Design Science Research (DSR) approach, which emphasizes systematic development and evaluation of practical solutions to address inefficiencies in existing task scheduling mechanisms. The methodology begins with a detailed description of the research design, which emphasizes the need for a task-scheduling algorithm that can effectively manage resources in dynamic environments. The study identifies the limitations of existing scheduling methods in LoRaWAN networks, particularly their inability to meet the QoS demands of modern IoT applications. To address these challenges, the research proposes a Reinforcement Learning (RL) based algorithm that can adapt to varying network conditions and optimize resource allocation.
The research employs a mixed-methods approach, combining quantitative research with design science to systematically design, develop, and assess a QoS-aware task-scheduling algorithm. This approach allows for addressing questions related to the effectiveness of the proposed algorithm in improving QoS in dynamic IoT environments.
Algorithm design: The design of the RL-based scheduling algorithm focuses on creating an intelligent agent that optimizes task scheduling in a LoRaWAN environment. Key components include defining the state space, action space, and reward function, which guide the agent's learning process to make optimal scheduling decisions based on network conditions.
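To make these components concrete, the following minimal Python sketch outlines one possible environment interface for the scheduling problem. The class name, state dimensionality, and the joint (channel, spreading factor, gateway) action space are illustrative assumptions rather than the authors' implementation; the channel, SF, and gateway counts mirror the simulation parameters listed later in Table 2.

```python
import numpy as np

# Minimal, assumed environment interface for the scheduling problem.
N_CHANNELS, N_SF, N_GATEWAYS = 5, 6, 3          # values taken from Table 2

class LoRaWANSchedulingEnv:
    def __init__(self, n_nodes=100):
        self.n_nodes = n_nodes
        self.state_dim = 6                       # compact summary of network conditions
        self.n_actions = N_CHANNELS * N_SF * N_GATEWAYS   # joint scheduling decision

    def reset(self):
        # Return the initial observation of the network state.
        return np.zeros(self.state_dim, dtype=np.float32)

    def step(self, action):
        # A full implementation would apply the action in the NS-3 model and
        # read back delay, PDR, PER, and throughput to form the reward.
        next_state = np.zeros(self.state_dim, dtype=np.float32)
        reward, done = 0.0, False
        return next_state, reward, done
```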
Figure 2 illustrates the architecture of a Deep Q-Network (DQN), which combines Q-learning with deep neural networks to enable reinforcement learning in complex environments. The architecture typically consists of the following key components:
The diagram shows the agent taking action in the environment, receiving a new state and reward, and updating its policy based on the experience. This iterative process allows the agent to learn an optimal policy for maximizing rewards in the environment.
The diagram's elements map to the LoRaWAN scheduling problem as follows:
In the context of LoRaWAN, the input layer of the DQN receives a state vector that encapsulates key network parameters representing the current environment status. These include normalized values of channel conditions (e.g., signal-to-interference-plus-noise ratio), gateway congestion levels, task queue lengths, packet retransmission counts, and node-specific information such as remaining transmission energy and task deadlines. By encoding these factors into the input layer, the DQN can interpret the real-time status of the LoRaWAN network and make informed scheduling decisions that adapt to dynamic conditions. This mapping ensures that the agent's learning process is grounded in practical, context-specific observations relevant to QoS optimization in LoRaWAN-based IoT systems.
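As an illustration of this mapping, the snippet below shows one possible way to normalize such parameters into a fixed-length state vector. The value ranges and scaling constants (for example, clipping SINR to the interval from -20 dB to 30 dB) are assumptions made for exposition only, not values reported in the study.

```python
import numpy as np

# Hypothetical encoding of the DQN input state described above.
def encode_state(sinr_db, gw_congestion, queue_len, retx_count,
                 energy_left, time_to_deadline,
                 max_queue=50, max_retx=5, max_deadline_s=60.0):
    clip01 = lambda x: float(np.clip(x, 0.0, 1.0))
    return np.array([
        clip01((sinr_db + 20.0) / 50.0),           # channel quality (SINR)
        clip01(gw_congestion),                      # gateway load, already in 0..1
        clip01(queue_len / max_queue),              # task queue length
        clip01(retx_count / max_retx),              # retransmissions so far
        clip01(energy_left),                        # remaining energy fraction
        clip01(time_to_deadline / max_deadline_s),  # urgency of the task
    ], dtype=np.float32)
```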
Figure 3 outlines the training phase of the proposed scheduling algorithm, which utilizes a Deep Q-Network (DQN) approach to optimize task scheduling in a LoRaWAN environment. The training phase consists of several key steps:
Figure 4 presents a diagram of the trained scheduling algorithm, illustrating the workflow and key components involved in the task scheduling process within a LoRaWAN environment. The diagram outlines the following steps:
Overall, Figure 4 effectively illustrates the structured workflow of the trained scheduling algorithm, highlighting the interaction between task requests, network state retrieval, schedule generation, feasibility evaluation, and adjustments made by the RL agent to optimize task scheduling in a LoRaWAN network.
The implementation of the RL-based scheduling algorithm involves translating the designed components into a functional system that operates within the simulated LoRaWAN environment. This process includes several key steps:
Overall, the implementation phase focuses on creating a working model of the algorithm that can learn and adapt to improve network performance in real-time scenarios.
The improved task scheduling algorithm for the LoRaWAN network focuses on channel selection, task priority, and adaptive gateway placement to achieve better QoS. The RL agent interacts with the LoRaWAN environment, observes network states, selects actions based on its policy, receives rewards, and updates its knowledge to optimize QoS metrics such as delay, reliability, throughput, and energy efficiency.
Pseudocode structure
Algorithm 1 Initialization
Algorithm 2 State Observation
Algorithm 3 Action Selection using ϵ-Greedy Policy
Algorithm 4 Environment Interaction
Algorithm 5 Reward Calculation
Algorithm 6 Q-Value Update (Learning)
Algorithm 7 Training Loop
Algorithm 8 Policy Improvement and Execution
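The compact Python sketch below ties Algorithms 1-8 together in a single loop. For brevity it substitutes a linear Q-value approximator and a stub environment for the full DQN and the NS-3 simulation, so every name and numerical setting here is an illustrative assumption rather than the authors' code; the hyperparameters simply echo Table 3.

```python
import random
from collections import deque

import numpy as np

# Stand-in for Algorithms 1-8 with a linear Q-approximator and a stub environment.
STATE_DIM, N_ACTIONS = 6, 90
GAMMA, LR = 0.95, 0.001
EPS, EPS_MIN, EPS_DECAY = 1.0, 0.01, 0.995
BUFFER, BATCH = deque(maxlen=30_000), 64

W = np.zeros((N_ACTIONS, STATE_DIM))             # Algorithm 1: initialization
W_target = W.copy()

def q_values(w, s):
    return w @ s                                  # Q(s, .) under weights w

def select_action(s, eps):                        # Algorithm 3: epsilon-greedy policy
    if random.random() < eps:
        return random.randrange(N_ACTIONS)
    return int(np.argmax(q_values(W, s)))

def env_step(s, a):                               # Algorithm 4: stub environment interaction
    next_s = np.random.rand(STATE_DIM)            # would come from the NS-3 model
    reward = float(np.random.rand())              # Algorithm 5: QoS-based reward
    return next_s, reward, False

for episode in range(10):                         # Algorithm 7: training loop
    state = np.random.rand(STATE_DIM)             # Algorithm 2: observe network state
    for t in range(100):
        action = select_action(state, EPS)
        next_state, reward, done = env_step(state, action)
        BUFFER.append((state, action, reward, next_state, done))
        if len(BUFFER) >= BATCH:                  # Algorithm 6: Q-value update (learning)
            for (s, a, r, ns, d) in random.sample(list(BUFFER), BATCH):
                target = r + (0.0 if d else GAMMA * float(np.max(q_values(W_target, ns))))
                td_error = target - q_values(W, s)[a]
                W[a] += LR * td_error * s         # gradient step on squared TD error
        if t % 50 == 0:
            W_target = W.copy()                   # periodic target sync
        state = next_state
        if done:
            break
    EPS = max(EPS_MIN, EPS * EPS_DECAY)           # decay exploration after each episode

# Algorithm 8: policy improvement and execution (act greedily with learned weights)
best_action = int(np.argmax(q_values(W, np.random.rand(STATE_DIM))))
```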
The algorithm complexity analysis encompasses three main aspects: time complexity, space complexity, and scalability and feasibility.
Overall, the analysis highlights the algorithm's computational demands and its potential for effective deployment in resource-constrained environments.
The reward function design is a critical component of the proposed scheduling algorithm, as it directly influences the learning process of the Reinforcement Learning (RL) agent and the quality of scheduling decisions. The reward function is structured as a weighted sum of various Quality of Service (QoS) metrics, including delay minimization, reliability maximization, and throughput optimization.
Overall, the reward function design is pivotal in shaping the agent's behavior, promoting effective scheduling strategies that meet the dynamic demands of LoRaWAN networks.
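As an illustration of such a weighted-sum design, the sketch below combines the QoS terms named above into a single scalar reward. The specific weights and normalization constants are assumptions chosen for exposition and are not values reported in this study.

```python
# Illustrative multi-metric reward (assumed weights and normalization constants).
def compute_reward(delay_ms, pdr, per, throughput_kbps,
                   w_delay=0.3, w_pdr=0.3, w_per=0.2, w_tput=0.2,
                   max_delay_ms=200.0, max_tput_kbps=250.0):
    delay_term = 1.0 - min(delay_ms / max_delay_ms, 1.0)   # lower delay -> higher reward
    per_term = 1.0 - min(per, 1.0)                          # lower PER -> higher reward
    tput_term = min(throughput_kbps / max_tput_kbps, 1.0)   # higher throughput -> higher reward
    return (w_delay * delay_term + w_pdr * pdr
            + w_per * per_term + w_tput * tput_term)
```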
The simulation setup and scenarios section outlines the environment and parameters used to evaluate the proposed scheduling algorithm in a LoRaWAN context.
Overall, this section emphasizes the careful design of the simulation environment and parameters to facilitate a comprehensive analysis of the proposed scheduling algorithm's performance in realistic scenarios.
Table 2 outlines the key parameters used in the simulation of the LoRaWAN network to evaluate the proposed scheduling algorithm. The parameters include:
| Table 2: Simulation parameters. | |
| Parameter | Value |
| Number of gateways | 3 |
| Number of IoT devices | 100 |
| Network server | 1 |
| Environment size | 200 m x 200 m |
| Maximum distance to gateway | 200 m |
| Propagation model | LoRa log-normal shadowing model |
| Number of retransmissions | 5 (Max) |
| Frequency band | 868 MHz |
| Spreading factor | SF7, SF8, SF9, SF10, SF11, SF12 |
| Number of rounds | 1000 |
| Voltage | 3.3 V |
| Bandwidth | 125 kHz |
| Payload length | 10 bytes |
| Timeslot technique | CSMA10 |
| Data rate (max) | 250 kbps |
| Number of channels | 5 |
| Simulation time | 600 seconds |
These parameters were chosen to create a realistic medium-scale LoRaWAN environment for investigating QoS metrics in IoT applications. They offer a good balance between network complexity, communication reliability, and computational efficiency for reinforcement learning, and they reflect widely adopted real-world LoRaWAN configurations while providing the flexibility needed to test a range of QoS and scheduling algorithms effectively.
The Spreading Factors (SF7 to SF12) listed in Table 2 represent the range of modulation configurations available in LoRaWAN, each offering a trade-off between data rate and transmission range. In the simulation, these SFs are dynamically assigned by the reinforcement learning agent based on the current network state. Nodes farther from the gateway or in poor signal conditions are assigned higher SFs (e.g., SF11 or SF12) to ensure reliable communication, albeit with lower data rates. Conversely, nodes with stronger signal quality or closer proximity to gateways use lower SFs (e.g., SF7 or SF8) to achieve faster data transmission and reduce channel occupancy. This adaptive SF allocation is a key aspect of the scheduling strategy, contributing to optimized QoS across diverse network scenarios.
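For illustration only, a fixed rule-based version of this mapping might look like the sketch below. The SINR thresholds are assumptions; in the proposed model the RL agent learns the SF assignment from experience rather than applying static thresholds.

```python
# Hypothetical threshold-based SF selection: weaker links get higher SFs.
def assign_spreading_factor(sinr_db):
    if sinr_db >= 0:
        return 7       # strong link: fastest data rate, shortest airtime
    elif sinr_db >= -5:
        return 8
    elif sinr_db >= -10:
        return 9
    elif sinr_db >= -15:
        return 10
    elif sinr_db >= -20:
        return 11
    else:
        return 12      # weak link: most robust modulation, lowest data rate
```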
The choice of algorithm parameters strongly influences the outcomes of the RL-based scheduling method. The following subsections outline recommended practices for tuning the primary parameters and describe how such changes affect the algorithm's behavior.
The parameters and tuning strategies for the algorithm are crucial for optimizing performance:
Table 3 presents the key parameters utilized in the reinforcement learning-based scheduling algorithm, which are crucial to its performance and effectiveness. The parameters include:
| Table 3: Algorithm parameters. | |
| Parameter | Value |
| Number of hidden layers | 2 |
| Number of neurons per layer | 128 |
| Learning rate | 0.001 |
| Discount factor (gamma) | 0.95 |
| Exploration rate (epsilon) | 1.0 |
| Exploration decay rate | 0.995 |
| Minimum exploration rate | 0.01 |
| Replay buffer size | 30,000 |
| Batch size | 64 |
| Target network update frequency | Every 500 steps |
| Activation function | ReLU |
| Optimizer | Adam |
| Loss function | Mean squared error |
These algorithm parameters are essential for tuning the performance of the scheduling algorithm, ensuring effective learning and adaptation to the dynamic conditions of the LoRaWAN network.
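As a sketch of how the Table 3 settings translate into a concrete network, the following PyTorch snippet builds a Q-network with two hidden layers of 128 ReLU units, the Adam optimizer with a learning rate of 0.001, an MSE loss, and a periodically synchronized target network. The input and output dimensions are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS = 6, 90   # assumed dimensions for illustration

def build_dqn():
    return nn.Sequential(
        nn.Linear(STATE_DIM, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, N_ACTIONS),          # one Q-value per scheduling action
    )

policy_net = build_dqn()
target_net = build_dqn()
target_net.load_state_dict(policy_net.state_dict())  # synced every 500 steps

optimizer = torch.optim.Adam(policy_net.parameters(), lr=0.001)
loss_fn = nn.MSELoss()
```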
The Performance Metrics Analysis evaluates the effectiveness of the proposed algorithm using key indicators such as delay, reliability, and throughput. The analysis demonstrates significant improvements in these metrics compared to baseline policies, highlighting the algorithm's ability to optimize QoS in LoRaWAN networks. Overall, the results indicate that the RL-based scheduling approach enhances network performance, particularly in managing overlapping QoS requirements.
Network delay: Figure 5 illustrates the relationship between network delay and the number of nodes in a LoRaWAN environment, comparing delay performance across different node configurations. The results show that the delay of the RL-based algorithm (DQN) is considerably lower than that of LoRa+ and RT-LoRa, because the RL-based algorithm makes adaptive decisions, optimally scheduling tasks and allocating resources.
Overall, Figure 5 indicates that the proposed RL-based model achieves the lowest delay, approximately 40 ms, significantly outperforming all other models. In comparison, RT-LoRa exhibits the highest delay at around 120 ms, while LoRa+ performs better than RT-LoRa but still experiences delays exceeding 80 ms. Adaptive Spatial Scheduling demonstrates moderate performance with a delay of approximately 90 ms. The strength of the RL-based model lies in its ability to optimize task scheduling effectively, minimizing delays and making it highly suitable for time-sensitive IoT applications.
Overall, Figure 6 indicates that the proposed RL-based model achieves a Packet Delivery Ratio (PDR) of approximately 85%, which is comparable to the other models; RT-LoRa and LoRa+ also attain PDRs in the range of 80% to 85%. The strength of the RL-based model lies in its ability to maintain high reliability in data delivery while optimizing for other performance metrics. However, its weakness is the marginal improvement over existing models, indicating limited novelty in this aspect.
Packet error rate: Figure 7 illustrates the relationship between Packet Error Rate (PER) and the number of nodes in a LoRaWAN network. It can be easily observed that the minimum PER is achieved by the RL-based algorithm, while RT-LoRa and LoRa+ yield higher values. The low PER of the RL-based algorithm is mainly due to its dynamic optimization of scheduling and resource allocation, which greatly reduces packet collisions, interference, and transmission errors.
Overall, Figure 7 emphasizes the correlation between node density and packet error rates, showcasing the effectiveness of the proposed RL-based algorithm in maintaining low error rates in a congested network. The proposed RL-based model exhibits the lowest Packet Error Rate (PER) at approximately 5%, outperforming all other models. In comparison, RT-LoRa has the highest PER at around 15%, while LoRa+ shows a moderate value of approximately 10%. The strength of the RL-based model lies in its ability to minimize errors, demonstrating its robustness in maintaining data integrity. No weaknesses were observed in this metric.
Throughput: Figure 8 illustrates the relationship between throughput and the number of nodes in a LoRaWAN network. It shows that as the number of nodes increases, the throughput achieved by the RL-based algorithm (DQN) remains significantly higher compared to other algorithms like RT-LoRa and LoRa+. This superior performance is attributed to the RL-based algorithm's dynamic optimization of scheduling decisions, which effectively balances network load and minimizes collisions, resulting in enhanced data transmission rates even as node density increases.
The proposed RL-based model achieves the highest throughput, approximately 250 kbps, significantly outperforming all other models. In contrast, RT-LoRa has the lowest throughput at around 150 kbps, while LoRa+ achieves a throughput of approximately 200 kbps. The strength of the RL-based model lies in its ability to maximize network efficiency by optimizing resource allocation.
To compare the performance of the proposed RL-based task scheduling algorithm with existing algorithms across several QoS metrics (delay, PDR, PER, and throughput), the following statistical tests are recommended:
Validate the sample size: A power analysis is used to check whether the number of simulation runs is sufficient for reliable results; if not, additional simulation runs are necessary.
Assess the efficiency of feature selection
Train two versions of the RL-based model:
| Table 4: Summary of statistical tests. | |||
| Analysis | Test used | Objective | Interpretation |
| Validate sample size | Power analysis | Ensure sufficient simulation runs for reliable results | Sample size meets/exceeds required value → Reliable; otherwise → Increase simulations |
| Assess feature selection efficiency | Paired t-test or Wilcoxon | Determine if selected QoS metrics improve model performance | p < 0.05: Feature selection improves performance; p ≥ 0.05: No significant improvement |
| Compare model performance | ANOVA + Tukey’s HSD | Identify significant differences in QoS metrics among models | p < 0.05: Significant differences exist; post-hoc tests identify which models differ |
By applying these statistical tests to the study's data and simulation results (Figures 5-8), the findings can be validated rigorously, providing evidence for the conclusions regarding sample size adequacy, feature selection efficiency, and differences in model performance.
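A minimal sketch of how the tests in Table 4 could be applied to per-run delay samples is shown below, using SciPy and statsmodels. The sample arrays are placeholders generated at random and do not represent results from this study.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Placeholder per-run delay samples (30 assumed simulation runs per model).
rl_delay = np.random.normal(40, 5, 30)
rtlora_delay = np.random.normal(120, 10, 30)
loraplus_delay = np.random.normal(80, 8, 30)

# Paired t-test (or Wilcoxon if normality is doubtful): full vs. reduced feature set.
reduced_model_delay = np.random.normal(45, 6, 30)
t_stat, p_paired = stats.ttest_rel(rl_delay, reduced_model_delay)
w_stat, p_wilcoxon = stats.wilcoxon(rl_delay, reduced_model_delay)

# One-way ANOVA across the three models, followed by Tukey's HSD post-hoc test.
f_stat, p_anova = stats.f_oneway(rl_delay, rtlora_delay, loraplus_delay)
values = np.concatenate([rl_delay, rtlora_delay, loraplus_delay])
labels = ["RL-DQN"] * 30 + ["RT-LoRa"] * 30 + ["LoRa+"] * 30
print(pairwise_tukeyhsd(values, labels, alpha=0.05))
```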
The performance of the proposed RL-based model is compared with related models (RT-LoRa and LoRa+) across all QoS metrics (delay, PDR, PER, and throughput). The analysis is based on the results presented in Figures 5-8 and summarized in Table 5.
| Table 5: Performance comparison table. | |||||
| Metric | RT-LoRa | LoRa+ | Proposed RL-Based Model | Strengths of Proposed Model | Weaknesses of Proposed Model |
| Delay | Highest delay (~ 120 ms), particularly in high-density scenarios. | Moderate delay (~ 80 ms), but higher than RL-based model. | Achieves the lowest delay (~ 40 ms) across all node densities (Figure 5). | Superior performance in minimizing delay, making it ideal for time-sensitive IoT applications. | None observed for this metric. |
| Packet Delivery Ratio (PDR) | Similar PDR (~ 85%), but less consistent in dense networks. | Slightly higher PDR (~ 87%) compared to RL-based model. | Competitive PDR (~ 85%) across all node densities (Figure 6). | Maintains high reliability in data delivery while optimizing other metrics. | Marginal improvement over existing models; limited novelty in this aspect. |
| Packet Error Rate (PER) | Highest PER (~ 15%), especially in dense networks. | Moderate PER (~ 10%), higher than RL-based model. | Lowest PER (~ 5%) across all node densities (Figure 7). | Demonstrates robustness in maintaining data integrity under varying network conditions. | None observed for this metric. |
| Throughput | Lowest throughput (~ 150 kbps), especially in dense networks. | Moderate throughput (~ 200 kbps), lower than RL-based model. | Achieves the highest throughput (~ 250 kbps) across all node densities (Figure 8). | Maximizes network efficiency through intelligent resource allocation and scheduling decisions. | None observed for this metric. |
This study presents a reinforcement learning (RL)-based task scheduling algorithm designed to enhance Quality of Service (QoS) in LoRaWAN-based IoT networks. By leveraging a Deep Q-Network (DQN), the proposed model dynamically learns and adapts to changing network conditions, effectively optimizing key performance metrics such as delay, throughput, packet delivery ratio (PDR), and packet error rate (PER). Unlike traditional scheduling techniques, which often rely on static heuristics or centralized logic, the RL agent autonomously adjusts its actions through continuous interaction with the network environment, demonstrating superior responsiveness and scalability. The proposed solution marks a significant step forward in intelligent IoT network management by introducing a context-aware and self-optimizing scheduler. It highlights the potential of RL to serve as a flexible and powerful alternative to conventional methods, especially in resource-constrained and dynamic LPWAN environments. This adaptability is critical for supporting the next generation of IoT applications, including smart cities, agriculture, industrial monitoring, and remote healthcare.
Despite its promising performance, this work has several limitations. The evaluation was conducted exclusively in a simulated environment using NS-3, which, while realistic, cannot fully capture the unpredictability of real-world deployments, including environmental interference, hardware variability, and packet collisions from non-LoRaWAN devices. The model assumes a stable network topology and complete observability, which may not be feasible in large-scale or mobile settings. Additionally, the algorithm does not explicitly incorporate energy efficiency into its decision-making process—an essential factor in battery-powered LPWAN applications. Addressing these limitations through real-world validation, energy-aware policy design, and support for partially observable or mobile scenarios remains an important area for future work.
To further enhance the effectiveness and applicability of the proposed RL-based task scheduling algorithm for LoRaWAN, several directions are recommended. Future research should explore advanced reinforcement learning techniques such as Double DQN, Dueling DQN, and Proximal Policy Optimization (PPO) to improve learning stability and decision accuracy in dynamic environments. Additionally, incorporating energy consumption into the reward function is essential to extend the operational lifetime of battery-powered IoT nodes. Expanding the model to support multi-gateway and mobile node scenarios through multi-agent reinforcement learning (MARL) can improve scalability and adaptability in real-world deployments. Addressing partial observability using RNNs or POMDP-based methods will also enhance robustness in large-scale networks with incomplete information. Finally, implementing and evaluating the algorithm in real-world testbeds like The Things Network (TTN) or ChirpStack will provide valuable insights into practical challenges such as interference, hardware limitations, and protocol integration.