Matching Combined Heterogeneous Multi-Agent Reinforcement Learning for Resource Allocation in NOMA-V2X Networks
Authors: Gao, A., Zhu, Z., Zhang, J., Liang, W. and Hu, Y.
Journal: IEEE Transactions on Vehicular Technology
Volume: 73
Issue: 10
Pages: 15109-15124
eISSN: 1939-9359
ISSN: 0018-9545
DOI: 10.1109/TVT.2024.3409048
Abstract: To combat spectrum scarcity, non-orthogonal multiple access (NOMA) and vehicle-to-everything (V2X) systems are integrated as NOMA-V2X networks, where multiple vehicle-to-vehicle (V2V) links can opportunistically reuse the spectrum licensed to vehicle-to-infrastructure (V2I) links. However, the contradictory quality of service (QoS) requirements of V2V and V2I links make designing an effective spectrum sharing scheme a great challenge, and the high mobility of vehicles and the extra interference introduced by NOMA make the issue more complex. Hence, a matching combined multi-agent deep deterministic policy gradient (MADDPG) algorithm is proposed in this paper to maximize the sum delivery rate of V2I up-links while guaranteeing the reliability of V2V links in NOMA-V2X networks. Specifically, the channel assignment is first solved by one-to-many matching, which is theoretically proved to converge to a stable state. Heterogeneous MADDPG is then adopted to obtain proper power control for the V2I link pairs and V2V links, which are treated as different types of agents and interact with the environment independently. On this basis, a fully decentralized framework is designed for the proposed algorithm to reduce the communication overhead caused by information synchronization. Simulation results demonstrate that by introducing matching theory into deep reinforcement learning (DRL), the sum delivery rate of V2I links can be greatly improved with lower computational complexity and shorter convergence time. Moreover, compared with orthogonal multiple access (OMA) communications, the outstanding energy and spectrum efficiency make it worthwhile to explore NOMA in V2X networks.
https://eprints.bournemouth.ac.uk/39914/
Source: Scopus
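The abstract's channel-assignment step relies on one-to-many matching that converges to a stable state. As a rough illustration of that idea (not the paper's actual algorithm), the following is a minimal sketch of deferred-acceptance matching with quotas; all names, preference lists, and quota values are hypothetical.

```python
# Illustrative sketch only: generic one-to-many deferred-acceptance
# matching (Gale-Shapley with quotas), loosely mirroring the channel
# assignment step mentioned in the abstract. Preferences and quotas
# below are made-up toy values, not taken from the paper.

def one_to_many_match(proposer_prefs, holder_prefs, quotas):
    """Match proposers (e.g. V2V links) to holders (e.g. V2I channels).

    proposer_prefs: {proposer: [holders, best first]}
    holder_prefs:   {holder: [proposers, best first]}
    quotas:         {holder: max proposers it can accept}
    Returns {holder: [accepted proposers]} -- a stable assignment.
    """
    # rank[h][p]: position of proposer p in holder h's list (lower = better)
    rank = {h: {p: i for i, p in enumerate(prefs)}
            for h, prefs in holder_prefs.items()}
    matched = {h: [] for h in holder_prefs}
    free = list(proposer_prefs)               # proposers awaiting acceptance
    next_choice = {p: 0 for p in proposer_prefs}

    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue                          # p exhausted its preference list
        h = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        matched[h].append(p)
        if len(matched[h]) > quotas[h]:
            # over quota: evict the least-preferred proposer currently held
            worst = max(matched[h], key=lambda q: rank[h][q])
            matched[h].remove(worst)
            free.append(worst)
    return matched

# Toy example: 3 V2V links competing for 2 channels, quota 2 each.
prefs_v2v = {"v1": ["c1", "c2"], "v2": ["c1", "c2"], "v3": ["c1", "c2"]}
prefs_ch = {"c1": ["v3", "v1", "v2"], "c2": ["v2", "v1", "v3"]}
result = one_to_many_match(prefs_v2v, prefs_ch, {"c1": 2, "c2": 2})
# c1 keeps its two most-preferred proposers (v3, v1); v2 falls to c2.
```

The loop terminates because each proposer advances down its preference list at most once per rejection, which is also the standard argument for convergence to a stable matching in deferred-acceptance schemes.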
Matching Combined Heterogeneous Multi-Agent Reinforcement Learning for Resource Allocation in NOMA-V2X Networks
Authors: Gao, A., Zhu, Z., Zhang, J., Liang, W. and Hu, Y.
Journal: IEEE Transactions on Vehicular Technology
Volume: 73
Issue: 10
Pages: 15109-15124
eISSN: 1939-9359
ISSN: 0018-9545
DOI: 10.1109/TVT.2024.3409048
https://eprints.bournemouth.ac.uk/39914/
Source: Web of Science (Lite)
Matching Combined Heterogeneous Multi-Agent Reinforcement Learning for Resource Allocation in NOMA-V2X Networks
Authors: Gao, A., Zhu, Z., Zhang, J., Liang, W. and Hu, Y.
Journal: IEEE Transactions on Vehicular Technology
Publisher: IEEE
ISSN: 0018-9545
https://eprints.bournemouth.ac.uk/39914/
Source: Manual
Matching Combined Heterogeneous Multi-Agent Reinforcement Learning for Resource Allocation in NOMA-V2X Networks
Authors: Gao, A., Zhu, Z., Zhang, J., Liang, W. and Hu, Y.
Journal: IEEE Transactions on Vehicular Technology
Volume: 73
Issue: 10
Pages: 15109-15124
Publisher: IEEE
ISSN: 0018-9545
https://eprints.bournemouth.ac.uk/39914/
Source: BURO EPrints