Matching Combined Heterogeneous Multi-Agent Reinforcement Learning for Resource Allocation in NOMA-V2X Networks

Authors: Gao, A., Zhu, Z., Zhang, J., Liang, W. and Hu, Y.

Journal: IEEE Transactions on Vehicular Technology

Volume: 73

Issue: 10

Pages: 15109-15124

eISSN: 1939-9359

ISSN: 0018-9545

DOI: 10.1109/TVT.2024.3409048

Abstract:

To combat the spectrum scarcity, non-orthogonal multiple access (NOMA) and vehicle-to-everything (V2X) systems are integrated as NOMA-V2X networks, where multiple vehicle-to-vehicle (V2V) links can opportunistically reuse the spectrum licensed to vehicle-to-infrastructure (V2I) links. However, the contradictory quality of service (QoS) requirements among V2V and V2I links make the design of an effective spectrum sharing scheme to be a great challenge. The high mobility of vehicles and the extra interference brought by NOMA make the issue more complex. Hence, a matching combined multi-agent deep deterministic policy gradient (MADDPG) algorithm is proposed in the paper to maximize the sum delivery rate of V2I up-links while guaranteeing the reliability of V2V links in NOMA-V2X networks. In specific, the channel assignment is solved by one-to-many matching in advance which can be theoretically proved to converge to a stable state. While heterogeneous MADDPG is further adopted to obtain the proper power control for V2I link pairs and V2V links which are taken as different types of agents and interact with the environment independently. On the basis, a fully decentralized framework is designed for the proposed algorithm to reduce the communication overhead caused by the information synchronization. Simulation results demonstrate that by introducing matching theory into deep reinforcement learning (DRL), the sum delivery rate of V2I links can be greatly improved with less computation complexity and convergence time. Moreover, compared with orthogonal multiple access (OMA) communications, the outstanding energy and spectrum efficiency make it significant to explore NOMA in V2X networks.

https://eprints.bournemouth.ac.uk/39914/

Source: Scopus

Matching Combined Heterogeneous Multi-Agent Reinforcement Learning for Resource Allocation in NOMA-V2X Networks

Authors: Gao, A., Zhu, Z., Zhang, J., Liang, W. and Hu, Y.

Journal: IEEE Transactions on Vehicular Technology

Volume: 73

Issue: 10

Pages: 15109-15124

eISSN: 1939-9359

ISSN: 0018-9545

DOI: 10.1109/TVT.2024.3409048

https://eprints.bournemouth.ac.uk/39914/

Source: Web of Science (Lite)

Matching Combined Heterogeneous Multi-Agent Reinforcement Learning for Resource Allocation in NOMA-V2X Networks

Authors: Gao, A., Zhu, Z., Zhang, J., Liang, W. and Hu, Y.

Journal: IEEE Transactions on Vehicular Technology

Publisher: IEEE

ISSN: 0018-9545

https://eprints.bournemouth.ac.uk/39914/

Source: Manual

Matching Combined Heterogeneous Multi-Agent Reinforcement Learning for Resource Allocation in NOMA-V2X Networks

Authors: Gao, A., Zhu, Z., Zhang, J., Liang, W. and Hu, Y.

Journal: IEEE Transactions on Vehicular Technology

Volume: 73

Issue: 10

Pages: 15109-15124

Publisher: IEEE

ISSN: 0018-9545

Abstract:

To combat the spectrum scarcity, non-orthogonal multiple access (NOMA) and vehicle-to-everything (V2X) systems are integrated as NOMA-V2X networks, where multiple vehicle-to-vehicle (V2V) links can opportunistically reuse the spectrum licensed to vehicle-to-infrastructure (V2I) links. However, the contradictory quality of service (QoS) requirements among V2V and V2I links make the design of an effective spectrum sharing scheme to be a great challenge. The high mobility of vehicles and the extra interference brought by NOMA make the issue more complex. Hence, a matching combined multi-agent deep deterministic policy gradient (MADDPG) algorithm is proposed in the paper to maximize the sum delivery rate of V2I up-links while guaranteeing the reliability of V2V links in NOMA-V2X networks. In specific, the channel assignment is solved by one-to-many matching in advance which can be theoretically proved to converge to a stable state. While heterogeneous MADDPG is further adopted to obtain the proper power control for V2I link pairs and V2V links which are taken as different types of agents and interact with the environment independently. On the basis, a fully decentralized framework is designed for the proposed algorithm to reduce the communication overhead caused by the information synchronization. Simulation results demonstrate that by introducing matching theory into deep reinforcement learning (DRL), the sum delivery rate of V2I links can be greatly improved with less computation complexity and convergence time. Moreover, compared with orthogonal multiple access (OMA) communications, the outstanding energy and spectrum efficiency make it significant to explore NOMA in V2X networks.

https://eprints.bournemouth.ac.uk/39914/

Source: BURO EPrints