Abstract: In battlefield communication confrontation, the rational allocation of interference parameters has long been a challenging task. Based on Deep Reinforcement Learning (DRL), this paper allocates the interference power, interference waveform, and interference target for the jammer, reducing resource consumption and improving resource utilization while ensuring the effectiveness of the interference. Specifically, the interference parameter allocation problem is formulated as a fully cooperative multi-agent task, and the SA (Stochastic Attention)-QMIX (Q-value based Mixing) algorithm is adopted to mitigate the high decision-making dimensionality of multi-agent scenarios. By introducing the maximum-entropy method and a multi-head attention mechanism into the QMIX algorithm, the agents can make more effective collaborative decisions in partially observable environments. Simulation results show that, compared with the traditional QMIX algorithm, SA-QMIX increases the interference success rate by 5% while reducing the interference power by 1.5 dB, and it converges approximately 40% faster.
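The abstract does not specify the network architecture, so as a purely illustrative sketch the following shows one way the two named ingredients could fit together: multi-head attention over per-agent state embeddings, whose output is squashed into non-negative weights so that the joint Q-value is monotone in each agent's individual Q-value (the core QMIX constraint). All array sizes, the weight construction, and the function names here are assumptions, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(queries, keys, values, n_heads):
    """Scaled dot-product attention split across n_heads heads.

    queries/keys/values: (n_agents, d_model) arrays; d_model must be
    divisible by n_heads. Returns an array of shape (n_agents, d_model).
    """
    n, d = queries.shape
    dh = d // n_heads
    out = np.empty_like(queries)
    for h in range(n_heads):
        s = slice(h * dh, (h + 1) * dh)
        q, k, v = queries[:, s], keys[:, s], values[:, s]
        scores = softmax(q @ k.T / np.sqrt(dh), axis=-1)  # (n, n) attention weights
        out[:, s] = scores @ v
    return out

rng = np.random.default_rng(0)
n_agents, d_model, n_heads = 3, 8, 2
state_emb = rng.normal(size=(n_agents, d_model))  # hypothetical per-agent embeddings
attended = multi_head_attention(state_emb, state_emb, state_emb, n_heads)

# QMIX-style monotonic mixing: absolute value keeps the mixing weights
# non-negative, so q_total is non-decreasing in every agent's Q-value.
w = np.abs(attended).mean(axis=1)      # one non-negative weight per agent
agent_qs = rng.normal(size=n_agents)   # per-agent Q-values (stand-ins)
q_total = float(w @ agent_qs)          # joint Q-value
print(q_total)
```

The non-negativity of `w` is what preserves QMIX's monotonicity guarantee; the stochastic (maximum-entropy) part of SA-QMIX would additionally regularize the agents' policies, which this sketch omits.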