Abstract: In battlefield communication confrontation, the rational allocation of interference parameters has long been a challenging task. Based on Deep Reinforcement Learning (DRL), this paper allocates the interference power, interference waveform, and interference target for the jammer, reducing resource consumption and improving resource utilization while ensuring the effectiveness of the interference. Specifically, the interference parameter allocation problem is formulated as a fully cooperative multi-agent task, and the SA (Stochastic Attention)-QMIX (Q-value based Mixing) algorithm is adopted to mitigate the high decision-making dimensionality of multi-agent scenarios. By introducing the maximum-entropy method and a multi-head attention mechanism into the QMIX algorithm, the agents can make more effective collaborative decisions in partially observable environments. Simulation results show that, compared with the traditional QMIX algorithm, SA-QMIX increases the interference success rate by 5% while reducing the interference power by 1.5 dB, and it converges approximately 40% faster.
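The abstract does not specify the network architecture, so as a purely illustrative sketch the following shows one way the two named ingredients could fit together: multi-head attention over per-agent state embeddings, whose output is squashed into non-negative weights so that the joint Q-value is monotone in each agent's individual Q-value (the core QMIX constraint). All array sizes, the weight construction, and the function names here are assumptions, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(queries, keys, values, n_heads):
    """Scaled dot-product attention split across n_heads heads.

    queries/keys/values: (n_agents, d_model) arrays; d_model must be
    divisible by n_heads. Returns an array of shape (n_agents, d_model).
    """
    n, d = queries.shape
    dh = d // n_heads
    out = np.empty_like(queries)
    for h in range(n_heads):
        s = slice(h * dh, (h + 1) * dh)
        q, k, v = queries[:, s], keys[:, s], values[:, s]
        scores = softmax(q @ k.T / np.sqrt(dh), axis=-1)  # (n, n) attention weights
        out[:, s] = scores @ v
    return out

rng = np.random.default_rng(0)
n_agents, d_model, n_heads = 3, 8, 2
state_emb = rng.normal(size=(n_agents, d_model))  # hypothetical per-agent embeddings
attended = multi_head_attention(state_emb, state_emb, state_emb, n_heads)

# QMIX-style monotonic mixing: absolute value keeps the mixing weights
# non-negative, so q_total is non-decreasing in every agent's Q-value.
w = np.abs(attended).mean(axis=1)      # one non-negative weight per agent
agent_qs = rng.normal(size=n_agents)   # per-agent Q-values (stand-ins)
q_total = float(w @ agent_qs)          # joint Q-value
print(q_total)
```

The non-negativity of `w` is what preserves QMIX's monotonicity guarantee; the stochastic (maximum-entropy) part of SA-QMIX would additionally regularize the agents' policies, which this sketch omits.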