Sensitive data leakage risk prediction driven by data explicit and implicit relationships

Affiliation:

1. Electric Power Research Institute, State Grid Chongqing Electric Power Company, Chongqing 401123, Chongqing 400014, China; 2. Digitization Department, State Grid Chongqing Electric Power Company, Chongqing 400014, China

Author biographies:

Liang Hua (梁花) (1990-), female, master's degree, engineer. Main research interest: data security protection systems. Email: stategridcqlianghua@outlook.com.
Jin Min (靳敏) (1989-), female, master's degree, engineer. Main research interest: network security attack-and-defense technology and system construction.
Yan Hua (严华) (1986-), male, bachelor's degree, engineer. Main research interests: data security and storage media security.
Han Shihai (韩世海) (1975-), male, bachelor's degree, senior engineer. Main research interest: network security attack technology.
Li Wei (李玮) (1989-), male, bachelor's degree, engineer. Main research interest: data security rescue and destruction technology.

Funding:

Key R&D Project of State Grid Chongqing Electric Power Company (2023渝电科技59号)

    Abstract:

    With the rapid development of Internet of Things (IoT), big data, and artificial intelligence (AI) technologies, massive amounts of data are being generated and used on an unprecedented scale. These data contain a large amount of sensitive information, and storing it securely has become a pressing practical problem. Existing data storage schemes usually focus on protecting sensitive data directly and overlook the leakage risk created by explicit and implicit associations between sensitive and non-sensitive data. To address this, the paper analyzes the explicit and implicit relationships among data from the perspective of information entropy and proposes a method that quickly assesses these relationships and predicts the leakage risk of sensitive data. By introducing the information Lift Ratio (LR) and the Probability of Information Control (PIC), the method can effectively identify how non-sensitive data affect the leakage risk of sensitive data. In simulation experiments on the Statistical Property Dataset (SPD), the maximum single-attribute LR is 0.308, whereas the joint-attribute LR reaches up to 0.891; the detection probability of sensitive data leakage risk improves significantly, reaching up to 23.2%. The simulation results show that the method can effectively identify and respond to the security risks caused by explicit and implicit relationships, significantly improving the overall security level of sensitive data storage.
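
The abstract does not state the formula for LR, so the following is only a minimal sketch under an assumed definition: LR is taken here to be the fraction of the sensitive attribute's Shannon entropy that is removed by knowing one or more non-sensitive attributes, (H(S) - H(S|A)) / H(S). The function names and the toy data are hypothetical and are not taken from the paper. The example is meant to illustrate why a joint-attribute score can be much higher than any single-attribute score, which is the kind of implicit relationship the abstract describes.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(target, given):
    """H(target | given), estimated from paired samples."""
    n = len(target)
    groups = {}
    for t, g in zip(target, given):
        groups.setdefault(g, []).append(t)
    # Weight the entropy inside each group by the group's relative frequency.
    return sum(len(ts) / n * entropy(ts) for ts in groups.values())


def lift_ratio(sensitive, *attributes):
    """ASSUMED definition (not from the paper): (H(S) - H(S|A)) / H(S).

    Passing several attribute columns scores them jointly.
    """
    h_s = entropy(sensitive)
    if h_s == 0:
        return 0.0  # nothing to learn about a constant attribute
    joint = list(zip(*attributes))  # one tuple of attribute values per record
    return (h_s - conditional_entropy(sensitive, joint)) / h_s


# Hypothetical toy records: the sensitive bit is the XOR of two
# non-sensitive bits, so neither bit alone reveals anything.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
s = [x ^ y for x, y in zip(a, b)]

lr_a = lift_ratio(s, a)      # single attribute
lr_b = lift_ratio(s, b)      # single attribute
lr_ab = lift_ratio(s, a, b)  # joint attributes
```

In this toy case each attribute alone removes none of the uncertainty about the sensitive bit, while the pair removes all of it. Under the assumed definition, that is the extreme version of the gap the abstract reports between single-attribute and joint-attribute LR.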

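Cite this article

Liang Hua, Jin Min, Yan Hua, Han Shihai, Li Wei. Sensitive data leakage risk prediction driven by data explicit and implicit relationships[J]. Journal of Terahertz Science and Electronic Information Technology (太赫兹科学与电子信息学报), 2025, 23(5): 482-488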

History
  • Received: 2024-08-16
  • Last revised: 2024-10-17
  • Published online: 2025-06-05