Malicious Node Detection Based on Semi-supervised Graph Representation Learning

Affiliation: School of Software, Xi'an Jiaotong University

Fund Project: National Natural Science Foundation of China (General Program, Key Program, Major Program)

    Abstract:

    In real-world scenarios, users of platforms such as e-commerce, consumer review, and social networking sites interact with one another in rich ways. Modeling these interactions as a graph and applying graph neural networks (GNNs) to detect malicious users has become a research trend in recent years. However, malicious users usually make up only a small fraction of all users, disguise their behavior, and are costly to label, which leads to data imbalance, data inconsistency, and label scarcity and limits the effectiveness of traditional GNN methods. This study proposes a malicious node detection method based on semi-supervised graph representation learning, which uses an improved GNN to learn node representations and classify the nodes in the graph. Specifically, a class-aware malicious node detection (CAMD) method is constructed, which introduces class-aware attention coefficients, an inconsistent GNN encoder, and a class-aware imbalance loss function to address data inconsistency and imbalance. Furthermore, because the detection performance of CAMD is limited when labels are scarce, a graph contrastive learning-based method, CAMD+, is proposed. It introduces data augmentation, self-supervised graph contrastive learning, and class-aware graph contrastive learning so that the model can learn more from unlabeled data and make full use of the scarce labels. Finally, extensive experiments on real-world datasets show that the proposed methods outperform all baseline methods and achieve good detection performance under different degrees of label scarcity.
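    The abstract names two generic ingredients that can be illustrated in isolation: a loss that compensates for class imbalance, and a contrastive objective over augmented views. The sketch below is not the CAMD or CAMD+ formulation, which the abstract does not specify. It assumes inverse-frequency class weights and a standard InfoNCE loss with cosine similarity, and every function name and parameter in it is hypothetical.

```python
import math


def class_weights(labels):
    """Inverse-frequency weights (an assumption): rarer classes weigh more."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class) over the labeled nodes."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE loss for one node.

    `anchor` and `positive` stand for embeddings of the same node under two
    augmented views; `negatives` are embeddings of other nodes.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy data: three benign nodes (class 0) and one malicious node (class 1).
labels = [0, 0, 0, 1]
weights = class_weights(labels)

# Predicted class probabilities when only the malicious node is misclassified.
minority_wrong = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
# Predicted class probabilities when only one benign node is misclassified.
majority_wrong = [[0.1, 0.9], [0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]

loss_minority_wrong = weighted_cross_entropy(minority_wrong, labels, weights)
loss_majority_wrong = weighted_cross_entropy(majority_wrong, labels, weights)

# Contrastive loss is low when the two views agree and high when they do not.
loss_aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

    With these toy inputs, missing the single malicious node costs more than missing one benign node, which is the effect an imbalance-aware loss is meant to produce. A class-aware contrastive variant would additionally use the available labels to choose positives, but the abstract gives no detail on how that is done.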

History
  • Received: 2023-05-10
  • Last revised: 2023-11-29
  • Accepted: 2024-04-19
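Copyright: Institute of Software, Chinese Academy of Sciences. ICP filing: 京ICP备05046678号-3
Address: No. 4 South Fourth Street, Zhongguancun, Haidian District, Beijing. Postal code: 100190
Tel: 010-62562563 Fax: 010-62562533 Email: jos@iscas.ac.cn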