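Multi-label Learning by Exploiting Causal Order of Labels

About the authors:

Chen Jialue (陈加略, born 1991), male, Ph.D., CCF student member. His main research interests are machine learning and data mining.
Jiang Yuan (姜远, born 1976), female, Ph.D., professor, doctoral supervisor, CCF professional member. Her main research interests are machine learning and data mining.

Corresponding author:

Jiang Yuan, E-mail: jiangyuan@nju.edu.cn

Funding:

National Natural Science Foundation of China (61673201, 61921006)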

Abstract:

In multi-label learning (MLL) problems, each example is associated with a set of labels. To train a predictor that performs well on unseen examples, exploiting the relations between labels is crucial. Most existing studies reduce these relations to correlations among labels, typically estimated from label co-occurrence. This study shows that causal relations are more essential for describing how one label can help another during learning. Based on this observation, two strategies are proposed to generate causal orders of labels from the label causal directed acyclic graph (DAG), both following the constraint that a cause label must precede its effect label. The first strategy sorts a random order so that it satisfies the cause-effect relations in the DAG. The second strategy places the labels into disjoint topological levels according to the structure of the DAG and then sorts them by these levels. Further, by incorporating the causal orders into the classifier chain (CC) model, an effective MLL approach is proposed that exploits label relations from a more essential perspective. Experimental results on multiple datasets validate that the approach extracts useful causal orders of labels and that these orders help improve learning performance.
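The two ordering strategies can be illustrated with a small sketch. The abstract gives only the main idea of each strategy, so everything below is an assumption made for illustration and not the authors' algorithm: the function names, the toy DAG, the way the random order is repaired (each label's causes are placed just before it, in shuffled rank order), and the tie-breaking inside a level (input order) are all hypothetical. The DAG is assumed to be given as `(cause, effect)` pairs and to contain no cycles.

```python
import random


def is_causal_order(order, edges):
    """Return True if every cause label appears before its effect label."""
    pos = {label: i for i, label in enumerate(order)}
    return all(pos[cause] < pos[effect] for cause, effect in edges)


def parents_of(labels, edges):
    """Map each label to the list of its direct causes."""
    parents = {label: [] for label in labels}
    for cause, effect in edges:
        parents[effect].append(cause)
    return parents


def repair_random_order(labels, edges, rng):
    """Strategy 1 (sketch): shuffle the labels, then emit them in shuffled
    order while forcing each label's causes to be emitted first."""
    parents = parents_of(labels, edges)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    rank = {label: i for i, label in enumerate(shuffled)}
    order, visited = [], set()

    def place(label):
        if label in visited:
            return
        visited.add(label)  # safe only because the graph is assumed acyclic
        for cause in sorted(parents[label], key=rank.get):
            place(cause)
        order.append(label)

    for label in shuffled:
        place(label)
    return order


def topological_levels(labels, edges):
    """Strategy 2 (sketch): level 0 for labels without causes, otherwise
    one more than the deepest direct cause."""
    parents = parents_of(labels, edges)
    level = {}

    def depth(label):
        if label not in level:
            level[label] = 1 + max((depth(c) for c in parents[label]), default=-1)
        return level[label]

    for label in labels:
        depth(label)
    return level


def level_order(labels, edges):
    """Sort labels by topological level; ties keep the input order."""
    level = topological_levels(labels, edges)
    return sorted(labels, key=lambda label: level[label])


# Hypothetical toy DAG: a causes b and c, which both cause d.
labels = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]

print(repair_random_order(labels, edges, random.Random(0)))
print(level_order(labels, edges))  # ['a', 'b', 'c', 'd']
```

Either function returns an order in which causes precede effects. Mapped to label column indices, such an order is the kind of input a classifier chain implementation accepts, for example through the `order` parameter of scikit-learn's `ClassifierChain`.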

Cite this article

Chen Jialue, Jiang Yuan. Multi-label Learning by Exploiting Causal Order of Labels. Journal of Software, 2022, 33(4): 1267-1273

History
  • Received: 2021-05-29
  • Last revised: 2021-07-16
  • Published online: 2021-10-26
  • Publication date: 2022-04-06