Journal of Software, 2017, 28(11): 2814-2824

Algorithms for Accurate Decision Tree Generation on Inconsistent Data
WANG He-Peng (王鹤澎), WANG Hong-Zhi (王宏志), LI Jian-Zhong (李建中), GAO Hong (高宏)
(Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150006, China)
Received:April 15, 2017    Revised:June 16, 2017
Abstract: In recent years, as the volume of real-world data has grown, inconsistent data has become increasingly common, which makes manual correction of inconsistent data more time-consuming. Moreover, manual correction is itself prone to unavoidable human error, so it is no longer feasible. The core question of this paper is how to perform classification directly on inconsistent data without repairing the data beforehand. The objective function of the decision tree generation algorithm is improved so that it can classify inconsistent data directly and achieve good classification results. The influence on the classification results of the features that appear in the constraints is measured from multiple aspects, and the influence factor of each such feature is adjusted accordingly, so that the nodes of the decision tree are split more accurately and classification is more effective.
Keywords: inconsistent data; decision tree; classification; massive data
Foundation items:National Natural Science Foundation of China (U1509216, 61472099); National Key Technology R&D Program of China (2015BAH10F01)
Reference text:

WANG He-Peng, WANG Hong-Zhi, LI Jian-Zhong, GAO Hong. Algorithms for Accurate Decision Tree Generation on Inconsistent Data. Journal of Software, 2017, 28(11): 2814-2824