Journal of Software, 2009, 20(7): 1735-1745

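Latent Attribute Space Tree Classifiers
何萍 (He Ping), 徐晓华 (Xu Xiaohua), 陈崚 (Chen Ling)
(Department of Computer Science and Engineering, College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, Jiangsu, China; Department of Computer Science and Engineering, College of Information Engineering, Yangzhou University, Yangzhou 225009, Jiangsu, China)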
Received: May 28, 2007    Revised: March 06, 2008
Abstract: This paper proposes a framework for latent attribute space tree classifiers (LAST). LAST transforms the data from the original attribute space into a latent attribute space in which the data are easier to separate or which better suits decision-tree classification. This overcomes the decision-boundary limitation of traditional decision tree algorithms and improves the generalization performance of the tree classifier. Within the LAST framework, two SVD (singular value decomposition) oblique decision tree (SODT) algorithms are presented. SODT first performs SVD on the global or local data to construct an orthogonal latent attribute space, then builds a traditional univariate decision tree, or individual tree nodes, in that space, and thereby indirectly obtains an approximately optimal oblique decision tree in the original space. SODT handles datasets whose global and local data follow either the same or different distributions, makes full use of the structural information in both labelled and unlabelled data, and produces classification results that do not depend on the order of the samples. Its time complexity is the same as that of univariate decision tree algorithms. Experiments on complex datasets show that, compared with the traditional univariate decision tree algorithm C4.5 and the oblique decision tree algorithms OC1 and CART-LC, SODT achieves higher classification accuracy, more stable decision tree size, and more robust overall classification performance. Its tree-construction time is close to that of C4.5 and much shorter than that of OC1 and CART-LC.
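The core idea of the abstract, that an axis-parallel split in an SVD-derived latent space corresponds to an oblique split in the original space, can be illustrated with a minimal sketch. This is not the authors' SODT implementation: it handles only two attributes and two classes, builds a single tree node (a decision stump) rather than a full tree, applies the SVD to uncentered data, and all function names and the toy dataset are hypothetical. The right singular vectors are obtained as the eigenvectors of the 2 x 2 matrix X^T X, which has a closed form.

```python
import math

def right_singular_vectors_2d(X):
    """Right singular vectors of an n x 2 data matrix X (rows are samples).

    They equal the eigenvectors of the symmetric 2 x 2 matrix X^T X,
    computed here in closed form (assumption: exactly two attributes).
    """
    a = sum(x * x for x, _ in X)
    b = sum(x * y for x, y in X)
    c = sum(y * y for _, y in X)
    theta = 0.5 * math.atan2(2.0 * b, a - c)  # angle of the leading direction
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))  # orthogonal to v1
    return v1, v2

def to_latent(X, basis):
    """Project every sample onto the orthogonal latent axes."""
    return [tuple(x * v[0] + y * v[1] for v in basis) for x, y in X]

def best_stump(Z, labels):
    """Best single univariate split for 0/1 labels.

    Returns (errors, attribute index, threshold), where each side of the
    split predicts its majority class.
    """
    best = (len(Z) + 1, None, None)
    for j in range(len(Z[0])):
        values = sorted(set(row[j] for row in Z))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [lab for row, lab in zip(Z, labels) if row[j] <= t]
            right = [lab for row, lab in zip(Z, labels) if row[j] > t]
            errors = sum(min(s.count(0), s.count(1)) for s in (left, right))
            if errors < best[0]:
                best = (errors, j, t)
    return best

# Hypothetical toy data: class 0 lies below the diagonal y = x, class 1 above it.
X = [(1, 0), (2, 1), (3, 2), (4, 3), (0, 1), (1, 2), (2, 3), (3, 4)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Univariate split in the original attribute space.
orig_errors, _, _ = best_stump(X, labels)

# Univariate split in the latent space spanned by the right singular vectors.
basis = right_singular_vectors_2d(X)
latent_errors, attr, threshold = best_stump(to_latent(X, basis), labels)

print("errors in original space:", orig_errors)
print("errors in latent space:", latent_errors)
print("oblique split: %.3f*x + %.3f*y <= %.3f"
      % (basis[attr][0], basis[attr][1], threshold))
```

On this toy data no single axis-parallel split separates the two classes in the original space, whereas a single split on the second latent attribute does. Written back in the original attributes, that split is a linear combination of x and y, that is, an oblique decision boundary.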
Foundation items: Supported by the National Natural Science Foundation of China under Grant No. 60673060; the Natural Science Foundation of Jiangsu Province of China under Grant No. BK2008206
Reference text:

何萍 (He Ping), 徐晓华 (Xu Xiaohua), 陈崚 (Chen Ling). Latent Attribute Space Tree Classifiers. Journal of Software, 2009, 20(7): 1735-1745