Fast Unsupervised Dimension Reduction Method Based on Maximum Entropy

CLC Number: TP18


    Abstract:

    High-dimensional data are common in real-world applications. However, such data usually contain a great deal of redundant and noisy information, which explains why many traditional clustering algorithms perform poorly on them. In practice, the cluster structure of high-dimensional data is often embedded in a lower-dimensional subspace, so dimension reduction has become a key technology for mining high-dimensional data. Among the many dimension reduction methods, graph-based methods have become a research hotspot. However, most graph-based dimension reduction algorithms suffer from two problems: (1) they need to compute or learn an adjacency graph, which has high computational complexity; (2) the downstream purpose of the reduction, such as clustering, is not considered during the reduction itself. To address these problems, a fast unsupervised dimension reduction algorithm based on maximum entropy, MEDR, is proposed. MEDR combines linear projection with the maximum entropy clustering model and uses an effective iterative optimization algorithm to find the potential optimal cluster structure of high-dimensional data embedded in a low-dimensional subspace. The algorithm does not require an adjacency graph as input, and its time complexity is linear in the size of the input data. Extensive experiments on real datasets show that, compared with traditional dimensionality reduction methods, MEDR finds a better projection matrix for mapping high-dimensional data into a low-dimensional subspace, so that the projected data are more amenable to cluster analysis.
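The abstract does not state MEDR's objective function or update rules, so the following is only a minimal sketch of the general idea it describes: alternating between maximum entropy (softmax) cluster memberships in the projected space and an update of a linear projection. It is not the authors' algorithm. The function names, the entropy weight `gamma`, the PCA initialization, and the choice to update the projection by maximizing the membership-weighted between-cluster scatter under an orthonormality constraint are all assumptions made for illustration.

```python
import numpy as np


def max_entropy_memberships(Z, centers, gamma):
    """Soft memberships minimizing sum_ij u_ij*d_ij + gamma*sum_ij u_ij*log(u_ij).

    With each row of U constrained to sum to 1, the closed-form minimizer is
    a row-wise softmax of -d_ij / gamma, where d_ij is the squared distance
    between point i and center j.
    """
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / gamma
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    U = np.exp(logits)
    return U / U.sum(axis=1, keepdims=True)


def medr_like_sketch(X, n_clusters, n_dims, gamma=1.0, n_iter=20, seed=0):
    """Hypothetical alternating scheme; NOT the published MEDR updates."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)          # center the data
    n, d = Xc.shape

    # Assumption: initialize the projection with the top PCA directions.
    _, vecs = np.linalg.eigh(Xc.T @ Xc)
    W = vecs[:, -n_dims:]

    Z = Xc @ W
    centers = Z[rng.choice(n, size=n_clusters, replace=False)]

    for _ in range(n_iter):
        # Step 1: maximum entropy memberships in the projected space.
        U = max_entropy_memberships(Z, centers, gamma)

        # Step 2: membership-weighted cluster means in the original space.
        s = U.sum(axis=0) + 1e-12    # soft cluster sizes
        M = (U.T @ Xc) / s[:, None]  # shape (n_clusters, d)

        # Step 3 (assumption): pick an orthonormal W that maximizes the
        # weighted between-cluster scatter trace(W^T Sb W).
        Sb = (M.T * s) @ M           # sum_j s_j * m_j m_j^T
        _, vecs = np.linalg.eigh((Sb + Sb.T) / 2)
        W = vecs[:, -n_dims:]

        # Step 4: re-project the data and the cluster means.
        Z = Xc @ W
        centers = M @ W

    return W, U
```

No adjacency graph is built in this sketch: each iteration costs time proportional to the number of samples (for fixed dimension and cluster count) plus one eigendecomposition of a d-by-d matrix, which is consistent in spirit with the linear-time claim above. A call such as `W, U = medr_like_sketch(X, n_clusters=3, n_dims=2)` returns a projection matrix with orthonormal columns and a soft membership matrix whose rows sum to 1.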

Get Citation

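Wang Jikui, Yang Zhengguo, Liu Xuewen, Yi Jihai, Li Bing, Nie Feiping. Fast Unsupervised Dimension Reduction Method Based on Maximum Entropy. Journal of Software (软件学报), 2023, 34(4): 1779-1795 (in Chinese)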

History
  • Received: February 22, 2021
  • Revised: May 19, 2021
  • Online: June 15, 2022