| P.O.Box 8718, Beijing 100080, China | Journal of Software, Aug. 2005,16(8):1423-1430 |
| E-mail: jos@iscas.ac.cn | ISSN 1000-9825, CODEN RUXUEW, CN 11-2560/TP |
| http://www.jos.org.cn | Copyright © 2005 by The Editorial Department of Journal of Software |
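Study on the Low-Dimensional Embedding and the Embedding Dimensionality of Manifold of High-Dimensional Data
Zhao Lianwei (赵连伟), Luo Siwei (罗四维), Zhao Yanchang (赵艳敞), Liu Yunhui (刘蕴辉)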
Abstract
Finding a meaningful low-dimensional structure embedded in a high-dimensional space is a classical problem. Isomap is a nonlinear dimensionality reduction method based on manifold theory. It can not only reveal the meaningful low-dimensional structure hidden in high-dimensional observation data, but also recover the underlying parameters of data lying on a low-dimensional submanifold. Isomap rests on the hypothesis that there is an isometric mapping between the data space and the parameter space, but this hypothesis has not been proved. In this paper, the existence of an isometric mapping between the manifold in the high-dimensional data space and the parameter space is proved. The intrinsic dimensionality of the high-dimensional data space is then distinguished from the manifold dimensionality, and it is proved that the intrinsic dimensionality is an upper bound on the manifold dimensionality when the high-dimensional space contains a toroidal manifold. Finally, an algorithm is proposed to judge whether an underlying toroidal manifold exists and to find it. Experimental results on multi-pose three-dimensional objects show that the method is effective.
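The isometry hypothesis mentioned above can be illustrated on a toy example. The following is only a minimal sketch, not the algorithm proposed in the paper: it assumes a hypothetical data set of points on a unit semicircle (parameterized by the angle t) and a hypothetical helper `geodesic_distances` that approximates distances along the manifold with a k-nearest-neighbor graph and Floyd-Warshall shortest paths, as in the first stages of Isomap. Under these assumptions, the graph distance between the two endpoints should be close to their distance in parameter space, while the straight-line distance in the data space is not.

```python
import math

def geodesic_distances(points, k):
    """Approximate geodesics on a sampled manifold (hypothetical helper).

    Builds a symmetric k-nearest-neighbor graph weighted by Euclidean
    distance, then runs Floyd-Warshall to get all-pairs shortest paths.
    """
    n = len(points)
    euclid = [[math.dist(p, q) for q in points] for p in points]
    inf = float("inf")
    graph = [[inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        # The point itself sorts first (distance 0), so skip it.
        neighbors = sorted(range(n), key=lambda j: euclid[i][j])[1:k + 1]
        for j in neighbors:
            graph[i][j] = graph[j][i] = euclid[i][j]
    # Floyd-Warshall: relax every pair through every intermediate node.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via
    return graph, euclid

# Assumed toy data: 50 points on a unit semicircle, angle t in [0, pi].
params = [math.pi * i / 49 for i in range(50)]
points = [(math.cos(t), math.sin(t)) for t in params]

geo, euclid = geodesic_distances(points, k=2)
param_gap = abs(params[-1] - params[0])  # distance in parameter space

print("graph geodesic:", geo[0][-1])
print("Euclidean     :", euclid[0][-1])
print("parameter gap :", param_gap)
```

The choice of k=2 is an assumption that suits this evenly sampled curve; on real data the neighborhood size has to be tuned so that the graph stays connected without short-circuiting across the manifold.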
Zhao LW, Luo SW, Zhao YC, Liu YH. Study on the low-dimensional embedding and the embedding dimensionality of manifold of high-dimensional data. Journal of Software, 2005,16(8):1423-1430.
DOI: 10.1360/jos161423
http://www.jos.org.cn/1000-9825/16/1423.htm
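Chinese abstract (translated)
Finding a meaningful low-dimensional embedding of a manifold in a high-dimensional data space is a classical problem. Isomap is an effective nonlinear dimensionality reduction method based on manifold theory; it can reveal the intrinsic structure of high-dimensional observation data and also discover the underlying low-dimensional parameter space. Isomap rests on the assumption that an isometric mapping exists between the high-dimensional data space and the low-dimensional parameter space, but this assumption has not been proved. This paper first proves the existence of an isometric mapping between a continuous manifold of high-dimensional data and the low-dimensional parameter space. It then distinguishes the embedding-space dimensionality, the intrinsic dimensionality of the high-dimensional data space, and the manifold dimensionality, and proves that when the high-dimensional data space contains a toroidal manifold, the dimensionality of the parameter space is smaller than that of the embedding space. Finally, an algorithm for discovering toroidal manifolds is proposed; it judges whether a toroidal manifold exists in the high-dimensional data space and then estimates the intrinsic dimensionality and the latent-space dimensionality. Experiments on multi-pose three-dimensional objects demonstrate the effectiveness of the algorithm and yield the correct low-dimensional parameter space.
Foundation item: Supported by the National Natural Science Foundation of China under Grant No.60373029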