Overviewing of visual analysis approaches for clustering high-dimensional data

doi:10.11996/JG.j.2095-302X.2020010044

Journal of Graphics

Computer Graphics and Virtual Reality

1. Beijing Key Laboratory of Big Data Technology for Food Safety, School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China;
2. School of Information Engineering, Wuhan University of Technology, Wuhan, Hubei 430070, China

Online: 2020-02-29 Published: 2020-03-11
Supported by:
National Key Research and Development Program of China (2018YFC1603602); National Natural Science Foundation of China (61972010); National Science and Technology Basic Work Special Project (2015FY111200)

Abstract

Visual analysis of data clustering uses visualization and interaction techniques to help users examine the clustering process and its results from multiple perspectives, and thereby discover structures and relationships hidden in the data. The "curse of dimensionality" of high-dimensional data, however, poses many challenges for cluster analysis, including setting model parameters, capturing data features, interpreting results, and presenting them visually. Starting from the problems encountered when clustering high-dimensional data, this paper first summarizes the data processing methods commonly used in that process and compares their performance. These methods can mitigate the curse of dimensionality to a large extent and help users uncover the clustering patterns present in the data. Because clustering results obtained with different data processing methods call for different exploration and analysis strategies when users interpret the internal structures and regularities they contain, the paper then groups the visual analysis approaches for clustering high-dimensional data from the past 10 years into two categories: approaches based on dimensionality reduction and approaches based on subspace clustering. Finally, the opportunities and challenges currently facing the field are discussed.

Key words: visual analysis, clustering, high-dimensional data, overview
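
The "curse of dimensionality" named in the abstract can be illustrated with a small experiment. The following is a minimal sketch, not taken from the paper: it assumes points drawn uniformly from the unit hypercube and measures how the gap between the nearest and the farthest neighbor of a query point shrinks, relative to the nearest distance, as the number of dimensions grows. The function name `relative_contrast` and all parameter values are illustrative assumptions.

```python
import math
import random


def relative_contrast(n_points, n_dims, rng):
    """Return (d_max - d_min) / d_min for the Euclidean distances from one
    random query point to n_points uniform random points in [0, 1]^n_dims.

    Hypothetical helper for illustration; not an API from the paper.
    """
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
contrasts = {d: relative_contrast(200, d, rng) for d in (2, 10, 100, 1000)}
for d, c in contrasts.items():
    print(f"dims={d:5d}  relative contrast={c:.3f}")
```

When the relative contrast is small, nearest and farthest neighbors are almost equally far away, so distance-based clustering in the full space loses discriminating power. This is one motivation for the two families of approaches the abstract distinguishes: reducing the dimensionality before clustering, or searching for clusters within subspaces.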

ZHANG Rong¹, CHEN Yi¹, ZHANG Meng-lu¹, MENG Ke-xin². Overviewing of visual analysis approaches for clustering high-dimensional data[J]. Journal of Graphics, DOI: 10.11996/JG.j.2095-302X.2020010044.

