Journal of Software, 2016, 27(7): 1700-1714

Discovering Important Locations From Massive and Low-Quality Cell Phone Trajectory Data
ZHANG Zhi-Gang (章志刚), JIN Che-Qing (金澈清), WANG Xiao-Ling (王晓玲), ZHOU Ao-Ying (周傲英)
(Institute for Data Science and Engineering, School of Computer Science and Software Engineering, East China Normal University, Shanghai 200062, China)
Received: September 25, 2015    Revised: January 12, 2016
Abstract: Important locations are the places where people spend most of their time in daily life, such as home and workplace. Smartphones have become ubiquitous, and beyond calls and Internet access, the logs generated automatically when a phone connects to base stations are a valuable data source for mining user behavior patterns, including important location discovery. The task is challenging, however, because the trajectory data is massive, its positional accuracy is low, and cell phone users are diverse. This paper proposes a general framework to improve the usability of trajectory data. The framework consists of a state-based filtering module, which improves data usability, and an important-location mining module. Two distributed mining algorithms are designed on top of this framework: GPMA (grid-based parallel mining algorithm) and SPMA (station-based parallel mining algorithm). To further improve the accuracy and precision of the mining results, three optimizations are developed: (1) a multi-source data fusion technique that improves result accuracy; (2) an algorithm that identifies users who have no workplace; and (3) an algorithm that identifies people who work at night and corrects their important locations. Theoretical analysis and extensive experiments on real datasets show that the proposed algorithms are efficient and scalable, and that they yield more accurate results.
Keywords: low quality; trajectory mining; important locations; data correction
Foundation items: National Basic Research Program of China (973) (2012CB316203); National Natural Science Foundation of China (61370101, U1501252, 61532021); Innovation Program of Shanghai Municipal Education Commission (14ZZ045)
Reference text:

ZHANG Zhi-Gang, JIN Che-Qing, WANG Xiao-Ling, ZHOU Ao-Ying. Discovering Important Locations From Massive and Low-Quality Cell Phone Trajectory Data. Journal of Software, 2016, 27(7): 1700-1714.