Journal of Software, 2014, 25(S2): 90-100

Workload Analysis and Modeling of High Performance Computing Trace of Biological Gene Sequencing
CAO Zhi-Bo, DONG Shou-Bin, WANG Bing-Qiang, ZUO Li-Yun
(Key Laboratory of Communication and Computer Network, School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China; Shenzhen Huada Gene Research Institute, Shenzhen 518083, China)
Received: August 05, 2013    Revised: March 13, 2014
Abstract: Biological gene sequencing is one of the most common high-performance computing workloads in bioinformatics analysis. This paper analyzes a biological gene sequencing trace (BGST) to identify its workload characteristics and constructs a general workload model for biological gene sequencing (BGS) that can be applied to job scheduling and performance optimization on high-performance computing systems for gene sequencing. Based on the job trace, the study mainly analyzes the regularity of job arrival times, job runtime, and job parallelism (the parallel size of each job). From these characteristics, it builds local models using exponential, Gamma, and Gaussian (normal) distributions and linear fitting, and then proposes a method for fusing the local models into a single unified workload model. Evaluation with two general model evaluation methods shows that the four job attributes of the final model tend toward the same distributions as those of the original trace, which demonstrates the good generality of the model.
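As an illustration of what building one "local model" might look like, the following is a minimal sketch that fits an exponential distribution to job inter-arrival times and measures goodness of fit. The abstract does not state which distribution was applied to which attribute, nor which two evaluation methods were used, so the pairing of the exponential distribution with inter-arrival times and the use of the Kolmogorov-Smirnov distance are assumptions made here for illustration only. The data are synthetic and the function names are hypothetical.

```python
import math
import random


def fit_exponential(samples):
    """Maximum-likelihood rate of an exponential distribution: 1 / sample mean."""
    return len(samples) / sum(samples)


def exponential_cdf(rate):
    """Return the CDF of an exponential distribution with the given rate."""
    return lambda x: 1.0 - math.exp(-rate * x)


def ks_statistic(samples, cdf):
    """One-sample Kolmogorov-Smirnov distance between the empirical CDF and cdf."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Compare the model CDF with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Synthetic inter-arrival times in seconds (assumption: NOT data from the paper).
rng = random.Random(0)
interarrivals = [rng.expovariate(1 / 30.0) for _ in range(2000)]

rate = fit_exponential(interarrivals)
d = ks_statistic(interarrivals, exponential_cdf(rate))
print(f"fitted mean gap: {1 / rate:.1f} s, KS distance: {d:.3f}")
```

A small KS distance suggests that the fitted distribution tracks the empirical one closely. The same pattern could be repeated for runtime and parallelism with other distributions before combining the local models.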
Foundation items: National Natural Science Foundation of China (61070092); Guangzhou Science and Technology Program (2012Y2-00043, 2013Y2-00041)
Reference text:

CAO Zhi-Bo, DONG Shou-Bin, WANG Bing-Qiang, ZUO Li-Yun. Workload Analysis and Modeling of High Performance Computing Trace of Biological Gene Sequencing. Journal of Software, 2014, 25(S2): 90-100