Journal of Software, 2017, 28(4): 925-939

Loop Vectorization Method Guided by SIMD Parallelism
GAO Wei, HAN Lin, ZHAO Rong-Cai, XU Jin-Long, CHEN Chao-Ran
(State Key Laboratory of Mathematical Engineering and Advanced Computing (PLA Information Engineering University), Zhengzhou 450000, China; PLA Air Defense Forces Command College, Zhengzhou 450000, China)
Received: April 12, 2015    Revised: July 31, 2015
Abstract: SIMD extensions are acceleration units integrated into general-purpose processors to exploit data-level parallelism in multimedia and scientific computing programs. The two basic vectorization approaches are the loop-based method, which exploits inter-iteration parallelism, and the SLP (superword-level parallelism) method, which exploits intra-iteration parallelism. The loop-aware method improves on SLP: it first unrolls the loop to convert inter-iteration parallelism into intra-iteration parallelism, so that the loop body contains enough isomorphic statements, and then applies SLP. However, when loop unrolling is illegal or the available parallelism is lower than the vector factor, the loop-aware method cannot exploit the SIMD parallelism of the program. To address this drawback, a loop vectorization method guided by SIMD parallelism is proposed. A scheme for selecting the vectorization method is constructed from the inter-iteration parallelism, the intra-iteration parallelism, and the vector factor. In addition, insufficient vectorization is proposed to vectorize loops whose parallelism is lower than the vector factor. Finally, the generated vector loop is unrolled according to its SIMD parallelism. Tests on standard benchmarks show that, compared with the loop-aware method, the proposed method raises the vectorization recognition rate by 107.5% and improves performance by 12.1%.
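The two situations the abstract contrasts can be illustrated with a small sketch. It is not the authors' implementation: the vector factor `VF = 4`, the function names, and the example loops are assumptions chosen for illustration, and plain Python lists stand in for SIMD registers. The first pair of functions shows how unrolling by the vector factor yields isomorphic statements that SLP could pack. The second pair shows a loop with dependence distance 2, where only two consecutive iterations are independent, so one possible reading of insufficient vectorization is to fill just two of the four lanes.

```python
VF = 4  # assumed vector factor: lanes per SIMD register (hypothetical value)


def scalar_add(a, b):
    """Reference loop c[i] = a[i] + b[i]; all iterations are independent."""
    c = [0] * len(a)
    for i in range(len(a)):
        c[i] = a[i] + b[i]
    return c


def unrolled_add(a, b):
    """Unroll by VF so the body holds VF isomorphic statements.

    An SLP pass could pack these four statements into one vector add.
    """
    n = len(a)
    c = [0] * n
    main = n - n % VF
    for i in range(0, main, VF):
        c[i] = a[i] + b[i]
        c[i + 1] = a[i + 1] + b[i + 1]
        c[i + 2] = a[i + 2] + b[i + 2]
        c[i + 3] = a[i + 3] + b[i + 3]
    for i in range(main, n):  # remainder iterations stay scalar
        c[i] = a[i] + b[i]
    return c


def scalar_recurrence(x, b):
    """x[i] = x[i-2] + b[i]: dependence distance 2, parallelism 2 < VF."""
    x = list(x)
    for i in range(2, len(x)):
        x[i] = x[i - 2] + b[i]
    return x


def partial_lane_recurrence(x, b, lanes=2):
    """Process `lanes` iterations per step, leaving VF - lanes lanes idle.

    All operands are read before any result is written, which mimics a
    single SIMD load/add/store. This is only legal when `lanes` does not
    exceed the dependence distance (2 here).
    """
    x = list(x)
    n = len(x)
    i = 2
    while i + lanes <= n:
        packed = [x[i - 2 + k] + b[i + k] for k in range(lanes)]
        x[i:i + lanes] = packed
        i += lanes
    for j in range(i, n):  # scalar remainder
        x[j] = x[j - 2] + b[j]
    return x
```

Calling `partial_lane_recurrence` with `lanes=2` reproduces the scalar result, whereas `lanes=4` (a full vector) reads elements before they have been updated and gives a different answer, which is the kind of case where full-width vectorization is not applicable.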
Foundation item: "Core Electronic Devices, High-end Generic Chips and Basic Software" (CHB) National Major Science and Technology Project of China (2009ZX01036)
Reference text:
GAO Wei, HAN Lin, ZHAO Rong-Cai, XU Jin-Long, CHEN Chao-Ran. Loop Vectorization Method Guided by SIMD Parallelism. Journal of Software, 2017, 28(4): 925-939