Journal of Software:2020.31(6):1875-1888

Container Image Deduplication Method Based on Chunking Reuse of Multi-versions
LU Zhi-Gang (陆志刚), XU Ji-Wei (徐继伟), HUANG Tao (黄涛)
(Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China)
Received:September 04, 2017    Revised:September 08, 2018
Abstract: A container packages an application together with its supporting software and library files as an image, and the application is upgraded by publishing a new image version. As a result, successive versions contain large amounts of identical data. Image loading is time-consuming and can delay container startup from milliseconds to seconds or even minutes, so reusing the data shared between versions helps reduce loading time. Container images currently rely on inheritance and layered loading, which effectively reuse supporting software and library files, but there is no reliable mechanism for reusing data inside the application itself. This study proposes a multi-version container image loading method based on chunk reuse, which improves loading efficiency by reusing the data shared between image versions. The core idea is to split container images into fine-grained data chunks with a boundary-matching chunking method, use each chunk's hash value as its unique fingerprint, and search a B-tree for repeated fingerprints to identify duplicate chunks, thereby reducing data transfer. Experimental results show that the method speeds up container image loading by a factor of 5.8 or more.
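The pipeline outlined in the abstract (content-defined chunking, hash fingerprints, fingerprint lookup, transfer of only the missing chunks) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the boundary-matching rule, so a Gear-style rolling hash is assumed here. A Python `dict` stands in for the B-tree index. All function names, the mask, and the chunk-size limits are hypothetical choices for illustration.

```python
import hashlib
import random

# Assumed Gear-style table: one fixed pseudo-random 32-bit value per byte value.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk_boundaries(data, mask=0xFC000000, min_size=16, max_size=256):
    """Yield (start, end) offsets of content-defined chunks.

    A boundary is declared where the top bits of the rolling hash are zero,
    so cut points follow the content rather than fixed offsets.
    """
    start, h = 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            yield start, i + 1
            start, h = i + 1, 0
    if start < len(data):
        yield start, len(data)  # trailing chunk without a natural boundary


def fingerprint(chunk):
    """Use the chunk's SHA-256 digest as its identifying fingerprint."""
    return hashlib.sha256(chunk).hexdigest()


def build_index(data):
    """Index an already-loaded image version: fingerprint -> chunk bytes.

    A dict is used here in place of the B-tree mentioned in the abstract.
    """
    return {fingerprint(data[s:e]): data[s:e] for s, e in chunk_boundaries(data)}


def plan_load(new_data, local_index):
    """Return the ordered fingerprint recipe and the chunks that must be sent."""
    recipe, missing = [], {}
    for s, e in chunk_boundaries(new_data):
        chunk = new_data[s:e]
        fp = fingerprint(chunk)
        recipe.append(fp)
        if fp not in local_index:
            missing[fp] = chunk
    return recipe, missing


def reassemble(recipe, local_index, missing):
    """Rebuild the new version from local chunks plus transferred chunks."""
    return b"".join(
        local_index[fp] if fp in local_index else missing[fp] for fp in recipe
    )
```

In this sketch, only the chunks in `missing` would cross the network when a new version is loaded. Chunks that lie entirely before an edit are cut identically in both versions, so their fingerprints are found in the local index and they are reused.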
Article ID:     CLC number: TP316    Document code:
Foundation items: National Key Research and Development Program of China (2017YFC0804407); National Natural Science Foundation of China (61602454, 61872344); Beijing Natural Science Foundation (4182070)
Reference text:

LU Zhi-Gang, XU Ji-Wei, HUANG Tao. Container Image Deduplication Method Based on Chunking Reuse of Multi-versions. Journal of Software, 2020, 31(6): 1875-1888