Journal of Software, 2015, 26(8): 1907-1924

Multi-User Server Program Self-Recovery System
SHI Yi, FENG Yu-Sheng, QI Yong, SUN Wei
(School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China; Beijing Aerospace Automatic Control Institute, Beijing 100854, China)
Received:March 03, 2014    Revised:July 31, 2014
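Abstract (translated from the Chinese): What a server system can least tolerate is frequent errors, or even crashes, that disrupt the work of normal users, so the system needs the ability to recover by itself. The self-recovery strategy most widely studied and applied today, checkpoint and rollback, is not suitable for recovering multi-user server programs. Based on the characteristics of multi-user server programs, a virtual-machine-based self-recovery system, VMSRS (virtual machine monitor-self recovery of service program), is designed. The basic idea of VMSRS is to make the virtual machine monitor the principal agent of recovery, fully exploiting the advantages it gains as a third-party underlying system and as the manager and monitor of hardware resources, in order to strictly guarantee the consistency of user data, the atomicity of data and metadata operations, and the secure isolation of recovery data. At the same time, VMSRS applies an improved SRS (self recovery of service program) approach: when an error occurs, no rollback is performed; the error is contained so that it does not affect normal users, and normal users and the server can continue running forward as if no error had occurred. Rollback is avoided by using the cleaners of the system itself and of VMSRS. The work designs and implements the VMSRS self-recovery system, including error suppression, request recovery, monitoring, and storage management modules, and mainly targets recovery from memory errors in multi-user server systems. Experimental analysis of basic function, basic performance, and overall function shows that VMSRS, without rolling back and while preserving performance, provides good security for recovery data and a thorough guarantee of consistency for user state data, and that it recovers multi-threaded programs well without placing any restrictions on threads. The work is also a useful practical exploration of designing self-recovery systems under virtualization technology.
Keywords: virtualization; self-recovery; multithreading; consistency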
Abstract: A long-running multi-user server system may encounter frequent errors, or even crashes, that disrupt service for normal users, owing to the complexity of its programs, operating environment, and user operations. Such systems therefore need a self-recovery capability. Checkpoint and rollback is the most widely studied and applied self-recovery strategy, but it is not well suited to recovering multi-user server programs. This paper designs a VMM-based self-recovery system, VMSRS (virtual machine monitor-self recovery of service program), tailored to the characteristics of multi-user server programs. The main idea of VMSRS is to make the virtual machine monitor (VMM) the principal agent of recovery, exploiting its position as an independent underlying layer that manages and monitors hardware resources, so as to strictly maintain the consistency of user data, the atomicity of data and metadata operations, and the secure isolation of recovery data. VMSRS also applies an improved form of SRS (self recovery of service program): when an error occurs it does not roll back, but contains the error so that it does not affect normal users, allowing normal users and the server to proceed as if no error had happened. Rollback is avoided by relying on the cleanup mechanisms of the system itself and of VMSRS. The implemented system comprises error suppression, demand-driven restoration, monitoring, and storage management modules, and mainly targets memory errors in multi-user server systems. Experiments on basic function, basic performance, and overall function show that VMSRS provides good security for recovery data and consistency of user state data while preserving performance and performing no rollback. It recovers multi-threaded programs well without imposing any restrictions on threads. The work also serves as a practical exploration of designing self-recovery systems on top of virtualization technology.
Foundation items: National Natural Science Foundation of China (61272460); Specialized Research Fund for the Doctoral Program of Higher Education of the Ministry of Education (20120201110010); National High Technology Research and Development Program of China (863 Program) (2012AA010904); Xi'an Jiaotong University Fundamental Research Funds, Free Exploration Project (xjj2014046)
Reference text:


SHI Yi, FENG Yu-Sheng, QI Yong, SUN Wei. Multi-User Server Program Self-Recovery System. Journal of Software, 2015, 26(8): 1907-1924