Abstract: With the rapid development of deep learning and diffusion models, image and video generation models have demonstrated powerful capabilities for producing high-quality and diverse results. How to leverage these models for efficient and precise personalized generation has become a prominent research topic. Personalized image generation methods combine text descriptions with specific concepts or subjects provided by users, enabling the creation of customized images that meet users' diverse needs for personalized visual content. This study reviews personalized image generation methods based on diffusion models, categorizing existing methods by generation target into single-subject-driven generation and multi-concept composition generation. The former focuses on generating customized images of an individual subject, emphasizing the accurate capture and reconstruction of the subject's visual features. The latter focuses on merging multiple concepts or subjects into a single image, addressing challenges such as semantic alignment across concepts and visual consistency. This study provides a detailed analysis of representative work in personalized generation in the context of specific tasks and application scenarios. Additionally, it compares and summarizes common datasets, evaluation methods for generation models, and the performance of different personalized generation methods. It further discusses the challenges that personalized generation methods face in practical applications, identifies future development directions, and outlines prospective research trends. This study aims to provide a comprehensive reference for researchers in related fields, fostering the development and innovation of personalized generation methods.