Abstract: In software engineering, code repositories hold a wealth of knowledge and provide developers with examples of programming practice. If the repetitive patterns that occur frequently in source code can be extracted as code templates, programming efficiency could improve significantly. In current practice, developers often reuse existing solutions by searching through source code. However, such searches typically return many similar and redundant results, which increases the burden of subsequent filtering. Moreover, template mining techniques based on cloned code often fail to cover larger patterns built from dispersed small clones, which limits the practicality of the resulting templates. This study proposes a method for extracting and retrieving code templates based on code clone detection. The method extracts function-level code templates more efficiently by stitching together multiple fragment-level clones and by extracting and aggregating the shared parts of method-level clones, and it addresses the issue of template quality. Based on the mined code templates, this study proposes a triplet representation of code structural features that effectively supplements plain-text features and provides an efficient and concise structural representation. In addition, this study presents a template feature retrieval method that combines structural and textual search to retrieve templates by matching features of the programming context. CodeSculptor, the tool that implements this method, is evaluated on a codebase of 45 high-quality open-source Java projects and demonstrates a strong capability to extract high-quality code templates. The results show that the templates mined by the tool achieve an average code reduction of 60.87% and that 92.09% of them are produced by stitching fragment-level clones, templates that traditional methods cannot identify. These results demonstrate the superior performance of the method in recognizing and constructing code templates. Furthermore, the top-5 accuracy of code template search and recommendation is 96.87%. A preliminary case study on 9600 randomly selected templates reveals that most of the sampled code templates are semantically complete and coherent, which affirms their practicality. Nonetheless, a few templates are meaningless, which indicates room to refine the proposed template extraction strategy in future work. A user study further shows that code development tasks can be completed more efficiently with CodeSculptor.
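To make the idea of combining structural and textual search concrete, the following is a minimal sketch and not CodeSculptor's implementation. The abstract does not define the triplet format or the ranking function, so everything here is an assumption made for illustration: a triplet is taken to be a hypothetical (parent node type, relation, child node type) tuple, both similarities are plain Jaccard overlap, and the weight `alpha`, the `Template` class, and the sample data are invented names.

```python
from dataclasses import dataclass

# Assumed triplet shape: (parent node type, relation, child node type).
# The abstract does not specify the actual format.
Triplet = tuple[str, str, str]


@dataclass(frozen=True)
class Template:
    """Hypothetical container for a mined template's features."""
    name: str
    triplets: frozenset  # structural features
    tokens: frozenset    # textual features


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def score(template, ctx_triplets, ctx_tokens, alpha=0.5):
    """Weighted sum of structural and textual similarity (assumed scheme)."""
    structural = jaccard(template.triplets, ctx_triplets)
    textual = jaccard(template.tokens, ctx_tokens)
    return alpha * structural + (1 - alpha) * textual


def retrieve(templates, ctx_triplets, ctx_tokens, k=5, alpha=0.5):
    """Return the k templates that best match the programming context."""
    ranked = sorted(
        templates,
        key=lambda t: score(t, ctx_triplets, ctx_tokens, alpha),
        reverse=True,
    )
    return ranked[:k]


# Invented sample data, for illustration only.
read_file = Template(
    "read_file",
    frozenset({
        ("TryStatement", "body", "MethodInvocation"),
        ("MethodInvocation", "arg", "Identifier"),
    }),
    frozenset({"reader", "file", "close"}),
)
loop_sum = Template(
    "loop_sum",
    frozenset({("ForStatement", "body", "Assignment")}),
    frozenset({"sum", "i", "total"}),
)

context_triplets = frozenset({("TryStatement", "body", "MethodInvocation")})
context_tokens = frozenset({"file", "reader"})

top = retrieve([loop_sum, read_file], context_triplets, context_tokens, k=5)
print([t.name for t in top])
```

In this sketch the structural term lets a template rank well even when identifiers differ, while the textual term separates templates that share the same shape; how the actual tool balances the two is not stated in the abstract.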