Abstract: As an emerging technique in software engineering, automatic source code summarization aims to generate natural language descriptions for given code snippets. State-of-the-art code summarization techniques use encoder-decoder neural models: the encoder extracts semantic representations of the source code, and the decoder translates them into a human-readable summary. However, many existing approaches treat input code snippets as standalone functions and overlook the context dependencies between the target function and the subfunctions it invokes. Ignoring these dependencies can omit crucial semantic information and reduce the quality of the generated summary. To address this, we introduce DHCS, a dependency-aware hierarchical code summarization neural model that explicitly models the hierarchical dependencies between the target function and its subfunctions. DHCS employs a hierarchical encoder consisting of a subfunction encoder and a target function encoder, which captures both local and contextual semantic representations. We also introduce a self-supervised task, masked subfunction prediction, to enhance the representation learning of subfunctions. Furthermore, we mine the topic distribution of subfunctions and incorporate it into a summary decoder with a topic-aware copy mechanism, which enables the decoder to extract key information directly from subfunctions when generating the summary of the target function. Finally, extensive experiments on three real-world datasets constructed for Python, Java, and Go validate the effectiveness of our approach.
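
As an illustration of the dependency between a target function and its invoked subfunctions, the following minimal Python sketch collects the source text of every same-module function that a target function calls directly. It is not the DHCS model or its data pipeline: the helper name `invoked_subfunctions`, the example module, and the restriction to plain-name calls within one module are all assumptions made for illustration.

```python
import ast


def invoked_subfunctions(source: str, target: str) -> dict:
    """Map each same-module function called by `target` to its source text.

    Hypothetical helper for illustration only; it handles direct calls by
    plain name (e.g. `helper(x)`) and ignores methods and imported functions.
    """
    tree = ast.parse(source)
    # Index every top-level function definition by name.
    defs = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    if target not in defs:
        raise KeyError(target)

    called = []
    for node in ast.walk(defs[target]):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            name = node.func.id
            # Keep only functions defined in this module, excluding recursion.
            if name in defs and name != target and name not in called:
                called.append(name)

    return {name: ast.get_source_segment(source, defs[name]) for name in called}


# Hypothetical example module: `process` depends on `load` and `clean`.
EXAMPLE = '''
def load(path):
    return open(path).read()

def clean(text):
    return text.strip().lower()

def unused(x):
    return x

def process(path):
    return clean(load(path))
'''

subs = invoked_subfunctions(EXAMPLE, "process")
for name, body in subs.items():
    print(f"--- {name} ---\n{body}")
```

A summarizer that sees only the body of `process` has no access to the stripping and lowercasing performed in `clean`; pairing the target function with the collected subfunction bodies is the kind of context a dependency-aware model can draw on.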