Abstract: As information technology develops, the interaction among information networks, human society, and physical space deepens, and the spillover of risk from the information space becomes more severe. Fraud incidents have risen sharply, making fraud detection an important research field. Fraud has had numerous negative effects on society and is becoming increasingly intelligent, industrialized, and highly concealed. Traditional expert rules and deep graph neural network algorithms are increasingly limited in countering it. Current fraud detection methods often rely on local information from a node and its neighbors: they focus on individual users, analyze the relationship between nodes and graph topology, or use graph embedding techniques to learn node representations. Although these approaches offer some fraud detection capability, they overlook the crucial role of long-range association patterns among entities and do not explore the common patterns shared by large numbers of fraudulent paths, which limits how comprehensively they can detect fraud. To address these limitations, this study proposes a graph fraud detection model based on path aggregation, the path aggregation graph neural network (PA-GNN). The model comprises four components: variable-length path sampling, position-related unified path encoding, path interaction and aggregation, and aggregation-related fraud detection. Several paths originating from a node interact globally and are compared for similarity, which extracts common patterns among fraudulent paths, reveals the association patterns between fraudulent behaviors more comprehensively, and enables fraud detection through path aggregation.
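The path-based pipeline described above can be illustrated with a minimal Python sketch. It is not the PA-GNN implementation. The function names (`sample_paths`, `encode_path`, `aggregate_paths`), the random-walk sampler, the geometric position weighting (`decay`), and the dot-product softmax used for path interaction are all assumptions chosen for illustration, and the toy graph and features are made up.

```python
import math
import random


def sample_paths(adj, start, num_paths, max_len, rng):
    """Sample random-walk paths from `start`, each with 1..max_len hops."""
    paths = []
    for _ in range(num_paths):
        hops = rng.randint(1, max_len)  # variable path length
        path = [start]
        for _ in range(hops):
            neighbors = adj.get(path[-1], [])
            if not neighbors:  # dead end: stop this walk early
                break
            path.append(rng.choice(neighbors))
        paths.append(path)
    return paths


def encode_path(path, features, decay=0.5):
    """Position-weighted mean of node features (assumed stand-in encoder)."""
    dim = len(features[path[0]])
    weights = [decay ** pos for pos in range(len(path))]
    total = sum(weights)
    return [
        sum(w * features[node][d] for w, node in zip(weights, path)) / total
        for d in range(dim)
    ]


def aggregate_paths(encodings):
    """Score each path by its mean similarity to all paths, then softmax-pool."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    scores = [
        sum(dot(e, other) for other in encodings) / len(encodings)
        for e in encodings
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    norm = sum(exps)
    dim = len(encodings[0])
    return [
        sum((w / norm) * e[d] for w, e in zip(exps, encodings))
        for d in range(dim)
    ]


# Toy undirected graph and 2-dimensional node features (hypothetical data).
adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.5, 0.5]}

rng = random.Random(0)
paths = sample_paths(adj, start=0, num_paths=4, max_len=3, rng=rng)
node_repr = aggregate_paths([encode_path(p, features) for p in paths])
```

In this sketch, paths that resemble many other paths from the same node receive larger pooling weights, which loosely mirrors the idea of emphasizing common path patterns. A learned model would replace both the fixed encoder and the fixed similarity score with trainable components and feed `node_repr` to a classifier.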
Experimental results on multiple datasets covering fraud scenarios in financial transactions, social networks, and review networks show that the proposed method significantly improves the area under the curve (AUC) and average precision (AP) over the best-performing baseline models. In addition, the proposed method uncovers potential common fraudulent path patterns and drives nodes to learn these patterns, which yields more expressive representations and offers a degree of interpretability.
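For readers unfamiliar with the two reported metrics, the following Python sketch computes AUC and AP from binary labels and predicted scores. It assumes both classes are present, takes label 1 to mean fraud, and, for AP, that scores are untied. The function names and the toy labels and scores are hypothetical and are not results from the study.

```python
def roc_auc(labels, scores):
    """AUC: probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0  # ties count as half a win
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """AP: mean of the precision measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` items
    return total / hits


# Toy example: 1 marks a fraudulent node (hypothetical data).
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
```

AP is often reported alongside AUC in fraud detection because fraudulent nodes are usually a small minority, and AP is more sensitive to how well the top-ranked predictions concentrate true positives.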