Abstract: HTAP (hybrid transactional/analytical processing) databases support OLTP and OLAP workloads simultaneously within a single system. Workload identification is the entry point for query routing: only when a query is correctly identified as OLTP or OLAP can it be optimized and allocated resources appropriately. Accurate identification of workload types is therefore a key factor in HTAP database performance. However, existing workload identification methods rely mainly on rules over SQL statements, cost-based measures, or machine learning approaches. These methods neither consider the inherent characteristics of query statements nor exploit the structural information in execution plans, which results in low identification accuracy. To improve accuracy, this study proposes an intelligent method for identifying OLTP and OLAP workloads. The method extracts and encodes features from both SQL statements and execution plans: it builds the SQL statement encoder on BERT, constructs the execution plan encoder by combining convolutional neural networks with an attention mechanism, and fuses the two types of features to build a classifier. The resulting model enables intelligent identification of workload types in hybrid HTAP workloads. Experiments show that the proposed model identifies OLTP and OLAP workloads with high accuracy. In addition, the robustness of the model is validated on multiple datasets, and the model is integrated into the TiDB database to verify the performance improvement it brings.