Knowledge Distillation for Scene Text Detection via Mask Information Entropy Transfer

CLC Number: TP18

    Abstract:

    Mainstream scene text detection methods often rely on deep, complex networks to improve detection accuracy. Such networks incur high computational cost and require large storage space, which makes them difficult to deploy on embedded devices with limited computing resources. Knowledge distillation achieves model compression by using soft-target information from a teacher network to assist the training of a lightweight student network. However, most existing knowledge distillation methods are designed for image classification and take the soft probability distributions output by the teacher network as the knowledge. The amount of information carried by this knowledge is highly correlated with the number of categories, so it is insufficient when such methods are applied directly to the binary classification task in text detection. To address this problem in scene text detection, this study introduces the concept of information entropy and proposes a knowledge distillation method based on mask entropy transfer (MaskET). MaskET combines information entropy with traditional knowledge distillation methods to increase the amount of information transferred to the student network. Moreover, to eliminate interference from background information in images, MaskET applies mask operations so that knowledge is extracted only within text areas. Experiments on six public benchmark datasets, namely ICDAR 2013, ICDAR 2015, TD500, TD-TR, Total-Text, and CASIA-10K, show that MaskET outperforms the baseline model and other knowledge distillation methods. For example, MaskET improves the F1 score of the MobileNetV3-based DBNet from 65.3% to 67.2% on the CASIA-10K dataset.
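    The abstract does not give the MaskET loss itself, so the following is only a minimal sketch of the general idea under stated assumptions: per-pixel binary entropy is computed for the teacher and student text-probability maps, and the squared gap between the two entropy maps is averaged over text pixels only. The function names, the squared-error form, and the toy inputs are hypothetical and are not taken from the paper.

```python
import numpy as np


def binary_entropy(p, eps=1e-7):
    """Per-pixel Shannon entropy (in bits) of a text/non-text probability map."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))


def masked_entropy_transfer_loss(student_prob, teacher_prob, text_mask):
    """Hypothetical loss (an assumption, not the paper's formula):
    mean squared gap between the student and teacher entropy maps,
    restricted to pixels where text_mask is 1."""
    h_student = binary_entropy(student_prob)
    h_teacher = binary_entropy(teacher_prob)
    mask = text_mask.astype(np.float64)
    n_text = mask.sum()
    if n_text == 0:
        return 0.0  # no text pixels, nothing to transfer
    return float((((h_student - h_teacher) ** 2) * mask).sum() / n_text)


# Toy 2x2 probability maps; the top row is the text region.
teacher = np.array([[0.9, 0.5], [0.1, 0.5]])
student = np.array([[0.9, 0.9], [0.5, 0.5]])
mask = np.array([[1, 1], [0, 0]])

loss = masked_entropy_transfer_loss(student, teacher, mask)
print(f"masked entropy loss: {loss:.4f}")
```

    In this toy case the student and teacher also disagree at a background pixel, but that pixel is masked out and does not contribute to the loss, which mirrors the abstract's point about suppressing background interference. Note also that a single binary prediction carries at most 1 bit of entropy, which illustrates why soft targets are less informative in a two-class setting than in many-class image classification.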

Get Citation

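Chen Jianwei, Shen Yinglong, Yang Fan, Lai Yongxuan. Knowledge Distillation for Scene Text Detection via Mask Information Entropy Transfer. Ruan Jian Xue Bao/Journal of Software, 2025, 36(9): 4188-4207 (in Chinese)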

History
  • Received: March 28, 2023
  • Revised: January 11, 2024
  • Online: January 15, 2025