Abstract: In large-scale image retrieval, deep hashing models typically rely on large amounts of manually annotated data for training, but the high cost of manual annotation limits their practical application. To reduce this dependency, existing studies use texts provided by web users as weak supervision, guiding the model to mine semantic information from images that is associated with the texts. However, the inherent noise in user tags often degrades model performance. Multimodal pre-trained models such as CLIP exhibit strong image-text alignment capabilities. Motivated by this, this study leverages CLIP to refine user tags and proposes a weakly supervised hashing method called CLIP-guided tag refinement hashing (CTRH). The proposed method consists of three key components: a tag replacement module, a tag weighting module, and a tag-balanced loss function. The tag replacement module fine-tunes CLIP to mine tags potentially relevant to each image. The tag weighting module performs cross-modal global semantic interaction between the refined text and images to learn discriminative joint representations. To address the imbalance of user tags, the tag-balanced loss dynamically reweights hard samples to strengthen the model's representation learning. Experiments on two benchmark datasets, MirFlickr and NUS-WIDE, verify the effectiveness of the proposed method against state-of-the-art approaches.