Supported by the National Key Research and Development Program of China (2020AAA0106100), the National Natural Science Foundation of China (61876206), and the Open Project of the Shanxi Key Laboratory of Intelligent Information Processing (CICIP2020003)
Abstract: Stable learning aims to construct a robust prediction model from a single training dataset so that the model can accurately classify any test data whose distribution is similar to that of the training data. To achieve accurate prediction on test data with unknown distributions, existing stable learning algorithms strive to remove the spurious correlations between features and the class label. However, these algorithms can only weaken some of these spurious correlations and cannot eliminate them completely; moreover, they may suffer from overfitting when building the prediction model. To address these issues, this study proposes a stable learning algorithm based on sample reweighting and dual classifiers, which jointly optimizes the sample weights and the dual classifiers to learn a robust prediction model. Specifically, the proposed algorithm weights the samples by balancing confounders from a global perspective to remove the spurious correlations between features and the class label, thereby better estimating the effect of each feature on classification. To completely eliminate the spurious correlations between some irrelevant features and the class label and to weaken the interference of irrelevant features with the sample-reweighting process, the algorithm performs feature selection before reweighting to filter out some irrelevant features. To further improve the generalization ability of the model, the algorithm constructs two classifiers when training the prediction model and learns a better classification boundary by minimizing the difference between the parameters of the two classifiers. Experimental results on synthetic and real-world datasets demonstrate the effectiveness of the proposed method.
Stable learning aims to leverage the knowledge obtained from only a single training dataset to learn a robust prediction model that accurately predicts the labels of test data drawn from a different but related distribution. To achieve promising performance on test data with agnostic distributions, existing stable learning algorithms focus on eliminating the spurious correlations between the features and the class variable. However, these algorithms can only weaken some of the spurious correlations between the features and the class variable and cannot eliminate them completely. Furthermore, they may suffer from overfitting when learning the prediction model. To tackle these issues, this study proposes a stable learning algorithm based on sample reweighting and dual classifiers, which jointly optimizes the sample weights and the parameters of the dual classifiers to learn a robust prediction model. Specifically, to estimate the effect of each feature on classification, the proposed algorithm balances the distribution of confounders by learning global sample weights, thereby removing the spurious correlations between the features and the class variable. To completely eliminate the spurious correlations between some irrelevant features and the class variable and to weaken the influence of these features on the sample-reweighting process, the proposed algorithm selects and removes some irrelevant features before sample reweighting. To further improve the generalization ability of the model, the algorithm constructs two classifiers and learns a prediction model with an optimal classification boundary by minimizing the parameter difference between the two classifiers. Experiments on synthetic and real-world datasets validate the effectiveness of the proposed algorithm.