| P.O.Box 8718, Beijing 100080, China | Journal of Software, March 2007,18(3):565-573 |
| E-mail: jos@iscas.ac.cn | ISSN 1000-9825, CODEN RUXUEW, CN 11-2560/TP |
| http://www.jos.org.cn | Copyright © 2007 by Journal of Software |
Semantic Role Labeling with Maximum Entropy Classifier
Liu Ting, Che Wanxiang, Li Sheng
Abstract
Semantic role labeling is a feasible approach to shallow semantic parsing. This paper describes a semantic role labeling system built on a maximum entropy classifier, which takes syntactic constituents as the units to be labeled. The classifier is trained to identify and classify the semantic roles of each predicate in a single step, using a set of useful features and their combinations. In the post-processing step, among candidate roles whose constituents are nested within one another, only the role with the highest probability is kept. After all arguments that match constituents in the full parse tree have been predicted, simple rule-based post-processing is applied to identify the arguments that do not match any constituent. The system achieves F1 scores of 75.49% and 75.60% on the development and test sets, respectively. To the authors' knowledge, this is the best result reported in the literature for a system based on a single syntactic parser. Finally, some proposals for addressing the difficulties of semantic role labeling, and directions for future work, are given.
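The nested-role post-processing step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so the greedy strategy below (visit candidates in order of decreasing probability and drop any candidate that nests with one already kept) is an assumption, as are the names `Candidate`, `is_nested`, and `resolve_nested` and the example spans and probabilities.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A labeled constituent (hypothetical structure, for illustration only)."""
    start: int   # index of the first token, inclusive
    end: int     # index of the last token, inclusive
    role: str    # predicted semantic role label
    prob: float  # classifier probability for that label


def is_nested(a: Candidate, b: Candidate) -> bool:
    """Return True if one span contains the other (identical spans included)."""
    return (a.start <= b.start and b.end <= a.end) or \
           (b.start <= a.start and a.end <= b.end)


def resolve_nested(candidates: List[Candidate]) -> List[Candidate]:
    """Greedy assumption: keep a candidate only if it does not nest
    with any higher-probability candidate that was already kept."""
    kept: List[Candidate] = []
    for cand in sorted(candidates, key=lambda c: c.prob, reverse=True):
        if not any(is_nested(cand, k) for k in kept):
            kept.append(cand)
    # Return the surviving roles in sentence order.
    return sorted(kept, key=lambda c: c.start)


# Made-up example: the span (3, 7) contains the spans (3, 4) and (6, 7).
candidates = [
    Candidate(0, 1, "A0", 0.90),
    Candidate(3, 7, "A1", 0.60),
    Candidate(3, 4, "A1", 0.75),
    Candidate(6, 7, "AM-TMP", 0.55),
]
result = resolve_nested(candidates)
for c in result:
    print(c.start, c.end, c.role, c.prob)
```

In this example the outer span (3, 7) is discarded because the span (3, 4) nested inside it has a higher probability. Because constituents of one parse tree are either nested or disjoint, a containment test is enough and partial overlaps need no special handling.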
Liu T, Che WX, Li S. Semantic role labeling with maximum entropy classifier.
Journal of Software, 2007, 18(3):565-573.
DOI: 10.1360/jos180565
http://www.jos.org.cn/1000-9825/18/565.htm
Foundation item: Supported by the National Natural Science Foundation of China under Grant Nos. 60575042, 60503072, 60675034