JCSE, vol. 8, no. 3, pp.137-148, 2014
DOI: http://dx.doi.org/10.5626/JCSE.2014.8.3.137
Classifying Articles in Chinese Wikipedia with Fine-Grained Named Entity Types
Jie Zhou, Bicheng Li, and Yongwang Tang
Zhengzhou Information Science and Technology Institute, Zhengzhou, China
Abstract: Named entity classification of Wikipedia articles is a fundamental research area that can be used to automatically build
large-scale corpora for named entity recognition, or to support other entity processing tasks, such as entity linking, as an auxiliary
task. This paper describes a method of classifying named entities in Chinese Wikipedia with fine-grained types. We considered
multi-faceted information in Chinese Wikipedia to construct four feature sets, designed a different feature selection
method for each feature set, and fused the feature sets in a vector space using different strategies. Experimental results
show that the explored feature sets and their combination can effectively improve the performance of named entity classification.
Keywords:
Named entity classification; Chinese Wikipedia; Fine-grained; Feature selection; NER corpora