
JCSE, vol. 10, no. 4, pp.128-136, December, 2016


Company Name Discrimination in Tweets using Topic Signatures Extracted from News Corpus

Beomseok Hong, Yanggon Kim, and Sang Ho Lee
Department of Computer and Information Science, Towson University, Towson, MD, USA; School of Software, Soongsil University, Seoul, Korea

Abstract: It is impossible for any human being to analyze the more than 500 million tweets that are generated per day. Lexical ambiguities on Twitter make it difficult to retrieve the desired data and relevant topics. Most solutions to the word sense disambiguation problem rely on knowledge base systems. Unfortunately, manually creating a knowledge base system is expensive and time-consuming, resulting in a knowledge acquisition bottleneck. To address the knowledge acquisition bottleneck, topic signatures are used to disambiguate words. In this paper, we evaluate the effectiveness of various features of news articles for topic signature extraction in word sense discrimination in tweets. Based on our results, topic signatures obtained from the snippet feature exhibit higher accuracy in discriminating company names than those obtained from the article body. We conclude that topic signatures extracted from news articles improve the accuracy of word sense discrimination in the automated analysis of tweets.

Keywords: Twitter; Tweet; Word sense discrimination; Topic signature

