JCSE, vol. 3, no. 1, pp.1-14, 2009
Improving Lookup Time Complexity of Compressed Suffix Arrays using the Multi-ary Wavelet Tree
Zheng Wu, Joong Chae Na, Minhwan Kim, Dong Kyue Kim
Department of Computer Science and Engineering, Pusan National University, Korea|Department of Computer Science and Engineering, Sejong University, Korea|Department of Computer Science and Engineering, Pusan National University, Korea|Department of Electro
Abstract: Given a text T of size n, we need to search for the information in which we are interested. To support fast searching, an index must be constructed by preprocessing the text. The suffix array is one such index data structure. The compressed suffix array (CSA) is a compressed index that exploits the regularity of the suffix array and can be compressed to the kth order empirical entropy. In this paper, we improve the lookup time complexity of the compressed suffix array by using the multi-ary wavelet tree, at the cost of more space. In our implementation, the lookup time complexity of the compressed suffix array is O(n log_r σ), and the space of the compressed suffix array is nH_k(T) + O(n log log n/n) bits, where σ is the size of the alphabet, H_k is the kth order empirical entropy, r is the branching fac