关键词:
online text indexing
packed string matching
sparse suffix trees
dynamic predecessor dictionaries
摘要:
We present a new data structure called the packed compact trie (packed c-trie) which stores a set S of k strings of total length n in n log sigma + (k log n) bits of space and supports fast pattern matching queries and updates, where sigma is the alphabet size. Assume that alpha = log(sigma). n letters are packed in a single machine word on the standard word RAM model, and let f (k, n) denote the query and update times of the dynamic predecessor/successor data structure of our choice which stores k integers from universe [1, n] in 0 (k log n) bits of space. Then, given a string of length m, our packed c-tries support pattern matching queries and insert/delete operations in O(m/alpha f (k, n)) worst-case time and in O (m/alpha + f (k, n)) expected time. Our experiments show that our packed c-tries are faster than the standard compact tries (a.k.a. Patricia trees) on real data sets. We also discuss applications of our packed c-tries.