Computing all Repeats Using Suffix Arrays | Zendy

Frantisek  Franek | Zendy; William F. Smyth | Zendy; Yudong  Tang | Zendy

AI Assistant Blog Pricing

Home ZAIA Blog

Open Access

Computing all Repeats Using Suffix Arrays

Author(s) -

Frantisek Franek,

William F. Smyth,

Yudong Tang

Publication year - 2003

Publication title -

j. autom. lang. comb.

Language(s) - English

DOI - 10.25596/jalc-2003-579

We describe an algorithm that identifies all the repeating substrings (tandem, overlapping, and split) in a given string x = X[1..n]. Given the suffix arrays of x and of the reversed string x, the algorithm requires Θ(n) time for its execution and represents its output in Θ(n) space, either as a reduced suffix array (called an NE array) or as a reduced suffix tree (called an NE tree). The output substrings u are nonextendible (NE); that is, any extension of some occurrence of u in x, either to the left or to the right, yields a string (λu or uλ) that is unequal to the same extension of some other occurrence of u. Thus the number of substrings output is the minimum required to identify all the repeating substrings in x. The output can be used in a straightforward way to identify only repeating substrings that satisfy some proximity or minimum length condition.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.

Having issues? You can contact us here

Accelerating Research