Challenge your understanding of core string searching algorithms including Knuth-Morris-Pratt, Rabin-Karp, and Z-Algorithm. Evaluate your grasp of pattern matching techniques, algorithm properties, and key computational concepts relevant for efficient string processing.
When processing the string 'ababc' using the KMP algorithm, what is the correct value of the prefix function pi[4] (zero-based index) after computation?
Explanation: The prefix function pi[4] indicates the length of the longest proper prefix of the substring 'abab' which is also a suffix. In this substring, 'ab' is both a prefix and a suffix, so pi[4] is 2. Option 1 underestimates the length, 0 ignores any occurring prefix-suffix, and 3 overestimates since 'aba' is not a suffix. Thus, 2 is correct for this step.
Which of the following is a primary drawback of the Rabin-Karp algorithm when using basic modular hash functions for string matching?
Explanation: Rabin-Karp can produce false positives (spurious hits) if different substrings have the same hash value, leading to hash collisions. The algorithm does not inherently require a prefix array (as KMP does), nor is it restricted to numeric data. It also uses linear, not quadratic, memory. Therefore, hash collisions are the most significant issue among the options.
For which string algorithm task is the Z-Algorithm particularly efficient compared to naive methods?
Explanation: The Z-Algorithm efficiently computes the length of the longest common prefix between the entire string and each suffix, all in linear time. It is not used for finding all substrings, sorting characters, nor directly for constructing a suffix tree. Those operations require different algorithms or data structures.
What is the main purpose of the failure table (prefix function) in the KMP algorithm when matching a pattern within a text?
Explanation: The failure table lets KMP skip over certain characters in the pattern and text, avoiding redundant checks and enabling the search to proceed efficiently after a mismatch. Repeating the search is inefficient and not how KMP works. Hashing is used in Rabin-Karp, and sorting the pattern is unrelated. Only skipping unnecessary comparisons matches the correct function.
When a mismatch occurs during pattern matching, how does Rabin-Karp typically handle pattern shifts compared to KMP?
Explanation: Rabin-Karp usually shifts the pattern by a single character after checking each substring, while KMP leverages the prefix function to skip ahead by potentially more than one position. Both algorithms do not always shift by the length of the pattern. Rabin-Karp does not use prefix functions, and KMP does not always fully restart after mismatches.