Consider the alphabet sigma = {0, 1} and sigma^*, the set of all bit strings. Al
ID: 3820080 • Letter: C
Question
Consider the alphabet sigma = {0, 1} and sigma^*, the set of all bit strings. Also, consider the set of all (S, T) pairs, where S is a 4-bit substring of a 10-bit string T such that the brute-force string matching algorithm compares the most pairs of bits when finding the first occurrence of S in T. Of all such (S, T) pairs, give an example of S and T for which the KMP string matching algorithms compares the fewest pairs of bits when searching for the first occurrence of S in T. How many pairs of bits will each algorithm compare?Explanation / Answer
Answer:
Brute force string matching:
Brute-force search is a very general problem-solving technique that consists of systematically enumerating all possible candidates for the solution and checking whether each candidate satisfies the problem's statement.
Example:
T = 1011011010
P = 1101
Above given is the 10 bit text string and 4 bit pattern string. Now we will apply brute force algorithm which is:
do
if (text letter == pattern letter)
compare next letter of pattern to next
letter of text
else
move pattern down text by one letter
while (entire pattern found or end of text)
Now,
1. 1011011010
1101
2. 1011011010
1101
3. 1011011010
1101
As we can see at i=3, that the text matches the string pattern so we will move forward like this and note down the position at which they are equal.
4. 1011011010
1101
5. 1011011010
1101
6. 1011011010
1101
As we can see at i=6, that the text matches the string pattern so we will move forward like this and note down the position at which they are equal and after that they will never be same.
Knuth–Morris–Pratt:
The Knuth–Morris–Pratt string searching algorithm (or KMP algorithm) searches for occurrences of a "word" within a main "text string" S by employing the observation that when a mismatch occurs, the word itself embodies sufficient information to determine where the next match could begin, thus bypassing re-examination of previously matched characters.