Carry out the Rabin-Karp string matching algorithm to search for the pattern P =
ID: 3712145 • Letter: C
Question
Carry out the Rabin-Karp string matching algorithm to search for the pattern P = acae in the text T = bacaeadacabacdabeeda. You may assume that there are only 5 letters in the alphabet, and that the letters a through e map to the digits 0 through 4 in the obvious way. You can verify that d = 5, m = 4, and n = 20. Set q = 97. A. Compute p, the (decimal) value that corresponds to the pattern P. Show your complete solution. B. Compute t4, the (decimal) value that corresponds to the substring of T at shift 4. Show your complete solution.
Explanation / Answer
Pattern searching means that as per name suggested we are having with one particular sequence and need to find out some specific pattern how many times it occurs in that sequence.
Here we are having the sequence with indexes:
b
a
c
a
e
a
d
a
c
a
b
a
c
d
a
b
e
e
d
a
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
And the pattern is: “acae “
So what does the algorithm works now, the algorithm slides the pattern one by one and match the hash value with the sequence’s substring’s hash value.
Here length m = is the value for sequence
length n = is the value for pattern.
Now we will create a window of size n that will slide over the sequence and will compare the hash values.
Our algorithm will stop here. Hash at the next shift must be efficiently computable (O(1)) from the current hash value and the next character in text.
Now, d: number of characters in the alphabet
q : a prime number
h : d^(m-1)
So here given in our question, d= 5, q = 97
m = 4 (length of pattern) and n = 20 (length of text)
now we need find the value of h
= 5 ^(4-1) = 125
Lets find the value of windows where p will find the value of pattern and t will find the value of text
Now as per our question A) we need to compute for pattern i.e. “acae”
iteration 1: p = (d*p + pat[i]) % q // pat[i] stores that specific index’s ASCII value
= (5*1 + 65 ) % 97 = 7
iteration 2: =(5*2+ 67 ) % 97 = 8
iteration 3: =(5*3 + 65) % 97 = 8
iteration 4: =(5*4 + 69 ) % 97 = 9
So, The hash value of pattern p = 9
B) For t4 we are having the pattern “aead”
iteration 1 = (5*3 + 65) %97 = 8
iteration 2 = (5*4+ 69)%97 = 9
iteration 3 = (5*5+ 65)%97 = 9
iteration 4 =(5*6+ 68)%97 = 1
And the hash value of t4 is 1.
b
a
c
a
e
a
d
a
c
a
b
a
c
d
a
b
e
e
d
a
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19