Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Carry out the Rabin-Karp string matching algorithm to search for the pattern P =

ID: 3712145 • Letter: C

Question

Carry out the Rabin-Karp string matching algorithm to search for the pattern P = acae in the text T = bacaeadacabacdabeeda. You may assume that there are only 5 letters in the alphabet, and that the letters a through e map to the digits 0 through 4 in the obvious way. You can verify that d = 5, m = 4, and n = 20. Set q = 97. A. Compute p, the (decimal) value that corresponds to the pattern P. Show your complete solution. B. Compute t4, the (decimal) value that corresponds to the substring of T at shift 4. Show your complete solution.

Explanation / Answer

Pattern searching means that as per name suggested we are having with one particular sequence and need to find out some specific pattern how many times it occurs in that sequence.

Here we are having the sequence with indexes:

b

a

c

a

e

a

d

a

c

a

b

a

c

d

a

b

e

e

d

a

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

And the pattern is: “acae “

So what does the algorithm works now, the algorithm slides the pattern one by one and match the hash value with the sequence’s substring’s hash value.

Here length m = is the value for sequence

length n = is the value for pattern.

Now we will create a window of size n that will slide over the sequence and will compare the hash values.

Our algorithm will stop here. Hash at the next shift must be efficiently computable (O(1)) from the current hash value and the next character in text.

Now, d: number of characters in the alphabet

q : a prime number

h : d^(m-1)

So here given in our question, d= 5, q = 97

m = 4 (length of pattern) and n = 20 (length of text)

now we need find the value of h

= 5 ^(4-1) = 125

Lets find the value of windows where p will find the value of pattern and t will find the value of text

Now as per our question A) we need to compute for pattern i.e. “acae”

iteration 1: p = (d*p + pat[i]) % q               // pat[i] stores that specific index’s ASCII value

                      = (5*1 + 65 ) % 97     = 7

iteration 2:    =(5*2+ 67 ) % 97       = 8

iteration 3:     =(5*3 + 65) % 97      = 8

iteration 4:    =(5*4 + 69 ) % 97    = 9

So, The hash value of pattern p = 9

B) For t4 we are having the pattern “aead”

iteration 1 = (5*3 + 65) %97 = 8

iteration 2 = (5*4+ 69)%97 = 9

iteration 3 = (5*5+ 65)%97 = 9

iteration 4 =(5*6+ 68)%97 = 1

And the hash value of t4 is 1.

b

a

c

a

e

a

d

a

c

a

b

a

c

d

a

b

e

e

d

a

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19