CSC 301.01, Class 40: Knuth-Morris-Pratt, concluded
- Notes and news
- Upcoming work
- Extra credit
- Review of Knuth-Morris-Pratt
- Backtracking multiple times
- Running time analysis
- Building the table
News / Etc.
- Add/Drop period has started. I’m letting CSC 322 over-enroll to 24.
- Homework 10 due Wednesday.
Extra Credit (Academic/Artistic)
- CS Table Tuesday: Esoteric PLs
Extra credit (Peer)
- Pub-free quiz, Wednesday
Extra Credit (Misc)
- Newtown film on Tuesday.
Other good things
- Musical this weekend.
- More music stuff.
swap, will we only be swapping neighboring characters?
Review of Knuth-Morris-Pratt
Inputs:
- target, a string
- source, a string
- P, a table that gives you the number of characters to preserve

Steps:

    01: t = 0; // Index into target
    02: s = 0; // Index into source
    03: while (t < length(target))
    04:   if (s == length(source))
    05:     return MATCH at t-s.
    06:   else if (target[t] == source[s])
    07:     ++t;
    08:     ++s;
    09:   else if s == 0
    10:     ++t
    11:   else
    12:     s = P[s]
    13:   end if
    14: end while
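The steps above translate fairly directly into Python. This is a minimal sketch, not the course's reference code: the name kmp_search is made up for illustration, and the check after the loop is an addition that is not in the numbered steps. It is there so that a match ending on the last character of target is still reported.

```python
def kmp_search(target, source, P):
    """Return the index in target where source first matches, or -1.

    P[s] is the number of already-matched characters to preserve
    when the comparison at source[s] fails.
    """
    t = 0  # index into target
    s = 0  # index into source
    while t < len(target):
        if s == len(source):
            return t - s      # the whole source has been matched
        elif target[t] == source[s]:
            t += 1
            s += 1
        elif s == 0:
            t += 1            # nothing matched yet; advance in target
        else:
            s = P[s]          # backtrack in source only; t stays put
    # Assumption (not in the numbered steps): also report a match
    # that ends exactly at the end of target.
    if s == len(source):
        return t - s
    return -1
```

With the table 0 0 0 0 2 for source ababc, searching in abababc backtracks once at position 4 and then reports a match at index 2.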
Backtracking multiple times
I had suggested that line 12 could be executed in sequential iterations of the loop. Can you come up with an example in which that happens in more than two iterations?
- table: 0: 0, 1: 0, 2: 0, 3: 0, 4: 2

Or

    a b a b c
    0 0 0 0 2
With target1, we’re fine until we hit position 4. At that point, we rewind s to 2, compare the a at position 2 in source to the a at position 4 in target, and move on. Only one backtrack.
    a b a b c
    a b a b a b c
            * Fail

Try here:

        a b a b c
    a b a b a b c
            * Match

        a b a b c
    a b a b a b c
              * Match

        a b a b c
    a b a b a b c
                * Match

Done
With target2, we’re fine until we hit position 4. At that point, we rewind s to 2, and compare the a at position 2 in source to the d at position 4 in target. That mismatch causes another backtrack, to position 0. The a does not match d, so we’re now on to line 10, and we advance in the target.
    a b a b a c
    a b a b d a b a b a c
    * MATCH

    a b a b a c
    a b a b d a b a b a c
      * MATCH

    a b a b a c
    a b a b d a b a b a c
        * MATCH

    a b a b a c
    a b a b d a b a b a c
          * MATCH

    a b a b a c
    a b a b d a b a b a c
            * FAIL, SHIFT

        a b a b a c
    a b a b d a b a b a c
            * FAIL, SHIFT

            a b a b a c
    a b a b d a b a b a c
            * FAIL, ADVANCE

              a b a b a c
    a b a b d a b a b a c
              *
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
    a b a c a b a c a b a c a b a c d
    * 0 * 1 * 1 * 3 4 * * 1 * * * * 8
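To check examples like these mechanically, one option is to count how many loop iterations in a row execute line 12. The sketch below is illustrative only. Both function names are hypothetical, and border_table is a brute-force stand-in for the real table builder: it assumes P[s] is the length of the longest proper prefix of source[:s] that is also a suffix of it. The aaaa / aaab pair is an example added here, not one from class.

```python
def border_table(source):
    """Brute force (assumed convention): P[s] is the length of the longest
    proper prefix of source[:s] that is also a suffix of source[:s]."""
    P = [0] * len(source)
    for s in range(1, len(source)):
        prefix = source[:s]
        for k in range(s - 1, 0, -1):
            if prefix[:k] == prefix[-k:]:
                P[s] = k
                break
    return P


def longest_backtrack_run(target, source, P):
    """Run the matching loop until a match or the end of target and return
    the largest number of consecutive iterations that did s = P[s]."""
    t = s = 0
    run = longest = 0
    while t < len(target) and s < len(source):
        if target[t] == source[s]:
            t += 1
            s += 1
            run = 0
        elif s == 0:
            t += 1
            run = 0
        else:
            s = P[s]          # the backtrack on line 12
            run += 1
            longest = max(longest, run)
    return longest
```

Under that assumed table, source ababac against target ababdababac backtracks twice in a row (4 to 2, then 2 to 0), and source aaaa against target aaab backtracks three times in a row (3 to 2 to 1 to 0).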
Running time analysis
- In cases one and two, we advance in the target string.
- In case three, we retreat in the source string. And we can do that repeatedly.
- How do we know that the algorithm is still O(m + build-table(n))?
- We’re going to amortize. We can only move backwards in the source string if we’ve moved forward in the source string.
- We only move forward in the source string when we move forward in the target string. (If we move forward in the source string, we move forward in the target string.)
- You can move forward in the target string at most m times.
- You can move forward in the source string at most m times.
- You can move backward in the source string at most m times.
- It’s a linear algorithm, plus the cost of building the table.
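The amortized argument can be checked by instrumenting the loop. The sketch below counts the three kinds of moves; count_moves is a made-up name, and the table [0, 0, 1, 2] for source aaaa is an assumed input, using the convention that P[s] is less than s whenever s is greater than 0. That assumption is what makes every backward move undo at least one earlier forward move.

```python
def count_moves(target, source, P):
    """Run the matching loop until a match or the end of target.

    Returns (target_forward, source_forward, source_backward).
    """
    t = s = 0
    target_fwd = source_fwd = source_back = 0
    while t < len(target) and s < len(source):
        if target[t] == source[s]:
            t += 1
            s += 1
            target_fwd += 1
            source_fwd += 1   # source only advances when target does
        elif s == 0:
            t += 1
            target_fwd += 1
        else:
            s = P[s]          # retreats by at least one position
            source_back += 1
    return target_fwd, source_fwd, source_back


# Assumed example: source "aaaa" against a target built to force
# three backtracks at every 'b'.
moves = count_moves("aaab" * 5, "aaaa", [0, 0, 1, 2])
target_fwd, source_fwd, source_back = moves
```

Here m is 20, and the counts come out as 20 forward moves in the target, 15 forward moves in the source, and 15 backward moves in the source, so the backward moves never exceed the forward ones.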
Building the table
We’ll build the table and use something like the algorithm above.
We will try to match the string to itself at each position, but do so efficiently.
How would you build the table? (Assume the length of source is n.)
Here’s a not-quite-right solution.
    P = *
    p = -1
    for s = 1 to n-1
      // Find the first match of the character at s
      while p > 0 and source[p+1] != source[s]
        p = P[p]
      if source[p+1] == source[s]
        p = p+1
      P[s] = p
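Since the pseudocode above is flagged as not quite right, here is one possible working version for comparison. It is a sketch of the usual approach (match the string against itself, reusing the part of the table already built), not necessarily the fix intended in class. The name build_table and the convention are assumptions: P[s] is the length of the longest proper prefix of source[:s] that is also a suffix of source[:s], and P[0] is never consulted by the matcher.

```python
def build_table(source):
    """Build the backtracking table in O(n) time (assumed convention:
    P[s] = longest proper prefix of source[:s] that is also its suffix)."""
    n = len(source)
    P = [0] * n
    p = 0  # number of characters preserved for the prefix source[:s]
    for s in range(1, n):
        P[s] = p
        # Extend the preserved prefix to cover source[s], backtracking
        # through the entries already filled in when it does not fit.
        while p > 0 and source[p] != source[s]:
            p = P[p]
        if source[p] == source[s]:
            p += 1
    return P
```

For ababc this convention gives 0 0 0 1 2, which differs from the table 0 0 0 0 2 used earlier at position 3. Both lead to the same matches: falling back to 1 there compares against b again, fails, and then falls back to 0.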
Let’s talk about the final
In class, Wednesday morning of finals week.
- Exam will be four or five problems.
- Two pages of notes plus textbooks. Sam will bring
copies of the textbooks or printouts of the appropriate
- 8.5x11 inches or A4.
- No more than 1 mm thick.
- Closed computer.
- Mostly applying algorithms and techniques.
- E.g., Here’s a red-black tree. Remove the root or add an element.
- E.g., Here’s a loop invariant and some code, finish the code.
(Code will normally be in C, the most [fill in adjective] language.)
- No memory leaks to worry about.
- Full semester!
- Yes, you need to know (or be able to read) every damn algorithm we’ve talked about.