- kmp-step pat rv c i c= p-startprocedure
This function encapsulates the work performed by one step of the KMP string search; it can be used to scan strings, input ports, or other on-line character sources for fixed strings.
PAT is the non-empty string specifying the text for which we are searching. RV is the Knuth-Morris-Pratt restart vector for the pattern, as constructed by make-kmp-restart-vector. The pattern begins at PAT[P-START], and is (string-length RV) characters long. C= is the character-equality function used to construct the restart vector, typically char=? or char-ci=?.
Suppose the pattern is N characters in length: PAT[P-START, P-START + N). We have already matched I characters: PAT[P-START, P-START + I). (P-START is typically zero.) C is the next character in the input stream. kmp-step returns the new I value -- that is, how much of the pattern we have matched, including character C. When I reaches N, the entire pattern has been matched.
Thus a typical search loop looks like this:
(let lp ((i 0)) (or (= i n) ; Win -- #t (and (not (end-of-stream)) ; Lose -- #f (lp (kmp-step pat rv (get-next-character) i char=? 0)))))
Example:
;; Read chars from IPORT until we find string PAT or hit EOF. (define (port-skip pat iport) (let* ((rv (make-kmp-restart-vector pat)) (patlen (string-length pat))) (let lp ((i 0) (nchars 0)) (if (= i patlen) nchars ; Win -- nchars skipped (let ((c (read-char iport))) (if (eof-object? c) c ; Fail -- EOF (lp (kmp-step pat rv c i char=? 0) ; Continue (+ nchars 1))))))))
This procedure could be defined as follows:
(define (kmp-step pat rv c i c= p-start) (let lp ((i i)) (if (c= c (string-ref pat (+ i p-start))) ; Match => (+ i 1) ; Done. (let ((i (vector-ref rv i))) ; Back up in PAT. (if (= i -1) 0 ; Can't back up more. (lp i))))))) ; Keep going.
Rationale: this procedure takes no optional arguments because it is intended as an inner-loop primitive and we do not want any run-time penalty for optional-argument parsing and defaulting, nor do we wish barriers to procedure integration/inlining.