- (string-kmp-partial-search pat rv s i [c= p-start s-start s-end]) -> integerprocedure
Applies kmp-step across S; optional S-START/S-END bounds parameters restrict search to a substring of S. The pattern is (vector-length RV) characters long; optional P-START index indicates non-zero start of pattern in PAT.
Suppose PLEN = (vector-length RV) is the length of the pattern. I is an integer index into the pattern (that is, 0 <= I < PLEN) indicating how much of the pattern has already been matched. (This means the pattern must be non-empty -- PLEN > 0.)
- On success, returns -J, where J is the index in S bounding the end of the pattern -- e.g., a value that could be used as the END parameter in a call to substring/shared.
- On continue, returns the current search state I' (an index into RV) when the search reached the end of the string. This is a non-negative integer.
Hence:
- A negative return value indicates success, and says where in the string the match occurred.
- A non-negative return value provides the I to use for continued search in a following string.
This utility is designed to allow searching for occurrences of a fixed string that might extend across multiple buffers of text. This is why, for example, we do not provide the index of the start of the match on success -- it may have occurred in a previous buffer.
To search a character sequence that arrives in "chunks," write a loop of this form:
(let lp ((i 0)) (and (not (end-of-data?)) ; Lose -- return #f. (let* ((buf (get-next-chunk)) ; Get or fill up the buffer. (i (string-kmp-partial-search pat rv buf i))) (if (< i 0) (- i) ; Win -- return end index. (lp i))))) ; Keep looking.
Modulo start/end optional-argument parsing, this procedure could be defined as follows:
(define (string-kmp-partial-search pat rv s i c= p-start s-start s-end) (let ((patlen (vector-length rv))) (let lp ((si s-start) ; An index into S. (vi i)) ; An index into RV. (cond ((= vi patlen) (- si)) ; Win. ((= si end) vi) ; Ran off the end. (else (lp (+ si 1) ; Match s[si] & loop. (kmp-step pat rv (string-ref s si) vi c= p-start)))))))