[procedure] (irregex-fold <irx> <kons> <knil> <str> [<finish> <start> <end>])

This performs a fold operation over every non-overlapping occurrence of <irx> in the string <str>.

The <kons> procedure has the following signature:

(<kons> <from-index> <match> <seed>)

where <from-index> is the index from which the search started (initially <start>, and thereafter the end index of the last match), <match> is the resulting match-data object, and <seed> is the accumulated fold result, starting with <knil>.
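
As a minimal sketch of how the three <kons> arguments fit together, the following collects a (from-index . matched-text) pair for each run of digits. It assumes CHICKEN 5, where irregex is loaded with (import (chicken irregex)); on CHICKEN 4, (use irregex) would take its place. The name digit-runs is a hypothetical helper, not part of irregex.

```scheme
(import (chicken irregex))

;; digit-runs is a hypothetical helper used only for illustration.
;; For each match, record where the search started and what was matched.
(define (digit-runs str)
  (reverse
   (irregex-fold "[0-9]+"
                 (lambda (from-index match seed)
                   (cons (cons from-index (irregex-match-substring match))
                         seed))
                 '()          ; <knil>: start with an empty list
                 str)))

(digit-runs "a1bb22ccc333")
;; => ((0 . "1") (2 . "22") (6 . "333"))
```

The first pair starts at 0 (the default <start>); each later pair starts at the end index of the previous match.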

The <from-index> argument (which the SCSH regexp-fold utility does not provide) is included because this information is useful (e.g. for extracting the unmatched portion of the string before the current match, as needed in irregex-replace/all) and is not otherwise directly accessible.

Note that when the pattern matches an empty string, we continue from one char after the end of the match (rather than from the end itself, as in the normal case) to avoid an infinite loop. The <from-index> passed to the subsequent <kons> or <finish> still refers to the original previous match end, however, so irregex-split, irregex-replace/all, etc. do the right thing.

The optional <finish> takes two arguments:

(<finish> <from-index> <seed>)

which similarly allows you to pick up the unmatched tail of the string, and defaults to just returning the <seed>.
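
To illustrate how <from-index> and <finish> work together, here is a rough sketch that collects the unmatched text between matches, plus the tail after the last one. It assumes CHICKEN 5's (import (chicken irregex)); pieces-between is a hypothetical name, and this is not how irregex-split itself is implemented.

```scheme
(import (chicken irregex))

;; pieces-between is a hypothetical helper used only for illustration.
(define (pieces-between irx str)
  (irregex-fold irx
                ;; <kons>: keep the text from where the search started
                ;; up to the start of the current match.
                (lambda (from-index match seed)
                  (cons (substring str
                                   from-index
                                   (irregex-match-start-index match 0))
                        seed))
                '()
                str
                ;; <finish>: pick up the unmatched tail, then restore order.
                (lambda (from-index seed)
                  (reverse (cons (substring str from-index (string-length str))
                                 seed)))))

(pieces-between ", " "a, b, c")
;; => ("a" "b" "c")
```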

<start> and <end> are numeric indices letting you specify the boundaries of the string on which you want to fold.
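
A small sketch of restricting the fold to part of a string follows. Because <start> and <end> come after <finish> in the argument list, a <finish> procedure is passed explicitly here. The example assumes CHICKEN 5's (import (chicken irregex)); digit-runs-between is a hypothetical name.

```scheme
(import (chicken irregex))

;; digit-runs-between is a hypothetical helper used only for illustration.
;; Only the region [start, end) of str is searched.
(define (digit-runs-between str start end)
  (irregex-fold "[0-9]+"
                (lambda (from-index match seed)
                  (cons (irregex-match-substring match) seed))
                '()
                str
                (lambda (from-index seed) (reverse seed)) ; explicit <finish>
                start
                end))

(digit-runs-between "11 22 33 44" 3 8)
;; => ("22" "33")
```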

To extract all instances of a match out of a string, you can use

(map irregex-match-substring
     (irregex-fold <irx>
                   (lambda (i m s) (cons m s))
                   '()
                   <str>
                   (lambda (i s) (reverse s))))
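
For a concrete instance of the idiom above, one might wrap it in a procedure. This is only a sketch assuming CHICKEN 5's (import (chicken irregex)); extract-all is a hypothetical name, not an irregex procedure.

```scheme
(import (chicken irregex))

;; extract-all is a hypothetical wrapper around the idiom shown above.
(define (extract-all irx str)
  (map irregex-match-substring
       (irregex-fold irx
                     (lambda (i m s) (cons m s)) ; collect match objects
                     '()
                     str
                     (lambda (i s) (reverse s))))) ; restore left-to-right order

(extract-all "[a-z]+" "foo bar, baz")
;; => ("foo" "bar" "baz")
```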

Note that if an empty match is found, <kons> will be called on that empty match, and to avoid an infinite loop matching will resume at the next char. It is up to the programmer to do something sensible with the skipped char in this case.
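
One possible way to handle this, sketched under the assumption of CHICKEN 5's (import (chicken irregex)), is to detect an empty match inside <kons> by comparing its start and end indices and simply leave the seed unchanged. The name nonempty-matches is hypothetical, and ignoring empty matches is just one choice; other programs may need to treat the skipped char differently.

```scheme
(import (chicken irregex))

;; nonempty-matches is a hypothetical helper used only for illustration.
;; "[0-9]*" can match the empty string, so <kons> may see empty matches.
(define (nonempty-matches irx str)
  (irregex-fold irx
                (lambda (from-index match seed)
                  (if (= (irregex-match-start-index match 0)
                         (irregex-match-end-index match 0))
                      seed                    ; empty match: ignore it
                      (cons (irregex-match-substring match) seed)))
                '()
                str
                (lambda (from-index seed) (reverse seed))))

(nonempty-matches "[0-9]*" "a12b")
```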