Outdated egg!
This is an egg for CHICKEN 4, the unsupported old release. You're almost certainly looking for the CHICKEN 5 version of this egg, if it exists.
If it does not exist, there may be equivalent functionality provided by another egg; have a look at the egg index. Otherwise, please consider porting this egg to the current version of CHICKEN.
Prcc (Parser/Regex Combinator library for CHICKEN Scheme)
Introduction
Prcc is a PEG-like parser combinator library inspired by the Ruby gem rsec.
Each combinator is a procedure that accepts an opaque "context" object and returns an object representing its match, or #f if it does not match.
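To make this calling convention concrete, here is a minimal, self-contained sketch of the idea in plain Scheme. It does not use prcc and does not reflect its internals: the context layout (a vector holding the input string and a position) and the names make-ctx, toy-char, toy-seq and toy-sel are hypothetical, invented only for this illustration.

```scheme
;; Hypothetical context: a 2-element vector #(input-string position).
;; prcc's real context is opaque; this layout is an assumption.
(define (make-ctx str) (vector str 0))
(define (ctx-str ctx) (vector-ref ctx 0))
(define (ctx-pos ctx) (vector-ref ctx 1))
(define (ctx-pos-set! ctx pos) (vector-set! ctx 1 pos))

;; Build a parser that matches the character c and returns it as a string,
;; or returns #f (leaving the position untouched) if it does not match.
(define (toy-char c)
  (lambda (ctx)
    (let ((s (ctx-str ctx))
          (i (ctx-pos ctx)))
      (if (and (< i (string-length s))
               (char=? c (string-ref s i)))
          (begin (ctx-pos-set! ctx (+ i 1))
                 (string c))
          #f))))

;; Sequence: every subparser must match; results are collected in a list.
;; On failure the position is restored so that no input stays consumed.
(define (toy-seq . parsers)
  (lambda (ctx)
    (let ((start (ctx-pos ctx)))
      (let loop ((ps parsers) (acc '()))
        (if (null? ps)
            (reverse acc)
            (let ((r ((car ps) ctx)))
              (if r
                  (loop (cdr ps) (cons r acc))
                  (begin (ctx-pos-set! ctx start) #f))))))))

;; Ordered choice: return the result of the first subparser that matches.
(define (toy-sel . parsers)
  (lambda (ctx)
    (let loop ((ps parsers))
      (and (pair? ps)
           (or ((car ps) ctx)
               (loop (cdr ps)))))))

;; "a" followed by either "x" or "b".
(define ab (toy-seq (toy-char #\a) (toy-sel (toy-char #\x) (toy-char #\b))))

(write (ab (make-ctx "ab")))   ; prints ("a" "b")
(newline)
(write (ab (make-ctx "ac")))   ; prints #f
(newline)
```

Each parser is an ordinary closure over its arguments, which is why combinators can be nested freely and bound to names with define.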
Combinators
- char CHAR [procedure]
Generate a parser that matches the character CHAR and returns it as a string.
- <c> CHAR [procedure]
Alias of char.
- seq PARSER ... [procedure]
Sequence parser: each subparser must match, and their results are returned in a list.
- <and> PARSER ... [procedure]
Alias of seq.
- sel PARSER ... [procedure]
Branch parser with ordered choice: the subparsers are tried in order, and the result of the first one that matches is returned.
- <or> PARSER ... [procedure]
Alias of sel.
- one? PARSER [procedure]
Match PARSER zero or one time. Returns the empty string if PARSER doesn't match.
- <?> PARSER [procedure]
Alias of one?.
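Example (a sketch combining seq, sel and one?; it assumes the egg is loaded with (use prcc) as in the Example section, the names opt-sign, bit and signed-bit are made up for the illustration, and the expected values in the comments follow from the descriptions above rather than from a test run):

```scheme
(use prcc)

;; An optional sign followed by a single digit "0" or "1".
(define opt-sign   (one? (sel (char #\+) (char #\-))))
(define bit        (sel (char #\0) (char #\1)))
(define signed-bit (seq opt-sign bit))

(write (parse-string "-1" signed-bit))  ; expected: ("-" "1")
(newline)
(write (parse-string "0" signed-bit))   ; expected: ("" "0"), since one? yields "" on no match
(newline)
```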
- rep PARSER [procedure]
Match PARSER zero or more times. Returns a list of PARSER results, one item per match.
Example:
(parse-string "aabba" (rep (sel (char #\a) (char #\b)))) => ("a" "a" "b" "b" "a")
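- <*> PARSER [procedure]
Alias of rep.
- rep+ PARSER [procedure]
Match PARSER one or more times.
- <+> PARSER [procedure]
Alias of rep+.
- pred PARSER0 PARSER1 [procedure]
Lookahead predicate: PARSER0 matches only if PARSER1 matches immediately after it. Only the result of PARSER0 is returned.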
Example:
(parse-string "a" (pred (char #\a) (eof)))
=> "a"
;; If we had used (seq), we would get '("a" "")

;; This also allows us to ensure this is the entire string:
(parse-string "ab" (pred (char #\a) (eof)))
=> #f

;; Without the lookahead, it will simply consume as much as possible:
(parse-string "ab" (char #\a))
=> "a"
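- <&> PARSER0 PARSER1 [procedure]
Alias of pred.
- pred! PARSER0 PARSER1 [procedure]
Negative lookahead: the counterpart of pred, where PARSER0 matches only if PARSER1 does not match immediately after it.
- <&!> PARSER0 PARSER1 [procedure]
Alias of pred!.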
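Example (a sketch only; it assumes that pred! mirrors pred, so that the result of PARSER0 is returned when PARSER1 does not follow; the name a-not-before-b is made up for the illustration):

```scheme
(use prcc)

;; Match "a" only when it is NOT followed by "b".
(define a-not-before-b (pred! (char #\a) (char #\b)))

(write (parse-string "ac" a-not-before-b))  ; expected to succeed
(newline)
(write (parse-string "ab" a-not-before-b))  ; expected to fail with #f
(newline)
```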
- eof [procedure]
Generate a parser that matches the end of the input (end of file).
- (act PARSER [SUCC-PROC] [FAIL-PROC]) [procedure]
Act on the result of PARSER, whether it succeeds or fails.
This allows you to add semantic actions to the parser.
Note: Be sure not to return #f from SUCC-PROC, because that will be filtered out.
Example:
(define a-or-b (sel (char #\a) (char #\b)))
(parse-string "aabba" (rep (act a-or-b (lambda (x) (if (string=? "a" x) 'yes 'no)))))
=> (yes yes no no yes)
- (<@> PARSER [SUCC-PROC] [FAIL-PROC]) [procedure]
Alias of act.
- neg PARSER [procedure]
Treat the failure of PARSER as success.
- <^> PARSER [procedure]
Alias of neg.
- regexp-parser STRING #!optional CHUNK-SIZE [procedure]
Generate a parser that matches the regular expression STRING.
- <r> STRING #!optional CHUNK-SIZE [procedure]
Alias of regexp-parser.
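Example (a sketch; the name digits is made up, and the results in the comments are what the descriptions suggest, not verified output):

```scheme
(use prcc)

;; One or more decimal digits, followed by the end of input.
(define digits (pred (<r> "[0-9]+") (eof)))

(write (parse-string "2012" digits))  ; presumably "2012"
(newline)
(write (parse-string "20x2" digits))  ; presumably #f, as "x" is not the end of input
(newline)
```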
- (lazy PARSER) [syntax]
Defer the binding of parser. This is useful for mutually recursive parsers, as PARSER can be defined after the use of the lazy parser.
Example:
;; Without "lazy" around bar, this would give an error that
;; bar is not yet defined.
(define foo (sel (char #\x) (lazy bar)))
(define bar (char #\y))
- cached PARSER [procedure]
Cache the result of PARSER (packrat parsing).
Helpers
- str STRING [procedure]
Generate a parser that matches STRING literally.
- <s> STRING [procedure]
Alias of str.
- one-of STRING [procedure]
Generate a parser that matches any one of the characters in STRING.
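Example (a sketch; the name let-op is made up, and the result in the comment assumes that str and one-of return the matched text as strings, like char):

```scheme
(use prcc)

;; A keyword followed by one operator character.
(define let-op (seq (str "let") (one-of "+-*/")))

(write (parse-string "let*" let-op))  ; presumably ("let" "*")
(newline)
(write (parse-string "let?" let-op))  ; expected: #f, as "?" is not in the set
(newline)
```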
- join+ PARSER0 PARSER1 [procedure]
Repeat PARSER0 one or more times, separated by PARSER1. The PARSER1 results are included in the output, between the PARSER0 results.
Example:
;; Parse an array of "a" or "b" identifiers:
;; This can be done more elegantly with rep+_
(define ident (sel (char #\a) (char #\b)))
(parse-string "[a,b,b,a]"
              (even (ind (seq (char #\[) (join+ ident (char #\,)) (char #\]))
                         1)))
=> ("a" "b" "b" "a")
- (join+_ PARSER0 PARSER1 [skip: PARSER2]) [procedure]
Repeat PARSER0 one or more times, separated by PARSER1, skipping PARSER2 in between. By default, PARSER2 is the whitespace parser (<s*>).
- ind SEQ-PARSER INDEX [procedure]
Generate a parser that returns the element of the SEQ-PARSER output at position INDEX, counting from zero.
Example:
(parse-string "xy" (ind (seq (char #\x) (char #\y)) 1)) => "y"
- <#> SEQ-PARSER INDEX [procedure]
Alias of ind.
- <w> [procedure]
A word letter (any uppercase or lowercase letter, digit or underscore, i.e. the same as (<r> "\\w")).
- <w*> [procedure]
Zero or more word letters.
- <w+> [procedure]
One or more word letters.
- <space> [procedure]
One whitespace character (space, tab or newline).
- <s*> [procedure]
Zero or more whitespace characters.
- <s+> [procedure]
One or more whitespace characters.
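Example (a sketch; it assumes these helpers are called without arguments to obtain a parser, in the same way as (eof) and (<s*>) are written elsewhere on this page; the name two-words is made up):

```scheme
(use prcc)

;; Two words separated by at least one whitespace character.
(define two-words (seq (<w+>) (<s+>) (<w+>)))

(write (parse-string "hello  world" two-words))  ; expected to succeed
(newline)
(write (parse-string "helloworld!" two-words))   ; expected: #f, no whitespace after the word
(newline)
```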
- (rep_ PARSER0 [skip: PARSER1]) [procedure]
Match PARSER0 zero or more times, skipping PARSER1 between matches. By default, PARSER1 is the whitespace parser (<s*>).
- (<*_> PARSER0 [skip: PARSER1]) [procedure]
Alias of rep_.
- (rep+_ PARSER0 [skip: PARSER1]) [procedure]
Match PARSER0 one or more times, skipping PARSER1 between matches. By default, PARSER1 is the whitespace parser (<s*>).
Example:
;; Parse an array of "a" or "b" identifiers:
(define ident (sel (char #\a) (char #\b)))
(parse-string "[a,b,b,a]"
              (ind (seq (char #\[) (rep+_ ident skip: (char #\,)) (char #\]))
                   1))
=> ("a" "b" "b" "a")
- (<+_> PARSER0 [skip: PARSER1]) [procedure]
Alias of rep+_.
- (seq_ PARSER ... [skip: PARSER1]) [procedure]
Sequence parser that skips PARSER1 between subparsers. By default, PARSER1 is the whitespace parser (<s*>).
- (and_ PARSER ... [skip: PARSER1]) [procedure]
Alias of seq_.
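Example (a sketch contrasting seq with seq_; the names strict and relaxed are made up, and the exact shape of the seq_ result is deliberately not shown because it is not specified above):

```scheme
(use prcc)

(define strict  (seq  (char #\a) (char #\b)))  ; no whitespace allowed
(define relaxed (seq_ (char #\a) (char #\b)))  ; skips (<s*>) by default

(write (parse-string "a   b" strict))   ; expected: #f, since spaces follow "a"
(newline)
(write (parse-string "a   b" relaxed))  ; expected to succeed
(newline)
```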
- even SEQ-PARSER [procedure]
Generate a parser which returns the elements at even-numbered positions of sequence parser output, collected in a list.
Note: This starts counting at zero!
Example:
(parse-string "abcde" (even (seq (char #\a) (char #\b) (char #\c) (char #\d) (char #\e)))) => ("a" "c" "e")
- odd SEQ-PARSER [procedure]
Generate a parser which returns the elements at odd-numbered positions of sequence parser output, collected in a list.
Note: This starts counting at zero!
Example:
(parse-string "abcde" (odd (seq (char #\a) (char #\b) (char #\c) (char #\d) (char #\e)))) => ("b" "d")
- parse-file FILENAME PARSER #!optional CACHE [procedure]
Parse a file with PARSER. Caching is off by default (CACHE=#f).
- parse-string STRING PARSER #!optional CACHE [procedure]
Parse a string with PARSER. Caching is off by default (CACHE=#f).
- (parse-port PORT PARSER [CACHE]) [syntax]
Parse from PORT with PARSER. Caching is off by default (CACHE=#f).
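The entry points differ only in where the input comes from. As a sketch, the same parser can be run on a string and on a string port created with the standard open-input-string; the name greeting is made up, and both calls are expected, but not verified, to give the same result:

```scheme
(use prcc)

(define greeting (seq (str "hello") (eof)))

;; From a string:
(write (parse-string "hello" greeting))
(newline)

;; From an input port (here a string port; a file port would work the same way):
(define port (open-input-string "hello"))
(write (parse-port port greeting))
(newline)
```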
Example
(use prcc)

(define parser
  (<and>
    (<@> (<s> "hello")
         (lambda (o) "hello "))
    (<s> "world")
    (eof)))

(display (parse-string "helloworld" parser))
(newline)
More information
Packrat Parsing and Parsing Expression Grammars
Author
Wei Hu
License
Copyright (C) 2012, Wei Hu
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the author nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Version History
- 0.1
- initial release